apify / apify-docs

This project is the home of Apify's documentation.
https://docs.apify.com
Apache License 2.0

Use actual H1 page title for browser page title and OpenGraph title #754

Open mtrunkat opened 11 months ago

mtrunkat commented 11 months ago

Currently, we use the menu title from metadata, but that is usually short to fit the menu and misses the context you have in the menu. So on its own, it's not enough information for Google or when sharing the page.

Ideally, we should parse it from the page markdown, and if this is not possible, we will move custom <h1 /> to a new metadata property.

More in: https://apifier.slack.com/archives/CQ96RHG2U/p1699475295877139
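
A minimal sketch of the "parse it from the page markdown" idea, assuming a page is a Markdown string with optional `---`-delimited frontmatter. The name `extract_h1` is hypothetical and is not part of Docusaurus or this repository. It only recognizes `# Title` style headings and skips fenced code blocks, so a `# comment` inside code is not mistaken for a heading.

```python
from typing import List, Optional

FENCE = "`" * 3  # marker that opens or closes a fenced code block


def strip_frontmatter(lines: List[str]) -> List[str]:
    """Drop a leading '---' ... '---' frontmatter block, if there is one."""
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return lines[i + 1:]
    return lines


def extract_h1(markdown: str) -> Optional[str]:
    """Return the text of the first top-level heading, or None if absent."""
    in_fence = False
    for line in strip_frontmatter(markdown.splitlines()):
        if line.strip().startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a code block
            continue
        if not in_fence and line.startswith("# "):
            return line[2:].strip()
    return None
```

If this returns `None`, the fallback described above (a dedicated metadata property) would apply.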

mtrunkat commented 11 months ago

@B4nan could you handle it with @barjin? You know the most about this.

B4nan commented 11 months ago

Sure, I'll take a look.

Just FYI, the title is not a "menu title"; it's a regular title, which is why it's used in the meta tags. If there were no H1 in the content, the title would be rendered as one; we just override it with the content. There is a separate sidebar_label frontmatter option for the menu text, so I guess it's just a matter of using it in the frontmatter. The fix will most probably be per page. I can try to quickly review things based on the frontmatter, but it's not like we use "menu titles for meta descriptions".

barjin commented 11 months ago

Yeah, if we could also remove the rogue H1 headings in the article bodies, that would be great :)

From what I remember, they (are|were) getting picked up by the Algolia Crawler, which messed up the search results a bit (the articles are segmented to searchable records by headings, so this double-heading situation led to one result with no body).
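
To illustrate why a double heading can produce a search record with no body, here is a simplified sketch of heading-based segmentation. It is not the Algolia Crawler's actual logic: `split_into_records` and the record shape are hypothetical, and the sample page assumes two H1 headings end up directly after each other.

```python
from typing import Dict, List


def split_into_records(lines: List[str]) -> List[Dict[str, str]]:
    """Start a new record at every heading; the text that follows is its body."""
    records: List[Dict[str, str]] = []
    for line in lines:
        if line.startswith("#"):
            records.append({"heading": line.lstrip("#").strip(), "body": ""})
        elif records and line.strip():
            current = records[-1]
            current["body"] = (current["body"] + " " + line.strip()).strip()
    return records


# Assumed page with two consecutive H1 headings.
page = [
    "# Foo",
    "# Foo in detail",
    "this is page about foo",
]
records = split_into_records(page)
empty = [r["heading"] for r in records if not r["body"]]
print(empty)  # ['Foo']
```

The first heading is immediately followed by another heading, so its record has nothing to show in the search results.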

B4nan commented 11 months ago

So you think we should prefer the title in frontmatter instead of keeping the one in the content? My plan was to rather remove the title from frontmatter and replace it with sidebar_label if it differs from the H1 in the content.

barjin commented 11 months ago

Imo keeping the title in the frontmatter is a bit nicer, but it apparently shouldn't matter:

[image]

Go with your instinct then, from the docs it seems that Docusaurus can even infer the title from the Markdown and vice versa. From what I saw, I've probably fixed the search issue :)

B4nan commented 11 months ago

> Go with your instinct then, from the docs it seems that Docusaurus can even infer the title from the Markdown and vice versa.

Yeah, that's what I thought as well: if it's not explicit, it should be inferred from the content, and that will probably be easier for docs contributors.

jancurn commented 11 months ago

Hey, so basically we'll be able to override the menu text using the sidebar_label option, and the title will be kept as is (used for page title, H1 if there is none, and OG images)? IMO that would be a good solution.

B4nan commented 11 months ago

Yes, the fix will be changing the title option to sidebar_label where the value differs, we don't need the title if there is a top-level H1 in the content, it will be respected automatically. Will write a short script for this and clean it up everywhere.
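
One way such a cleanup script could look, as a sketch only. It assumes simple single-line `key: value` frontmatter (no YAML parser), and `migrate_title` and `first_h1` are hypothetical names, not the script that was actually written. Following the plan in this comment: when the content has a top-level H1, `title` is dropped if it equals the H1 and renamed to `sidebar_label` if it differs. Pages without an H1, or with an existing `sidebar_label`, are left untouched.

```python
from typing import List, Optional

FENCE = "`" * 3  # marker that opens or closes a fenced code block


def first_h1(body: List[str]) -> Optional[str]:
    """Return the first '# Title' heading outside fenced code blocks."""
    in_fence = False
    for line in body:
        if line.strip().startswith(FENCE):
            in_fence = not in_fence
        elif not in_fence and line.startswith("# "):
            return line[2:].strip()
    return None


def migrate_title(text: str) -> str:
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        return text  # no frontmatter
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return text  # unterminated frontmatter
    front, body = lines[1:end], lines[end + 1:]

    h1 = first_h1(body)
    title_idx = next((i for i, l in enumerate(front) if l.startswith("title:")), None)
    has_label = any(l.startswith("sidebar_label:") for l in front)
    if h1 is None or title_idx is None or has_label:
        return text  # nothing safe to change automatically

    raw_value = front[title_idx][len("title:"):].strip()
    if raw_value.strip("'\"") == h1:
        del front[title_idx]  # redundant: the H1 already carries the title
    else:
        front[title_idx] = f"sidebar_label: {raw_value}"  # keep the short menu text
    return "\n".join(["---", *front, "---", *body])
```

Returning the input unchanged in every ambiguous case keeps the sketch conservative; those pages would need a manual review.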

jancurn commented 11 months ago

Okay, but if title is present, it will be used for page title and OG description? I think it would be good to still have it as a way to override the top-level H1

B4nan commented 11 months ago

Yes, it's the same thing: if you provide only one of them, the other is always inferred from it. If the values differ, the title is (most probably) used for the metadata.

So you'd like to keep both the title and H1 in the content even if the content is exactly the same?

jancurn commented 11 months ago

I think it's more transparent to keep the title in the frontmatter, rather than the H1s, so I'd clean up the H1 if they are the same. It's clearer that's what's used for page title, OG, etc.

B4nan commented 11 months ago

Just to be clear, we want this

---
---

# Foo

this is page about foo

and not this

---
title: 'Foo'
---

this is page about foo

jancurn commented 11 months ago

Honestly I'd prefer:

---
title: 'Foo'
---

this is page about foo

But that's just a technicality, so your call.

What's important is that title will keep working, and if both H1 and title are used, then title goes to page title and OG, and H1 is shown in the document.
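
The precedence agreed on in this thread can be written down as a small resolution function. This is a sketch of the rules as discussed here, not Docusaurus source code; `resolve_titles` and the field names are hypothetical, and the metadata rule reflects the "most probably" caveat given earlier in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResolvedTitles:
    meta_title: Optional[str]  # browser page title and OpenGraph title
    heading: Optional[str]     # H1 shown in the document
    sidebar: Optional[str]     # text shown in the menu


def resolve_titles(
    title: Optional[str] = None,
    h1: Optional[str] = None,
    sidebar_label: Optional[str] = None,
) -> ResolvedTitles:
    meta_title = title or h1              # frontmatter title wins for metadata
    heading = h1 or title                 # H1 in the content wins on the page
    sidebar = sidebar_label or meta_title  # explicit menu text overrides both
    return ResolvedTitles(meta_title, heading, sidebar)


both = resolve_titles(title="Foo", h1="Foo in detail")
print(both.meta_title, "|", both.heading)  # Foo | Foo in detail
```

When only one of `title` or the H1 is given, all three values fall back to it, which matches the "inferred from the other" behaviour described above.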