datopian / datahub

🌀 Rapidly build rich data portals using a modern frontend framework
https://datahub.io/opensource
MIT License

Title extraction should only take into account h1 headings #1219

Closed by olayway 1 week ago

olayway commented 1 month ago

DataHub Cloud tries to display page titles instead of file names in the sidebar. If a title is not provided, it tries to extract one. However, it should only extract the first h1-level heading, and only if no content is found before it. Currently, if no h1 is found, it also extracts h2s, which shouldn't happen.

File names: image

Resulting side bar links: image

But none of the files have h1 or title frontmatter field set, so they should actually default to file names.

rufuspollock commented 1 month ago

@olayway it's more than that IMO: it should only extract an h1 heading that is at the very start of the document, i.e. after I strip frontmatter and whitespace, `# heading` is the first set of characters.
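
A minimal sketch of the rule described above, not the actual DataHub Cloud implementation. The function names (`stripFrontmatter`, `extractLeadingH1`, `resolveTitle`) are hypothetical. It assumes YAML frontmatter delimited by `---` lines and ATX-style (`# `) headings only:

```typescript
// Hypothetical helpers for illustration; names and signatures are assumptions.

/** Remove a leading YAML frontmatter block (--- ... ---), if present. */
function stripFrontmatter(source: string): string {
  const match = /^---\r?\n(?:[\s\S]*?\r?\n)?---[ \t]*(?:\r?\n|$)/.exec(source);
  return match ? source.slice(match[0].length) : source;
}

/** Return the h1 text only if `# heading` is the very first content. */
function extractLeadingH1(source: string): string | null {
  // Strip frontmatter, then any leading whitespace or blank lines.
  const body = stripFrontmatter(source).replace(/^\s+/, "");
  // Exactly one '#', at least one space or tab, then the heading text.
  // '## ...' fails here because the second '#' is not whitespace.
  const match = /^#[ \t]+([^\r\n]+)/.exec(body);
  return match ? match[1].trim() : null;
}

/** Title precedence: frontmatter title, then leading h1, then file name. */
function resolveTitle(
  filePath: string,
  source: string,
  frontmatterTitle?: string
): string {
  if (frontmatterTitle && frontmatterTitle.trim() !== "") {
    return frontmatterTitle.trim();
  }
  const h1 = extractLeadingH1(source);
  if (h1 !== null) {
    return h1;
  }
  // Fall back to the file name without directory or extension.
  const base = filePath.split("/").pop() || filePath;
  return base.replace(/\.[^.]+$/, "");
}
```

With this sketch, `resolveTitle("docs/getting-started.md", "## Only an h2\n")` falls back to `"getting-started"`, and a file whose h1 appears after a paragraph of text also falls back to its file name.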

Daniellappv commented 1 month ago

This will be prioritized, and someone from the team (not Ola) will pick it up @rufuspollock

olayway commented 3 weeks ago

@gradedSystem how's it going with this issue?

olayway commented 1 week ago

FIXED

Note: currently you need to re-create your site in order for this to take effect