SharzyL / pastebin-worker

Pastebin on Cloudflare worker, with friendly CLI usage and rich features
https://shz.al
MIT License
732 stars, 223 forks

feat: show page title in markdown page #18

Closed by ksco 1 year ago

ksco commented 1 year ago

Use the first h1 text found in the markdown as the <title>; use 'Untitled' if none is found. I'm not familiar with unified, so there may be a better way to implement this feature.

SharzyL commented 1 year ago

The <title> node is inserted into the HTML body instead of the <head> block. However, the standard does not allow this (refer to MDN).

Besides, introducing hastscript seems a little heavyweight; it would be better to avoid adding a new dependency. Maybe just reuse the unified markdown parser, or even just perform a regex match.

ksco commented 1 year ago

Regex matching seems straightforward, but we might still want to extract plain text from the h1 tag, and there doesn't seem to be a DOMParser or createElement in a Node.js environment.

SharzyL commented 1 year ago

The problem with the regex solution occurs when the <h1> tag contains nested markdown elements (bold, italics, etc.). We should extract the text from the <h1> node instead of extracting the HTML. I am looking for an API in the unified ecosystem to achieve this.
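To make the nested-element problem concrete, here is a minimal dependency-free sketch. The sample HTML string and the helper names `h1InnerHtml` and `h1PlainText` are hypothetical and are not taken from the PR; the tag-stripping step is only a crude approximation of real text extraction.

```javascript
// Hypothetical rendered output for the markdown line
// "# Hello **bold** and *italic* world" followed by a paragraph.
const html =
  "<h1>Hello <strong>bold</strong> and <em>italic</em> world</h1><p>Body</p>";

// Naive regex: captures the inner HTML of the first <h1>,
// so nested tags such as <strong> and <em> leak into the result.
function h1InnerHtml(source) {
  const match = /<h1[^>]*>([\s\S]*?)<\/h1>/i.exec(source);
  return match ? match[1] : null;
}

// Crude follow-up: strip tags from the captured HTML to approximate plain text.
// This does not decode entities (e.g. &amp;), so it is not a substitute
// for extracting text from a parsed tree.
function h1PlainText(source) {
  const inner = h1InnerHtml(source);
  return inner === null ? "Untitled" : inner.replace(/<[^>]+>/g, "").trim();
}

console.log(h1InnerHtml(html)); // Hello <strong>bold</strong> and <em>italic</em> world
console.log(h1PlainText(html)); // Hello bold and italic world
```

The first result is unsuitable as a <title> because it still contains markup, which is why extracting text from the parsed node is the more robust route.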

ksco commented 1 year ago

Yep, that's exactly what I mentioned above. We could do it on the client side, though.

SharzyL commented 1 year ago

It is still preferable to do it on the server side, since search engines, social media previews, and many other automated services depend on the metadata in the <head> returned by the server.

I found that Cloudflare provides an HTMLRewriter API for server-side HTML handling. I will try this later.

SharzyL commented 1 year ago

I implemented a new version (09e0482f42f8ed41f23260d8caaf271221df3c75) that operates directly on the mdast using mdast-util-to-string. It extracts both the title and the description and seems to work fine. Thanks for your suggestion and implementation; any further issues and suggestions are welcome.
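For readers unfamiliar with mdast, the idea of reading the title and description from the syntax tree can be sketched as below. This is not the code from the commit: the tree is written by hand, `nodeToString` is a simplified stand-in for what mdast-util-to-string provides, and `findFirst` and `extractMeta` are hypothetical helpers. Taking the first paragraph as the description and falling back to 'Untitled' are assumptions for illustration.

```javascript
// Hand-written mdast-like tree for the markdown:
//   # Hello **world**
//
//   First paragraph with *emphasis*.
const tree = {
  type: "root",
  children: [
    {
      type: "heading",
      depth: 1,
      children: [
        { type: "text", value: "Hello " },
        { type: "strong", children: [{ type: "text", value: "world" }] },
      ],
    },
    {
      type: "paragraph",
      children: [
        { type: "text", value: "First paragraph with " },
        { type: "emphasis", children: [{ type: "text", value: "emphasis" }] },
        { type: "text", value: "." },
      ],
    },
  ],
};

// Simplified stand-in for mdast-util-to-string (assumption):
// concatenates the text of a node and all of its descendants.
function nodeToString(node) {
  if (typeof node.value === "string") return node.value;
  if (Array.isArray(node.children)) return node.children.map(nodeToString).join("");
  return "";
}

// Depth-first search for the first node that satisfies the predicate.
function findFirst(node, predicate) {
  if (predicate(node)) return node;
  for (const child of node.children || []) {
    const found = findFirst(child, predicate);
    if (found) return found;
  }
  return null;
}

// Hypothetical helper: title from the first h1, description from the first paragraph.
function extractMeta(root) {
  const h1 = findFirst(root, (n) => n.type === "heading" && n.depth === 1);
  const para = findFirst(root, (n) => n.type === "paragraph");
  return {
    title: h1 ? nodeToString(h1) : "Untitled",
    description: para ? nodeToString(para) : "",
  };
}

console.log(extractMeta(tree));
// { title: 'Hello world', description: 'First paragraph with emphasis.' }
```

Because the text is collected from tree nodes rather than from rendered HTML, bold and italic markers inside the heading never reach the <title>, and the values are available on the server before the <head> is written.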