Closed ksco closed 1 year ago
The title HTML node is inserted into the HTML body instead of the <head> block. However, this behavior is not allowed by the standard (refer to MDN).
Besides, introducing hastscript seems a little heavyweight; it would be better to avoid adding a new dependency. Maybe just reuse the unified markdown parser, or even just perform a regex match.
Regex matching seems straightforward, but we might still want to extract plain text from the h1 tag, and there doesn't seem to be a DOMParser or createElement in a Node.js environment.
The problem with the regex solution occurs when the <h1> tag contains nested markdown elements (bold, italics, etc.). We should extract the text from the <h1> node instead of extracting the HTML. I am looking for an API in the unified ecosystem to achieve this.
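To make the nested-markup problem concrete, here is a minimal sketch. The helper name firstH1InnerHtml and the sample HTML are hypothetical and not taken from this PR; the point is only that a regex capture returns the inner HTML, tags included, rather than plain text.

```typescript
// Hypothetical helper: return whatever sits between the first <h1> and </h1>.
function firstH1InnerHtml(html: string): string | null {
  const match = /<h1[^>]*>([\s\S]*?)<\/h1>/i.exec(html);
  return match?.[1] ?? null;
}

// Rendered output of a heading such as "# Hello **world**".
const rendered = "<h1>Hello <strong>world</strong></h1><p>Body</p>";

// The capture still contains the <strong> markup,
// so it cannot be used as the page title as-is.
console.log(firstH1InnerHtml(rendered));
```

Stripping the tags with a second regex would be possible, but that only moves the fragility one step further, which is why extracting the text from the node itself looks preferable.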
Yep, that's exactly what I mentioned above. We could do it on the client side, though.
It is still preferable to do it on the server side, since search engines, social media previews, and many other automated services depend on the metadata in the <head> returned by the server.
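As a rough illustration of the server-side idea, the sketch below places an escaped title inside <head> before the HTML is sent. The helper names escapeHtml and injectTitle are hypothetical, and the sketch assumes the template has a <head> element with no existing title tag; it is not the code used in this PR.

```typescript
// Escape characters that are significant in HTML text content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: insert <title> right after the opening <head> tag.
// Assumes the template has a <head> element and no existing <title>.
function injectTitle(html: string, title: string): string {
  const titleTag = `<title>${escapeHtml(title)}</title>`;
  // The pattern requires ">" or whitespace after "<head" so that <header> is not matched.
  // A replacer function is used so "$" in the title is never treated as a pattern.
  return html.replace(/<head(\s[^>]*)?>/i, (open) => open + titleTag);
}

const page = "<html><head></head><body><h1>A & B</h1></body></html>";
console.log(injectTitle(page, "A & B"));
```

Because the title is already in the response body, crawlers that do not run JavaScript still see it.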
I found that Cloudflare provides an HTMLRewriter API for server-side HTML handling. I will try this later.
I implemented a new version (09e0482f42f8ed41f23260d8caaf271221df3c75) that operates directly on mdast and uses mdast-util-to-string. It extracts both the title and the description and seems to work fine. Thanks for your suggestion and implementation. Any further issues and suggestions are welcome.
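The tree-based approach can be sketched roughly as follows. This is not the code from the commit: MdNode is a simplified stand-in for the real mdast types, nodeToString only imitates the spirit of mdast-util-to-string, and findFirstH1 and extractTitle are hypothetical names. The 'Untitled' fallback follows the PR description.

```typescript
// Minimal stand-in for the mdast node shape (not the real mdast typings).
interface MdNode {
  type: string;
  depth?: number;      // present on heading nodes
  value?: string;      // present on literal nodes such as text
  children?: MdNode[]; // present on parent nodes
}

// Concatenate the text of a node and all of its descendants.
function nodeToString(node: MdNode): string {
  if (typeof node.value === "string") return node.value;
  return (node.children ?? []).map(nodeToString).join("");
}

// Depth-first search for the first heading with depth 1.
function findFirstH1(node: MdNode): MdNode | undefined {
  if (node.type === "heading" && node.depth === 1) return node;
  for (const child of node.children ?? []) {
    const found = findFirstH1(child);
    if (found) return found;
  }
  return undefined;
}

// Use the first h1 text as the title, or the fallback if none is found.
function extractTitle(root: MdNode, fallback = "Untitled"): string {
  const h1 = findFirstH1(root);
  const text = h1 ? nodeToString(h1).trim() : "";
  return text.length > 0 ? text : fallback;
}

// Hand-built tree corresponding to the markdown "# Hello **world**".
const tree: MdNode = {
  type: "root",
  children: [
    {
      type: "heading",
      depth: 1,
      children: [
        { type: "text", value: "Hello " },
        { type: "strong", children: [{ type: "text", value: "world" }] },
      ],
    },
  ],
};

console.log(extractTitle(tree)); // prints "Hello world"
```

Working on the syntax tree avoids both the regex fragility and the need for a DOM in Node.js, since nested emphasis is flattened by the text walk.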
Use the first h1 text found in the markdown as the <title>; use 'Untitled' if not found. I'm not familiar with unified, so there may be a better way to implement this feature.