oleeskild / digitalgarden

fix: breakage in transform blockquotes function removes some html tags #149

Open bayang opened 1 year ago

bayang commented 1 year ago

When rendering some pages, I noticed that the styling was missing or only partially applied.

Examining the HTML, I noticed that some tags from the Nunjucks template file were missing. For example, the main tag had been stripped, though the content of my note was still there, and since some of the CSS classes are applied to the main element, the page had only partial styling.

The pages looked like the screenshot in this issue: https://github.com/oleeskild/obsidian-digital-garden/issues/294

After doing some research, I found that node-html-parser was silently failing to parse some content, so the resulting HTML was missing some tags.
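The symptom described here (a wrapper tag such as main disappearing during a transform) can be checked mechanically. Below is a minimal sketch, assuming a regex-based comparison of the HTML before and after the transform; `findDroppedTags` and the sample strings are hypothetical and not part of the digital garden template.

```javascript
// Hypothetical helper (not part of the digital garden template): report which
// tags have an opening tag in the HTML before a transform but not afterwards.
function findDroppedTags(before, after, tagNames = ["main", "header", "footer"]) {
  // True if `html` contains an opening tag called `name`, with or without attributes.
  const hasOpeningTag = (html, name) =>
    new RegExp(`<${name}(\\s[^>]*)?>`, "i").test(html);

  return tagNames.filter(
    (name) => hasOpeningTag(before, name) && !hasOpeningTag(after, name)
  );
}

// Made-up example: simulate a transform that swallowed the <main> wrapper.
const rendered = '<main class="content"><blockquote>note</blockquote></main>';
const transformed = "<blockquote>note</blockquote>";
console.log(findDroppedTags(rendered, transformed)); // [ 'main' ]
```

A check like this only detects that a tag went missing; it does not explain why the parser dropped it.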

To fix the function I needed to replace the parser, so I took the first one from npm that didn't fail on the HTML generated from my markdown note. I went with cheerio; I don't know if that is suitable for you.

The fix has been pushed to my dev garden: https://github.com/bayang/garden-dev

oleeskild commented 1 year ago

Thanks for investigating this, and taking the time to create a PR 🙌

The template uses node-html-parser because it is significantly faster and "lighter" than cheerio (see https://www.npmjs.com/package/node-html-parser). That said, slow but correct output is better than fast but incorrect output. But maybe we can find a way to fix this without using cheerio. Can you paste an example of a note that gets broken here?

bayang commented 1 year ago

Hmm, I think I used the notes in my dev garden to reproduce:

https://github.com/bayang/garden-dev/tree/main/src/site/notes

I can't remember whether the notes' content is still "wrong", though. Try the test.md note first; I think this one will fail. To make progress with my debugging, I first called the valid method of the node parser (visible in the npm link you pasted above). This call returned false for my notes, so I guess the parser considers the HTML generated by the markdown renderer to be invalid.
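The debugging step above could be sketched roughly as follows. The helper name `findInvalidNotes` and the sample data are hypothetical. The validator is passed in as a parameter so the snippet runs without dependencies, and `looksBalanced` is only a naive stand-in (an assumption, not the parser's real logic) for the parser's own validity check.

```javascript
// Hypothetical helper: return the names of notes whose rendered HTML the
// given validator function rejects.
function findInvalidNotes(renderedNotes, isValid) {
  return Object.entries(renderedNotes)
    .filter(([, html]) => !isValid(html))
    .map(([name]) => name);
}

// Naive stand-in validator (assumption, NOT node-html-parser): checks only that
// opening and closing tags are balanced, skipping a few void elements.
const VOID_TAGS = new Set(["br", "hr", "img", "input", "meta", "link"]);
function looksBalanced(html) {
  const stack = [];
  for (const [, closing, name] of html.matchAll(/<(\/?)([a-zA-Z][\w-]*)[^>]*>/g)) {
    const tag = name.toLowerCase();
    if (VOID_TAGS.has(tag)) continue;
    if (!closing) stack.push(tag);
    else if (stack.pop() !== tag) return false;
  }
  return stack.length === 0;
}

// Made-up sample data, keyed by note file name.
const sample = {
  "ok.md": "<main><p>fine</p></main>",
  "broken.md": "<main><blockquote><p>unclosed</blockquote></main>",
};
console.log(findInvalidNotes(sample, looksBalanced)); // [ 'broken.md' ]
```

In the garden itself, one would presumably pass the parser's own check instead of the stand-in, for example `const { valid } = require("node-html-parser")` followed by `findInvalidNotes(rendered, valid)`, where `rendered` is whatever map of note names to HTML the build produces.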

I'll try to find more information if I have some notes. If I recall correctly, the main tag of the HTML is swallowed by the parser. You can inspect the HTML of this page to see that the main tag is not present: https://notes.thatother.dev/physics/standard-model/