Closed seaerchin closed 1 year ago
@alexanderleegs tagging you separately in case there is anything that isn't back-compat
@alexanderleegs tagging you separately in case there is anything that isn't back-compat
I think this is fine, we previously didn't do any html encoding for the special pages either as far as i know?
creating new pages creates this commented out thing at the moment!
does this actually impact editing experience? this is injected due to sanitize
encountering an empty body and injecting it into a document. if it's concerning, we could always avoid sanitization if it's an empty string
creating new pages creates this commented out thing at the moment!
does this actually impact editing experience? this is injected due to
sanitize
encountering an empty body and injecting it into a document. if it's concerning, we could always avoid sanitization if it's an empty string
Could we put in the check for empty string then? As a user creating a new page, having something you didn't input immediately show up in your editing view probably isn't ideal
Problem
Previously, there was inconsistent behaviour caused by sanitization on the backend. This is because of dompurify's sanitization config, where it will automatically html encode certain special characters if it detects that there is a html tag.
This process affects not just content within the tag, but the string as a whole. For example,
will have both ampersands encoded even though the first one is outside of the b-tag.
Closes LS-81
Solution
In order to make sure that sanitization takes place properly, this PR establishes an invariant, as follows: frontmatter content is never html encoded.
This is chosen over html encoding all content in our frontmatter due to the existence of the 3 special pages (homepage/nav/contact-us), where users are able to input html (:sadge:)
In order to preserve this property, a few rules have to be followed (done alr at present)
sanitize
directly for markdown files but through a given interface (<convert|retrieve>DataFromMarkdown
).unescape
after sanitizing frontmatter.This allows us
Testing
Notes This change might be destructive - existing pages w html encoded properties in their frontmatter will have them unescaped. I'm not entirely sure how this impacts things like
permalink
etc but it's also possible to guarantee only for thetitle
property