Open tsmethurst opened 5 months ago
OK, this does affect normal tooting as well: `test & test & test` outside of accent-gravis code blocks becomes `test & test & test`
It has an even worse impact: the HTML for that (in the database) is `<p>test & test & test</p>`, so the `&` from Markdown (which should have become `&amp;`) actually became just `&`.
Tbh, including `&` in your text is not really a "normal" toot, but yes. We'll fix this after 0.14.0 is out probably. Unless I'm misremembering, it's not really a trivial quick fix because of the order in which we need to handle sanitization, minimization, etc., so it will require a bit of time and focus to do it properly.
tobi dixit:
> Tbh, including `&` in your text is not really a "normal" toot
If I want to write exactly “&” then yes, kinda ;-)
But conversion to ‘&’ is definitely wrong; if anything, it should have become ‘&’ though I would have expected ‘&’ but someone should probably check what CommonMark says.
> We'll fix this after 0.14.0 is out probably.
OK.
I checked if I could trigger JavaScript from it, but I think the sanitiser runs afterwards and thus catches it, so it’s “only” an unterminated entity (I have seen browsers cut off at the ampersand until the next semicolon somewhere in the file or even until EOF if there was none, but not sure if there are any Fedi clients that would do that).
bye, //mirabilos
> think the sanitiser runs afterwards and thus catches it
It does yeah.
As the title says, when we parse something from markdown we don't end up HTML-escaping any ampersands to `&amp;` in the resulting HTML. We should try to squeeze this into our markdown-to-HTML parsing and sanitization flow somewhere, without totally mangling any existing escapes, as not escaping the ampersand may cause issues with some HTML parsers (haven't seen any yet but you know how it is).
Credit to @mirabilos for noticing this: https://github.com/superseriousbusiness/gotosocial/issues/2610#issuecomment-1930525004
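
To make the "escape ampersands without mangling existing escapes" idea concrete, here is a minimal sketch in Go. It is not GoToSocial's actual code: the function name `escapeBareAmpersands` and the regular expression are assumptions made up for illustration, and it says nothing about where in the parsing/sanitization order such a step should run. It only uses the standard library `regexp` package.

```go
package main

import "regexp"

// ampOrEntity matches either something shaped like a character reference
// (named, decimal, or hex, e.g. &amp; &#38; &#x26;) or a lone ampersand.
// Go's regexp prefers the leftmost alternative, so the entity branch is
// tried first and the bare "&" branch is only the fallback.
var ampOrEntity = regexp.MustCompile(`&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);|&`)

// escapeBareAmpersands (hypothetical helper) rewrites every lone "&" to
// "&amp;" and leaves anything that already looks like an entity untouched.
func escapeBareAmpersands(s string) string {
	return ampOrEntity.ReplaceAllStringFunc(s, func(m string) string {
		if m == "&" {
			return "&amp;"
		}
		return m // already entity-shaped: keep as-is
	})
}
```

With this sketch, `escapeBareAmpersands("test & test")` yields `test &amp; test`, while input that already contains `&amp;` passes through unchanged. That second behaviour is exactly the trade-off discussed above: a user who typed an entity literally and wanted it shown verbatim would still not get that, so whether to preserve or double-escape existing entities remains a design decision (and one to check against CommonMark).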