Closed trysdyn closed 1 year ago
This is done here: https://github.com/superseriousbusiness/gotosocial/blob/b6fbdc66c1ce1ec61ebfb6fcc0351ea627a1d288/internal/web/opengraph.go#L134-L144
My guess is the "safe to use as a template.HTMLAddr" is why this happens. I wonder if it might be better to &-encode these instead?
Mmm could be better yeah... Thanks for looking at this :)
Haha, actually, what if instead of doing it by hand we call html.EscapeString
, the counter-part of html.UnescapeString
. It seems to do what we want:
var htmlEscaper = strings.NewReplacer(
`&`, "&",
`'`, "'", // "'" is shorter than "'" and apos was not in HTML until HTML5.
`<`, "<",
`>`, ">",
`"`, """, // """ is shorter than """.
)
// EscapeString escapes special characters like "<" to become "<". It
// escapes only five such characters: <, >, &, ' and ".
// UnescapeString(EscapeString(s)) == s always holds, but the converse isn't
// always true.
func EscapeString(s string) string {
return htmlEscaper.Replace(s)
}
It doesn't look like \
or \n
are considered a problem for HTML, although we could encode the \
as \
which would also take care of the \n
case.
Aha okay, so we could unescape it, sanitize it, and escape it again?
Describe the bug with a clear and concise description of what the bug is. Please include screenshots of any visual issues.
When generating the og:description OpenGraph tag for a plaintext post, GtS strips escape
(\)
characters, which can result in the OpenGraph embed presenting a shell command or code that is not what was originally authored in the post.What's your GoToSocial Version?
v0.7.1 tag, source build
Browser version
n/a
What happened?
I authored a post. In this post I include the following shell command as a recommendation for people to try.
echo '\e]8;;http://example.com\e\This is a link\e]8;;\e'
Someone then linked my post in Discord, which presented the unfurl as generated by the OpenGraph tags. There, the command was mangled to this:
echo 'e]8;;http://example.comeThis is a linke]8;;e
Note the missing backslashes. This significantly changes the impact of the command :)
I've verified the slashes are stripped on the og:description tag, so it's not Discord's doing.
What you expected to happen?
The escape slashes to be left in the post text in og:description.
How to reproduce it?
Post anything with the Public visibility level that contains any number of \ characters, then inspect the post in the frontend and view source to see the \s were stripped in og:description.
Anything else we need to know?
No response