Closed babababag closed 2 years ago
Wait, why it closed this in here?!?
Wait, why it closed this in here?!?
Oops I didn't think adding "fix" in my commit from another repo would close the issue: https://github.com/unixfox/invidious-custom/commit/385d07ce3c96a388de32190469329e8a629f8899
That is not supposed to happen D:
@SamantazFox @unixfox
This is still happening, at least in Firefox.
It's still happening because this content-disposition header is still urlencoding strings with +
instead of %20
:
https://github.com/iv-org/invidious/blob/170e75499816af460d76d05ce4d440d37a01e0fb/src/invidious/routes/video_playback.cr#L161-L161
There are two directives allowed in the content-disposition header, filename
and filename*
. The filename*
directive only supports percent-encoded characters and some common literal ascii characters, including the plus character, so +
means a literal plus (this is the issue). The filename
directive only officially supports a subset of these characters, but browsers accept percent-encoded characters here too. In both cases replacing all the +
characters with %20
in this content-disposition header would fix the problem.
The merged pull request #2684 introduces a bug where if a video title has an ampersand in it, the downloaded filename will be truncated to everything before the ampersand.
This is what the PR changed. The title is decoded here to the true string representation of the title: https://github.com/iv-org/invidious/blob/170e75499816af460d76d05ce4d440d37a01e0fb/src/invidious/routes/video_playback.cr#L243-L243
But it's still inserted unescaped into the location header. Previously this was "fine" because as the title was never decoded, inserting it as-is meant that it was already urlencoded as the location header would expect. https://github.com/iv-org/invidious/blob/170e75499816af460d76d05ce4d440d37a01e0fb/src/invidious/routes/video_playback.cr#L276-L276 It was "fine" (but not really) because if you gave it a string that wasn't already urlencoded, it would still have been inserted into the url unescaped so you would've been able to insert arbitrary parameters into the url. I didn't report this because I was pretty sure the impact was limited to your own browser, but still...
Example location header for this redirect:
Location: /videoplayback?expire=1644344668&ei=_GACYobuBePshwbSyKDwBg&ip=135.148.149.204&id=o-ABCWOwFCpELJ9T9-9yBiLVxbwSYnmHNSOHwHPdEPMrQ9&itag=22&source=youtube&requiressl=yes&mh=ks&mm=31%2C29&mn=sn-tt1eln7e%2Csn-tt1e7n7l&ms=au%2Crdu&mv=m&mvi=6&pl=25&initcwndbps=335000&vprv=1&mime=video%2Fmp4&ratebypass=yes&dur=44.744&lmt=1614897234880460&mt=1644322873&fvip=6&fexp=24001373%2C24007246&c=ANDROID&txp=6216222&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRQIgCXgi1Q7Y0gX78xqp0NFoxjvBKY5yU-cav6kKiHaiXR0CIQDzFo95FI8OHtuzm3IFn3eIDgfspyIEYJ6PAanH-r0woQ%3D%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRgIhAPfnmTL7_aM6gfwZzFNGjD8PlXXnKt-sUW3hiwgcGQ90AiEA6WARIHUwPtj6CMARdgQKXzGa1fgsC4eAvRxKUR_Gr_k%3D&host=rr6---sn-tt1eln7e.googlevideo.com&title=Toby & Felix Show Off Their RAT Collectibles-lYPwQuGCcvQ.mp4
The title
parameter is at the end. The ampersand in the video title is not escaped, so the title parameter is Toby
(with a space at the end) and the rest is garbage. Someone who controls the video title can insert arbitrary parameters into this url.
The immediate fix is to replace +
with %20
in content-disposition headers. There's still the issue of inserting arbitrary url parameters though...
It's weird that the title in the download widget's json is urlencoded instead of being json-encoded. example:
GET /latest_version?download_widget={"id":"lYPwQuGCcvQ","itag":"22","title":"Toby+%26+Felix+Show+Off+Their+RAT+Collectibles-lYPwQuGCcvQ.mp4"}
Urlencoded strings happen to always be safe json strings, undeniable bad vibes tho.
If you urlencode instead of json-encode the title, either you have to trust that the title is urlencoded (which allows inserting arbitrary url parameters, impact probably limited but why trust unnecessarily?), or you don't trust the title so you have to try to decode it and then encode it back again for the location header, which for legitimate input is a no-op.
The security issue is probably of no consequence but IMO you must still defend against it because you shouldn't be able to control url parameters like that. But to me it just feels wrong to even spend cpu cycles doing what is a no-op for legit input. If the title was json-encoded instead, there would be one decoding step json -> plaintext
(to get the title) and one encoding step plaintext -> urlencoded
(for the location header).
json -> plaintext -> urlencoded
(short, semantic, amazing).json -> unsafe urlencoded
(unsafe location header, you control it).json -> unsafe urlencoded -> plaintext
(unsafe location header, video title controls it).json -> unsafe urlencoded -> plaintext -> urlencoded
(ridiculous).@babababag thanks for so many details :D Will look into that!
Steps to reproduce
This happens because the filename is urlencoded before being sent in the Content-Disposition header (here and here). e.g.
content-disposition: attachment; filename="lorem+ipsum"; filename*=UTF-8''lorem+ipsum
Urlencoding is not compliant in the
filename
directive but apparently browsers ignore the RFC and decode it anyway. Urlencoding is ok in thefilename*
directive but only with percent symbols, so+
means a literal plus not a space. https://httpwg.org/specs/rfc6266.html#disposition.parameter.filenameThe fix is to replace
+
with a literal space. I would PR but idk crystal and it'd take 2 seconds for one of you guys to do it anyway.