Spikestuff opened this issue 3 months ago
This suggestion runs into the same problems that were raised in #1711.
How does this work if the url returns different results to different requests?
It doesn't. Only using a checksum or switching to uploaded images can defend against that kind of attack.
"it goes into the same kind of bucket as javascript side checks, it's honor system really, but it will filter out well meaning mistakes and lazy abusers, which is going to be most things" --adelikat on Discord
So all we can do is check the URL and verify that it ends with an acceptable extension.
So, some problems with actually trying to enforce an extension (examples based on real user avatar paths):

- foo.com/id
- foo.com/id?format=png
- foo.com/id?format=png&size=125x125
- foo.com/path.png?size=125x125

These are all legitimate avatars that meet the site requirements. I could try to look for the extension somewhere in the path, even without the dot, but the more I look for, the less we are actually enforcing. I'm not sure there is much value in this.
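To illustrate the point, here's a hypothetical naive suffix check (the `ALLOWED` tuple is an assumed whitelist, not anything from the codebase) run against the real avatar URLs quoted above. Every one of them fails, even though all four are legitimate images:

```python
# Hypothetical naive extension check; demonstrates why suffix enforcement
# rejects legitimate avatar URLs that serve images without a path extension.
ALLOWED = (".png", ".jpg", ".jpeg", ".gif")

def has_image_extension(url: str) -> bool:
    return url.lower().endswith(ALLOWED)

for url in ("foo.com/id",
            "foo.com/id?format=png",
            "foo.com/id?format=png&size=125x125",
            "foo.com/path.png?size=125x125"):
    print(url, has_image_extension(url))  # all four print False
```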
When the requirements are only to detect a select few filetypes and extract the image dimensions, most of the problems with server-side processing go away, since it's just magic bytes plus parsing a couple of integers: foolproof.

- **PNG** begins with `[0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00, 0x00, 0x00, 0x0D, 0x49, 0x48, 0x44, 0x52]`, followed by the width (u32 BE: `[0, 0, 0, <125]`) and height (ditto).
- **JPEG** begins with `[0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 0x4A, 0x46, 0x49, 0x46, 0x00]`. The subformats look super confusing, but TIL it uses a really clever system somewhat like escape characters, and you need only scan through for one of a few start-of-frame markers such as `[not 0xFF, 0xFF, 0xC0, _, _, _]`, which will be followed by the height (u16 BE: `[0, <125]`) and width (ditto).
- **GIF** begins with `[0x47, 0x49, 0x46, 0x38, 0x37 or 0x39, 0x61]`, followed by the width (u16 LE: `[<125, 0]`) and height (ditto).
- Other formats would need something like `magick convert`, so I'm not sure what to do about those.
- According to https://datatracker.ietf.org/doc/html/draft-zern-webp, **WebP** always begins with ASCII "RIFF", followed by the filesize minus 8, followed by "WEBP".
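The byte-level checks above can be sketched roughly as follows. This is a Python illustration of the idea, not the project's actual implementation; enforcing the 125x125 limit is left to the caller, and only the three formats described above are handled:

```python
import struct

# Magic bytes of the PNG signature plus the start of the IHDR chunk.
PNG_MAGIC = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A,
                   0x00, 0x00, 0x00, 0x0D, 0x49, 0x48, 0x44, 0x52])

def sniff_dimensions(data: bytes):
    """Return (format, width, height), or None if the bytes aren't PNG/GIF/JPEG."""
    if data.startswith(PNG_MAGIC):
        # IHDR: width then height, each a big-endian u32, right after the magic.
        w, h = struct.unpack_from(">II", data, len(PNG_MAGIC))
        return ("png", w, h)
    if data[:6] in (b"GIF87a", b"GIF89a"):
        # Logical screen descriptor: width then height, each a little-endian u16.
        w, h = struct.unpack_from("<HH", data, 6)
        return ("gif", w, h)
    if data.startswith(b"\xff\xd8\xff"):
        # Walk the marker segments until a start-of-frame (SOF0..SOF3) appears;
        # its payload is length(2), precision(1), height(2), width(2).
        i = 2
        while i + 9 <= len(data):
            if data[i] != 0xFF:
                return None  # malformed stream
            marker = data[i + 1]
            if 0xC0 <= marker <= 0xC3:
                h, w = struct.unpack_from(">HH", data, i + 5)
                return ("jpeg", w, h)
            (seg_len,) = struct.unpack_from(">H", data, i + 2)
            i += 2 + seg_len
        return None
    return None
```

A caller would then just verify `w <= 125 and h <= 125` on the returned tuple and reject everything else.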
Am I being overly paranoid in not wanting to do this server-side? I would have to make a call out to an unknown site, pull the image, and process it, only to throw it away. I feel like this opens up a potential attack vector.
No, it's perfectly understandable that you wouldn't want ImageMagick, written in a non-memory-safe language, ingesting untrusted data on the same machine as the site. I'm saying there's another way if you're willing to put in some effort. I would assume the BCL has some way to safely make a GET request and read only the first N bytes?
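The idea of reading only the leading bytes can be sketched as follows. The thread is about the .NET BCL, but the principle is the same in any HTTP client that streams the response body; this Python version uses the standard library and is only an illustration:

```python
# Read at most n bytes of a response body. The body is streamed from the
# socket, so bytes beyond n are never pulled into memory by this call.
from urllib.request import urlopen

def fetch_header_bytes(url: str, n: int = 64, timeout: float = 5.0) -> bytes:
    with urlopen(url, timeout=timeout) as resp:
        return resp.read(n)
```

Those few dozen bytes are enough for every magic-number and dimension check discussed above, which bounds how much untrusted data the parser ever touches.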
Don't assume, please look it up. I don't think there is a way without making and completing the request in memory, and even then, isn't it the same vulnerability, just with fewer bytes?
From the profile settings page it says:
So maybe we should peek at the URL, restrict it, and return an error when we get a website URL or a non-usable link? (And another for the anti-abuse side, but the only way you would know about this is by using a mobile device.)
We also need a way to deal with a fake-format link that would bypass the logic we have. The server must confirm it's actually an image, and not a dead link or something like `example.com/aurl?q=foo#bypass.png`.
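The fragment trick is worth spelling out: a naive suffix check on the full URL string passes, yet everything after `#` never leaves the client, so the server is handed a path with no image extension at all. A quick sketch with Python's standard URL parser:

```python
# Everything after "#" is a client-side fragment and is never sent in the
# HTTP request, so ".png" in the fragment proves nothing about the resource.
from urllib.parse import urlparse

url = "https://example.com/aurl?q=foo#bypass.png"
parts = urlparse(url)
print(parts.path)            # /aurl
print(parts.fragment)        # bypass.png
print(url.endswith(".png"))  # True, yet the server never sees ".png"
```

This is another argument for verifying the fetched bytes themselves rather than any property of the URL string.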