superseriousbusiness / gotosocial

Fast, fun, small ActivityPub server.
https://docs.gotosocial.org
GNU Affero General Public License v3.0

[bug] 422 Unprocessable Entity error for WebP image with watermark #3579

Closed Rengyr closed 2 days ago

Rengyr commented 3 days ago

Describe the bug with a clear and concise description of what the bug is.

When trying to make a post with a WebP photo that was created in DigiKam with a watermark, the upload fails with "422 Unprocessable Entity". The same photo without the watermark uploads correctly. The photo with the watermark (attached here) works fine on a Mastodon instance and in desktop image viewers. test_image.zip

What's your GoToSocial Version?

GoToSocial 0.17.3+git-6f4cb2f

GoToSocial Arch

amd64 binary

What happened?

Posting a status with an attached WebP image that has a DigiKam watermark fails.

What you expected to happen?

Correct upload of the photo to the GTS instance.

How to reproduce it?

Anything else we need to know?

The watermark was added in DigiKam (8.3.0): https://userbase.kde.org/Digikam/Watermark

daenney commented 2 days ago

That's very interesting. I always thought watermarks are blended into the image but turns out that's not the case. A little surprising since that makes it trivial to remove them. But I'm guessing this is what's tripping us up during processing, with potentially more than one image layer being present.

Rengyr commented 2 days ago

I've noticed that during the conversion from the original .tiff file to .webp, the EXIF information from the .tiff was left in place, and it might trip something on the GTS decode/encode side? Technically the file is "malformed", as the EXIF is wrong with regard to the converted .webp format, but it shouldn't be used for anything on the GTS side and it is stripped in the end anyway afaik.

Edit: If the metadata is always stripped on the GTS side, I would maybe strip it before processing the file (if that is possible?). Otherwise it is up to you to decide if it is worth fixing/adding support for WebP files with wrong EXIF.

kouya commented 2 days ago

This was a really weird one and I have no idea what happened really. We changed something on the file and while it worked to upload, GTS converted it into this after upload https://icy.arcticfluff.eu/fileserver/01H7M8TDKTPK27P4XVHP29P8RT/attachment/original/01JDQJV0AF8JM4TYB1837GJTXB.webp

It's a malformed WebP that Mastodon refuses to read. Funnily, Firefox displays it while all my image software says the file is invalid.

I have rewritten the original files with Krita and they work. All I could find out was that the EXIF information said that the images were 16 bit per channel. Maybe that confused the golang image library? Though I think it had nothing to do with watermarks, as the watermark in this case was just overlaid text (I think) on a weirdly exported WebP image. I don't think the format supports multiple layers? I might be wrong though.

Edit: for reference, in case you want to figure something out with it, the original file that led to the corrupted one linked in this post: https://files.arcticfluff.eu/IMG_8016w.webp It's slightly different from the original test file, I think only the watermark addition is the difference, but it worked to upload.

tsmethurst commented 2 days ago

I tried opening the test image in GNU IMP just to see what it was and it actually crashed the program :') For GtS, it looks like the 422 is occurring not from stripping the exif data with exif-terminator, which seems to have no problem managing it, but at the thumbnailing stage when our embedded ffmpeg tries to read the image in order to create a thumbnail out of it:

Error #01: StoreLocalMedia: error processing media: store: error generating image thumb: ffmpeg: non-zero return code 1 (Error while decoding stream #0:0: Invalid data found when processing input\nCannot determine format of input stream 0:0 after EOF\nError marking filters as finished\n)\n

NyaaaWhatsUpDoc commented 2 days ago

okay so some further confusion to add to this: if i remove "webp" as a possible target for exif-terminator, so that metadata cleaning is instead handled by ffmpeg, the file is correctly processed :')

i'm gonna start investigating whether there's something we can be doing differently in exif-terminator :p

(though it is worth noting that ffmpeg's metadata cleaning is best effort, so exif-terminator may actually be cleaning the file correctly, and just any cleaning of it may be what's breaking it)

kouya commented 2 days ago

I feel bad we created such an abomination image. D:

But it does really have something to do with metadata it seems. If I run exiftool -overwrite_original -all= IMG_7960_test.webp to strip all (or most) possible metadata on the original attached file, it posts completely fine.

NyaaaWhatsUpDoc commented 2 days ago

and fixed it :D https://codeberg.org/superseriousbusiness/exif-terminator/pulls/12

(it turns out we weren't iterating through all available WebP image chunks. i also added some little optimizations :p)