matrix-org / matrix-rust-sdk

Matrix Client-Server SDK for Rust
Apache License 2.0

Add the option to strip sensitive EXIF data from images #2325

Open anoadragon453 opened 1 year ago

anoadragon453 commented 1 year ago

It would be a nice addition to Matrix clients to help preserve privacy by stripping sensitive EXIF data from images that are uploaded. Obviously, non-sensitive metadata, such as image rotation, should be preserved.

I propose an option rather than default-always stripping sensitive EXIF data, as there are some situations (perhaps you are sharing images between your forensic analysis peers) where you do not want the client to modify the image file.

poljar commented 1 year ago

We discussed this a while ago when EX needed EXIF stripping. While I agree that it would be nice to have this functionality, the only viable crate for this seems to be rexiv2, which wraps the Exiv2 C++ library.

We're trying to minimize or ideally not depend on non-Rust libraries, since that tends to complicate cross compilation quite significantly.

anoadragon453 commented 1 year ago

That's entirely fair. If others have any alternative suggestions, please list them on this issue to help unblock it. A place to start looking may be other Rust-based applications that do EXIF parsing. Perhaps they could separate their logic out into a maintained crate.

I took a look personally and could only find little_exif, which is pure Rust and does allow for reading and writing, but hasn't seen a commit in close to a year.

bnjbvr commented 10 months ago

For what it's worth, I was thinking about this in the back of my head, and stumbled onto little_exif too, which is licensed under the usual MIT or Apache 2.0 terms. The crate seems simple and small enough that we could import the subset required to read the EXIF data (namely the GPS tags), remove those tags, and save the metadata back into the file.

poljar commented 10 months ago

Yeah, that might be doable. Though we likely need to remove the metadata without writing a file, i.e. do it on the fly while we're uploading/encrypting the data.
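To make the "without writing a file" idea concrete, here is a minimal sketch of stripping EXIF from a JPEG that is already in memory, using only the standard library. It is not SDK code: the function name `strip_exif_segments` is hypothetical, and it makes simplifying assumptions. It drops every APP1 segment that carries an `Exif\0\0` header in full, so it would also discard the orientation tag that the original request wants preserved, and it gives up (returns `None`) on anything that does not look like a plain sequence of JPEG segments.

```rust
/// Returns a copy of `jpeg` with every EXIF APP1 segment removed.
///
/// Hypothetical helper for illustration only. Returns `None` if the input
/// does not start with a JPEG SOI marker or its segment headers are malformed.
pub fn strip_exif_segments(jpeg: &[u8]) -> Option<Vec<u8>> {
    const SOI: [u8; 2] = [0xFF, 0xD8]; // start of image
    const APP1: u8 = 0xE1; // application segment used by EXIF
    const SOS: u8 = 0xDA; // start of scan: compressed image data follows
    const EOI: u8 = 0xD9; // end of image
    const EXIF_HEADER: &[u8] = b"Exif\0\0";

    if !jpeg.starts_with(&SOI) {
        return None;
    }

    let mut out = Vec::with_capacity(jpeg.len());
    out.extend_from_slice(&SOI);
    let mut pos = 2;

    while pos < jpeg.len() {
        // Each header segment starts with 0xFF followed by a marker byte.
        if jpeg[pos] != 0xFF {
            return None;
        }
        let marker = *jpeg.get(pos + 1)?;

        // Metadata segments only appear before the scan data, so once the
        // scan (or the end of the image) is reached, copy the rest verbatim.
        if marker == SOS || marker == EOI {
            out.extend_from_slice(&jpeg[pos..]);
            return Some(out);
        }

        // The big-endian length counts its own two bytes but not the marker.
        let len = u16::from_be_bytes([*jpeg.get(pos + 2)?, *jpeg.get(pos + 3)?]) as usize;
        if len < 2 {
            return None;
        }
        let end = pos + 2 + len;
        let segment = jpeg.get(pos..end)?;
        let payload = &segment[4..];

        let is_exif = marker == APP1 && payload.starts_with(EXIF_HEADER);
        if !is_exif {
            out.extend_from_slice(segment);
        }
        pos = end;
    }

    // Ran out of bytes before reaching the scan data: treat as malformed.
    None
}
```

Because the function maps `&[u8]` to `Vec<u8>`, it could run on the buffer just before the upload or encryption step. Keeping the orientation tag or removing only the GPS tags would instead require parsing the TIFF structure inside the APP1 payload, which this sketch does not attempt.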

timokoesters commented 8 months ago

Does Element X already generate thumbnails for images? Is that handled by the SDK? The same dependency could probably be used to do a deserialize-serialize cycle to get rid of metadata. In Rust that would be the image crate.

Hywan commented 8 months ago

I wonder if the SDK is responsible for doing that. At least for the EXIF only. To me, it looks like it's either the responsibility of the client, or the server, but not the SDK.

Right now, when we upload a media file, we take the raw bytes and send them (with a MIME type). We don't modify the content of the media.

Anyway, if we decide to take this path, we must introduce an intermediate type to represent a media content, and then provide some features on it, like resize, strip_exif and so on.
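As a rough illustration of what such an intermediate type might look like, here is a minimal sketch. Everything in it is hypothetical: `MediaContent`, `transform_if` and the accessors are names invented for this example and are not part of the SDK. Features like `resize` or `strip_exif` could be layered on top as pre-processing steps passed to `transform_if`.

```rust
/// Hypothetical intermediate type: raw media bytes plus their MIME type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MediaContent {
    mime_type: String,
    data: Vec<u8>,
}

impl MediaContent {
    /// Wraps raw bytes without modifying them.
    pub fn new(mime_type: impl Into<String>, data: Vec<u8>) -> Self {
        Self { mime_type: mime_type.into(), data }
    }

    /// Applies `step` to the bytes only if the MIME type starts with
    /// `mime_prefix`; otherwise the content is passed through unchanged.
    pub fn transform_if(
        mut self,
        mime_prefix: &str,
        step: impl FnOnce(Vec<u8>) -> Vec<u8>,
    ) -> Self {
        if self.mime_type.starts_with(mime_prefix) {
            let data = std::mem::take(&mut self.data);
            self.data = step(data);
        }
        self
    }

    /// The MIME type supplied at construction.
    pub fn mime_type(&self) -> &str {
        &self.mime_type
    }

    /// Consumes the wrapper and returns the (possibly transformed) bytes,
    /// ready to be handed to the upload path.
    pub fn into_bytes(self) -> Vec<u8> {
        self.data
    }
}
```

A caller would chain steps, for example `MediaContent::new("image/jpeg", bytes).transform_if("image/", my_strip_fn).into_bytes()`, where `my_strip_fn` is whatever stripping routine the client chooses. Keeping the steps as plain closures leaves the choice of media library to the client, which matches the concern about avoiding non-Rust dependencies in the SDK itself.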

I'm asking @stefanceriu, @bmarty and @zecakeh whether this would make life simpler for clients (like Element X and Fractal): do you want the SDK to provide an API to manipulate the media (removing EXIF, resizing, stuff like that; please ask for more if you have ideas)?

zecakeh commented 8 months ago

There is already an API that does some image manipulation (resizing) to generate the thumbnail from the raw bytes, with AttachmentConfig. It could be easily extended to add new pre-processing features before sending the media.

At the time it was also planned to include blurhash generation, but we didn't, given the lack of a maintained Rust library. Now, though, the blurhash crate has found new maintainers, and there is talk about switching to thumbhash, which has a Rust implementation.

It would certainly make our life easier, because we don't have to implement these features ourselves, but if we are the only app that would benefit from this right now, it might not be worth it for the SDK.

stefanceriu commented 8 months ago

Element X iOS already strips EXIF data, generates blurhashes and thumbnails (1, 2) and converts videos. Most of the work is done in the MediaUploadingPreprocessor.

They weren't particularly hard to implement and I would much rather trust system libraries for dealing with media, especially for videos. I wouldn't necessarily expect the SDK to do this for us, but I can see it being useful as a separate crate/helper (especially for Rust clients).