image-rs / image

Encoding and decoding images in Rust
Apache License 2.0
5.01k stars 619 forks source link

Storing images to FITS files #2214

Closed sunipkm closed 7 months ago

sunipkm commented 7 months ago

The Flexible Image Transport System, FITS, is an open standard used extensively in astronomy (both amateur and scientific) and scientific research. The FITS standard supports storing images in a lossless format with optional data compression, with comprehensive metadata support. Adding FITS support to the image crate will be great for extending its use in the scientific data collection space.

Caveat: Currently, the only feature-complete FITS library for Rust is the fitsio crate. This crate wraps the C libcfitsio library in order to provide FITS functionality, which cannot be (easily) compiled on the more exotic targets (e.g. wasm32-unknown-unknown) as the object files from libcfitsio are required.

fintelia commented 7 months ago

Before adding FITS support, I'd want to understand how commonly the file format is used. In particular, since the format supports up to 999 dimensions (!) and a whole bunch of channel formats that this library cannot handle, it would be good to know how common it is for 2D arrays of pixels

However, before getting there the libcfitsio dependency is somewhat of a blocker. We really try to avoid C dependencies and have been working to remove the ones already present (which are already all gated behind non-default feature flags). There's a bunch of reasons for it including the portability concern you mention, compatibility with fuzzing, and limiting unsafe code.

None of this is to say that there's anything stopping you from using this crate alongside fitsio today. We very intentionally expose ways of getting access to the contents of our image types, which means that you could write an external crate to do the necessary adaptation to interface with fitsio. In fact, this could even be a path towards building a pure-Rust FITS crate that could later serve as a codec backend for this crate, in much the same way that we internally use the png, tiff, zune-jpeg, etc. crates.

sunipkm commented 7 months ago

In astronomy FITS is THE standard to go with - everything from Hubble to JWST to you name it uses FITS. Most images are however monochrome, and image can only support FITS encode and not decode.

I agree that the libcfitsio dependency is somewhat of a blocker, which is why in my implementation it is hidden behind a feature flag. I am working on a pure rust impl of fitsio, but that will definitely take time.

fintelia commented 7 months ago

image can only support FITS encode and not decode

Why is that?

I agree that the libcfitsio dependency is somewhat of a blocker, which is why in my implementation it is hidden behind a feature flag. I am working on a pure rust impl of fitsio, but that will definitely take time.

I don't think we should add this as a C dependency even behind a feature flag. Please do share updates on the pure Rust implementation you're working on though, I think that could be quite useful for the Rust ecosystem!

sunipkm commented 7 months ago

Since image crate does not support arbitrary dimensions etc, I do not see why image crate has to support reading an arbitrary FITS file, which may not even contain an image. FITS, however, does support storing the kind of images that image crate works with, so writing to FITS poses no such problems.

Is there a rationale behind not adding a C dependency behind a feature flag?

I must also add, before FITSIO goes through, adding image metadata will be imperative. I don't see too many people being interested in a FITS file that simply contains image data and no other associated information.

fintelia commented 7 months ago

I must also add, before FITSIO goes through, adding image metadata will be imperative. I don't see too many people being interested in a FITS file that simply contains image data and no other associated information.

Could you say a bit more about the flow here. Is the idea that the user would load a PNG or JPEG containing an EXIF chunk, and then we'd parse that chunk and write each of the metadata fields into the FITS file?

Is there a rationale behind not adding a C dependency behind a feature flag?

Yes, a lot of the maintainer overhead is the same regardless of whether the dependency is behind a feature flag or not. We'd need to be able to run tests for the module locally, build and test it in our CI, and ensure that the module was included in the generated documentation on docs.rs. It also takes away from other goals like getting higher levels of fuzzer coverage or reducing the amount of unsafe code. And once people are using the C dependency it becomes a potentially breaking change to switch to a pure Rust implementation if there isn't perfect feature parity, which can make it controversial as we discovered when moving away libwebp.

sunipkm commented 7 months ago

Could you say a bit more about the flow here. Is the idea that the user would load a PNG or JPEG containing an EXIF chunk, and then we'd parse that chunk and write each of the metadata fields into the FITS file?

I do not see a user loading a PNG or JPEG image, containing an EXIF chunk, and writing that into a FITS file. Data for scientific applications would already be received as a FITS file by the generating application. The use case for this, really, is storage of captured images. DynamicImage is great for packaging an image in a Rust application that is operating a camera. The relevant metadata is also generated at the time of capture, such as timestamp, ROI, detector temperature, etc. The user, if necessary, can then perform simple operations (such as rotation, flipping) with the standard interface that DynamicImage provides, and write it out to a FITS file. This is also the reason image crate does not need to be able to import arbitrary FITS files.

Shnatsel commented 7 months ago

I think it's best to provide this functionality outside the image crate itself. All the necessary types and traits for it are public. You can look at the webp crate that wraps libwebp in C to see how it's done.