Creating a new `PixelImage` type.

image-rs / image

Encoding and decoding images in Rust

Apache License 2.0

4.98k stars 618 forks source link

Creating a new `PixelImage` type. #2262

Open ripytide opened 5 months ago

ripytide commented 5 months ago

From #1523 it was discussed that renaming ImageBuffer to PixelImage or similar would be beneficial to emphasize the strong dependence on Pixel types like Rgb and Luma over other formats, especially compressed formats like .jpg represented as [u8].

Since we are renaming it I though it might be a good idea to discuss some other related design choices of the PixelImage type.

At the moment ImageBuffer is defined as:

pub struct ImageBuffer<P: Pixel, Container> {
    width: u32,
    height: u32,
    _phantom: PhantomData<P>,
    data: Container,
}

I have several questions when looking at this:

Why expect Container to Deref<Target=[P::Subpixel]> rather than Deref<Target=[P]> which is more strongly typed?
Why use a generic Container rather than using Vec<P> when Deref<Target[P::Subpixel]> is a required bound for nearly all methods?

Dependent on the answers to these questions, I would like to add an new type to the library:

pub struct PixelImage<P> {
    width: usize,
    height: usize,
    pixels: Vec<P>,
}

If we need a borrowed version of PixelImage we could create that also with an impl Borrow + impl Deref<>:

pub struct BorrowedPixelImage<'a, P> {
    width: usize,
    height: usize,
    pixels: &'a [P],
}

One yet unresolved question about this new type (from #1523) is:

Should we add a, yet to be implemented, ImageMetadata struct to it?

And then slowly deprecate and replace the old ImageBuffer type with PixelImage over the next few releases. What do you think?

HeroicKatora commented 4 months ago

Why expect Container to Deref rather than Deref which is more strongly typed?

Why use a generic Container rather than using Vec
when Deref<Target[P::Subpixel]> is a required bound for nearly all methods?

These are answered together. Because it is more strongly typed. Note that subpixel types are not specific to image while using P: Pixel is. It's feasible to use a container that doesn't even know about image or its pixel representations to hold the data for these pixels. An external type might incidentally implement Deref<[u8]> but surely it doesn't Deref<[Rgb<u8>]> and can then only be used in the former case with the current design. The second answer is quite strongly connected with the question in #1523, of whether these types are primarily simple constructors. I think there's definite value in imbuing an existing container with the layout of a pixel matrix, but it's not entirely clear to me if FlatSamples / or flat::View would already fill this niche entirely and thus eliminate the need to have the more opinionated owning representation. It'd be a simpler type having fixed the second parameter to Vec<_>, that's for sure. And it's unclear to how many uses the independence mentioned initially applies in practice.

Also, the borrowed PixelImage might be flat::ViewMut unless there are sufficient invariants that justify a separate type which might arise from PixelImage choosing the layout whereas the user has some control of layout with ViewMut.

Following #1523, ImageMetadata would not apply to the type. But I'm not as sure here, this may depend on the need for control in its save_as method, which will need to create such metadata and can only use the trivial choice without such attributes. It's not necessarily needed. At least in my idea of #1523 the users and use cases which do need detailed control can use an intermediate conversion and augment the buffer type with metadata in this representation. This way, it's not necessary to add Metadata to the PixelImage struct as a field, I think.

ripytide commented 4 months ago

An external type might incidentally implement Deref<[u8]> but surely it doesn't Deref<[Rgb]> and can then only be used in the former case with the current design

Ah ok, I think I can see the use-case:

let byte_vec: Vec<u8> = from_external_function();

let image: Image<Rgb<u8>, Vec<u8>> = Image {
    width,
    height,
    phantom: PhantomData,
    pixels: byte_vec,
};

Is that right?

In which case couldn't the user just use bytemuck to cast their Vec into a more strongly typed version if they know it's underlying representation (which they would need to for the above example also anyway):

let byte_vec: Vec<u8> = from_external_function();
let rgb_vec: Vec<Rgb<u8>> = bytemuck::cast_vec(byte_vec);

let image: Image<Rgb<u8>> = Image {
    width,
    height,
    pixels: rgb_vec,
};

Following https://github.com/image-rs/image/issues/1523, ImageMetadata would not apply to the type.

I see two options here:

Make an extra type for a metadata-aware PixelImage

struct PixelImage {pixels: Vec<P>}
struct PixelImageWithMetadata {pixel_image: PixelImage, metadata: ImageMetadata}

Make Metadata optional somehow within PixelImage

struct PixelImage {pixels: Vec<P>, metadata: Option<ImageMetadata>,}
// With an enum variant on ImageMetadata itself.
struct PixelImage {pixels: Vec<P>, metadata: ImageMetadata::Unknown,}

I don't feel too strongly either way between these two options as I feel like I could make either option work for all use-cases, the only slight downside to option 1. is that it makes two types so could be harder to work with compared to 2.

EmiOnGit commented 4 months ago

I'm rather new to this wonderful crate so excuse my lack of knowledge. What exactly should a potential ImageMetadata struct contain?

Adding metadata to the PixelImage feels a bit weird because the vast majority of operations on a PixelImage are direct transformations on the pixels itself. Metadata is not required for these operations and therefore adds unneeded complexity on the type. In the current implementation, I, as a enduser of the crate, am sure that all methods of PixelImage are not influenced by some metadata stored in the image. This makes the operations more transparent(without reading the source code or hoping for good docs).

I would be more in favor of having a new type, which holds the PixelImage and a ImageMetadata or don't combine them at all, but take the metadata as an argument to methods (if the amount of such methods is not to large).

ripytide commented 4 months ago

The main information I was thinking about was color-space information. Perhaps ColorSpaceInfo would be a better name so as not to be confused with .jpg metadata such as GPS information.

Some core operations require this information for accurate color-conversion, such as if you were to convert an PixelImage<Rgb<u8>> to an PixelImage<Gray<u8>> then the source and destination color-spaces are required knowledge, ie linear-RGB to gamma-corrected grayscale.

This type of conversion could be implemented on PixelImageWithMetadata slightly more easily than PixelImage since the information would be attached to it. If you have 50 images which might each have a different color-space then I think you would need PixelImageWithMetadata, that's the main use-case I can think of.

There's also the potential scenario where you may or may not know the color-space information about an image at run-time, this would require a hybrid type somewhere between PixelImage and PixelImageWithMetadata, something like:

struct PixelImagePossiblyWithMetadata {
    pixels: Vec<P>,
    metadata: Option<ImageMetadata>,
}

This might be a useful type for applying best-effort color-space conversions between image pixel formats.

kornelski commented 4 months ago

I used metadata for:

Tracking color space of the pixels, which could be an sRGB flag, a custom ICC profile (I do need to put a parsed lcms2::Profile in there!), or an enum with DCI-P3 or Rec.2020. Treating pixels as just "RGB" the same kind of problem as treating strings as bytes.
JPEG markers other than ICC blob
Making a note whether the decoded image used chroma subsampling, palette, or lossless format and bytes per pixel of the original compression, to guide selection of the best file format when saving the image. Tracking input JPEG quantization tables is on my todo list, because saving with the same tables or integer scaling of them reduces degradation.
Original file size, to avoid creating useless thumbnail files that are larger than their source.
Loop count of GIF/AWebP
File modification timestamp, to have an option to preserve the modification date.
Logging which decoder & decoder version has been used to load the image. If I later discover that the decoder was buggy, I need to be able to track all the files I've used it for, and recreate them (I had that happen, and had to use this ability in practice!)
Actual original size of the image when decoding JPEG at n/8th of the size (because the final desired size may be something like "33% of the original size" and the later resizing step needs to calculate that properly).

So there can be a lot of application-specific metadata, and not all of it will be useful to everyone. Therefore I suggest having something like extensions in the HTTP crate: https://docs.rs/http/latest/http/struct.Extensions.html

ripytide commented 4 months ago

I suppose the alternative to using a type-map like Extensions for ImageMetadata would be a strongly-typed struct like this:

struct ImageMetadata {
    color_space_info: Option<ColorSpaceInfo>,
    original_file_size: Option<u32>,
    original_image_dimensions: Option<(u32, u32)>,
    loop_count: Option<u32>,
    file_modification_timestamp: Option<u64>,
    decoder_info: Option<DecoderInfo>,
}
struct DecoderInfo {
    filetype: FileType,
    decoder_name: String,
    decoder_version: String,
}
type ColorSpaceInfo = image_canvas::Color;

Using the very comprehensive Color from image-canvas by @HeroicKatora I saw recently.

This struct would likely grow as people requested different types of metadata (like GPS info), but this could be done backwards-compatibly if we made all the fields private and used a builder-style API for it. Alternatively, we could make breaking changes by expanding it but teach people to use ..Default::default() when declaring a struct instance which would be backwards-compatible-ish.