image-rs / jpeg-decoder

JPEG decoder written in Rust
Apache License 2.0
149 stars 87 forks source link

Type-safe handling of various pixel types #223

Open kornelski opened 2 years ago

kornelski commented 2 years ago

The pair of pixel_format and pixels types is very C-like, without type safety.

This is especially clear with the addition of L16 format, which is difficult to safely read. Reading it as .chunks(2) requires dealing with endianness (the docs don't say which endian it is), and casting it to &[u16] is not possible to do safely due to risk of insufficient alignment. So assuming it's a native endian, it requires a totally unsafe loop with pointer arithmetic and ptr::read_unaligned.

It would be better to expose pixel formats in a type-safe manner, like:

enum Pixels {
    L8(Vec<u8>),
    L16(Vec<u16>),
    RGB24(Vec<RGB<u8>>),
}
HeroicKatora commented 2 years ago

The bytemuck crate provides pod_collect_to_vec as a safe method to get a Vec<u16>, if required.

With some insight on the state of Rust graphics programming—including the sentiment derived from wgpu: do not strongly type image buffers— my desire is to have as few different type variants mentioned in the interface as possible. Both tiff and image::DynamicImage is fix-worthy in this regard, and in that they currently have a similar interface as above but, my summary, it doesn't scale and it's limiting with regards to the supported (pixel) formats. It was a subpar idea, and it encapsulates little relevant information one couldn't do otherwise.

Firstly, the above requires agreeing on one particular exact SemVer type for RGB. I'm sorry, but no. Many callers care about the data foremost and turning enum variants into runtime representation (of 4CC flavor) is super akward compared to just getting one in the first place. Secondly, it requires agreeing on one particular style of allocation. The container Vec<u8> can not be re-used for Vec<Rgb<u8>> (as the struct size differs) so that the caller can not unify to a common representation those even if they wanted to. Even with an Allocator type argument to vector I do not see it as solved: as you notice yourself, sometimes you want overalignment guarantees which you couldn't encode in the type system properly while staying interoperable.

In short: it's a vain effort to try to fulfill all the different requirements users may have on the allocation and container type of their data with standard containers; and the more so by a finite list of enum variants. So we don't attempt to. Instead, rely on the fact that all channel types are simple and bytes compatible. This may require one re-allocation at the moment but the better fix for this would be to take a user-allocated buffer (&mut [u8]) as an argument instead. This would present a better story as the above would be merely a utility interface on-top of the raw interface available to fulfill all the extra requirements.

kornelski commented 2 years ago

It's safe and zero-copy to cast &[u16] into &[u8], but no the other way round. u16 is also unambiguous about the endian.

Current state in jpeg-decoder is pretty bad to the point I'm panicking if I get L16.

HeroicKatora commented 2 years ago

to the point I'm panicking if I get L16.

That definitely does not sound right. Can you expand? Is this a panic within jpeg-decoder or outside caused by some (broken) promise derived from it?

kornelski commented 2 years ago

I mean I've chosen not to handle L16 the hard way with pointer arithmetic, and if the vec doesn't happen to be aligned, I call panic.

HeroicKatora commented 2 years ago

It's safe and zero-copy to cast &[u16] into &[u8], but no the other way round.

Not zero-cost but one alloc+memcpy, yes. Although all but the simplest case will incurr this cost if we chose to provide Vec<u16> as well. Copying to graphics memory doesn't mind the difference, you give it a binary buffer. And even many CPU-only libraries will insist on their own allocation for the graphics buffer itself anyways. If it's the precious cost of allocation you're worried about then the more immediate critique should be in allocating and returning a fresh buffer in the first place.

Are you content with the &mut [u8] out-parameter interface as offered by png, if documentation of endianess were better?

kornelski commented 2 years ago

I actually do need to copy the buffer myself to yet another representation (in this particular case I wanted 8-bit RGB or RGBA regardless of input format), so lack of ability to get a slice means I need to copy twice.

I get that you're trying to return the lowest common denominator, but due to endian and alignment, Vec of u8 isn't it.

The need to use bytemuck and copy isn't as much of a perf issue as annoyance that I need extra code and extra inefficiency only for something that I consider an inelegant and dated API design. To use it without unsafe I need to be mindful of the endian, ugh.

It's a C-style type-unsafe pattern. If I keep the pixels in this format, it's error prone on every use (I have to keep the two fields in sync). OTOH Enum variants for pixel layouts force checking where it's relevant, and allow branching into statically-typed code paths and generics.

I know lots of 90's-style APIs use a char pointer and runtime-dynamic arbitrary bytes per pixel, but this is un-Rust-like. I'm trying to leave this style behind.

A match on an enum of pixel types can safely (and zero-cost) cast all of them as a slice of bytes where it's necessary to talk to C. But this doesn't work the other way: an untyped bunch of unaligned bytes can't be cast to better types for Rust. So you're exposing the least flexible low-level "C-style-only" representation, rather than a high level one that can either use Rust idioms or C idioms.

If you don't want to depend on my rgb create that's fine, but [u8;3] or your own RGB type could work.

Or even if you don't believe that pixels should be Vec elements, at least use an enum of {Eight(Vec<u8>), Sixteen(Vec<u16>)} with an element per channel, so that it doesn't need assembling u16s from raw unaligned data.

fintelia commented 2 years ago

The more I've dealt with the different ways of dealing with encoding image data with Rust types, the more convinced I am that there is no elegant options, only varying degrees of bad options. The "type safe" design you describe in your original post is the one we used in the image-tiff crate, and it can cause tons of boilerplate.

In this case, the option I'd probably prefer would be having decode write its output into a &mut [u8]. So for instance if you knew you were decoding a L16 image, the usage would look something like:

decoder.read_info()?;
let info = decoder.info().unwrap();
assert_eq!(info.pixel_format, PixelFormat::L16);

let mut data = vec![0u16; info.width as usize * info.height as usize];
decoder.decode(bytemuck::cast_slice_mut(data))?;
vstroebel commented 2 years ago

Or even if you don't believe that pixels should be Vec elements, at least use an enum of {Eight(Vec<u8>), Sixteen(Vec<u16>)} with an element per channel, so that it doesn't need assembling u16s from raw unaligned data.

The lossless decoder that produces L16 creates an Vec<u16> before converting it to Vec<u8> anyways. Directly exposing the Vec<u16> would remove the useless conversion for everyone that needs to convert the data anyways. The only downside is that it would make the API a little bit inconvenient for 99.9% of users that won't get in touch with lossless encoded JPEGs. But this pain could be reduced with some kind of into_u8_vec() or similar on the enum.

kornelski commented 2 years ago

The boilerplate you've linked to is not bad. It can be moved to a method, even AsRef<[u8]>.

fintelia commented 2 years ago

The boilerplate you've linked to is not bad. It can be moved to a method, even AsRef<[u8]>.

Sorry, my comment wasn't clear. Each of the linked words is different part of that file. In total there's probably 400+ lines of boilerplate code in that one file alone.

kornelski commented 2 years ago

The first link that sets up the types and constructors is a bit boilerplatey, but it could be made cleaner with a couple of declarative macros.

The remaining three links are actually useful work. If you don't do it in your code, you're not avoiding this boilerplate, you're pushing it down on users. This is exactly the problem I came here to complain about. Now I have to either do copies in endian-aware way myself, or write even nastier unsafe code for unaligned reads. That's not saving boilerplate. It's cutting out code by providing less useful API and requiring more busywork for your users.

I don't understand how am I supposed to process 16-bit image data without ever converting the pixel data to u16s. This code has to live somewhere.

kornelski commented 1 year ago

I'm using this crate again, and again I'm running into the undocumented endian + no safe zero-copy use of L16.