johannesvollmer opened 3 years ago
Going back a step and merging some of the unnecessarily split traits, this could be an alternative design:
```rust
/// A pixel containing a sample for each channel.
/// Grants access to the sample data, and does nothing more. No conversions.
/// Basically nothing but a trait for wrapped primitive arrays, which is aware of alpha channels.
pub trait Pixel: Sized + Copy + Clone + Default {
    /// The type of each channel in this pixel.
    type Sample: Primitive;

    const CHANNEL_COUNT: u8;
    const HAS_ALPHA: bool; // assumes no more than one alpha channel per pixel

    /// A slice of all the samples in this pixel, including alpha.
    fn as_slice(&self) -> &[Self::Sample];

    /// A mutable slice of all the samples in this pixel, including alpha.
    fn as_slice_mut(&mut self) -> &mut [Self::Sample];

    // ### default implementations. these could also be put in a separate PixelExt trait ###
    fn bytes_per_channel() -> usize { ... }
    fn bytes_per_pixel() -> usize { ... }
    fn alpha_channel_count() -> u8 { ... }
    fn color_channel_count() -> u8 { ... }

    fn from_iter(iter: impl IntoIterator<Item = Self::Sample>) -> Self { ... }

    fn apply(&mut self, mapper: impl FnMut(Self::Sample) -> Self::Sample) { ... }
    fn apply2(&mut self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) { ... }
    fn map(&self, mapper: impl FnMut(Self::Sample) -> Self::Sample) -> Self { ... }
    fn map2(&self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) -> Self { ... }

    fn as_color_slice_and_alpha(&self) -> (&[Self::Sample], Option<&Self::Sample>) { ... }
    fn as_color_slice_and_alpha_mut(&mut self) -> (&mut [Self::Sample], Option<&mut Self::Sample>) { ... }
    fn as_color_slice(&self) -> &[Self::Sample] { ... }
    fn as_color_slice_mut(&mut self) -> &mut [Self::Sample] { ... }

    fn apply_without_alpha(&mut self, mapper: impl FnMut(Self::Sample) -> Self::Sample) { ... }
    fn apply2_without_alpha(&mut self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) { ... }
    fn map_without_alpha(&self, mapper: impl FnMut(Self::Sample) -> Self::Sample) -> Self { ... }
    fn map2_without_alpha(&self, other: &Self, mapper: impl FnMut(Self::Sample, Self::Sample) -> Self::Sample) -> Self { ... }
}

/// Add an alpha channel, for example go from RGB to RGBA, or L to LA. No color conversion is happening here.
/// Implemented by all colors.
// This is a separate trait because it is not simply data access and has an associated type.
pub trait WithAlpha: Pixel {
    type WithAlpha: Pixel<Sample = Self::Sample>;
    fn with_alpha(self, alpha: Self::Sample) -> Self::WithAlpha;
}

/// Take away an alpha channel, for example go from RGBA to RGB, or LA to L. No color conversion is happening here.
/// Implemented by all colors.
// This is a separate trait because it is not simply data access and has an associated type.
pub trait WithoutAlpha: Pixel {
    type WithoutAlpha: Pixel<Sample = Self::Sample>;
    fn without_alpha(self) -> Self::WithoutAlpha;
}

/// Used to detect the image type when encoding an image.
pub trait InherentColorType {
    const COLOR_TYPE: ColorType;
}

/// Implemented for all colors individually.
/// Allows converting the samples to a different sample type.
// This is a separate trait because it has an associated type.
pub trait MapSamples<NewSampleType>: Pixel {
    type NewPixel: Pixel<Sample = NewSampleType>;
    fn map_samples(self, mapper: impl FnMut(Self::Sample) -> NewSampleType) -> Self::NewPixel;
}

// ### example color implementation ###

#[derive(Clone, Copy, Default)]
struct Rgb<T>([T; 3]);

impl<T> Pixel for Rgb<T> where T: Primitive {
    type Sample = T;
    const HAS_ALPHA: bool = false;
    const CHANNEL_COUNT: u8 = 3;
    fn as_slice(&self) -> &[Self::Sample] { &self.0 }
    fn as_slice_mut(&mut self) -> &mut [Self::Sample] { &mut self.0 }
}

// non-generic implementations for the color type, as not all sample types may be supported for a color
impl InherentColorType for Rgb<u8> { const COLOR_TYPE: ColorType = ColorType::Rgb8; }
impl InherentColorType for Rgb<u16> { const COLOR_TYPE: ColorType = ColorType::Rgb16; }
impl InherentColorType for Rgb<f32> { const COLOR_TYPE: ColorType = ColorType::Rgb32F; }

impl<T> WithAlpha for Rgb<T> where T: Primitive {
    type WithAlpha = Rgba<T>;
    fn with_alpha(self, a: Self::Sample) -> Self::WithAlpha {
        let [r, g, b] = self.0;
        Rgba([r, g, b, a])
    }
}

impl<T> WithoutAlpha for Rgb<T> where T: Primitive {
    type WithoutAlpha = Self;
    fn without_alpha(self) -> Self::WithoutAlpha { self }
}

impl<T, D> MapSamples<D> for Rgb<T> where T: Primitive, D: Primitive {
    type NewPixel = Rgb<D>;
    fn map_samples(self, mapper: impl FnMut(Self::Sample) -> D) -> Self::NewPixel {
        Rgb::from_iter(self.as_slice().iter().cloned().map(mapper))
    }
}
```
I wonder if it might make sense to not use any enumerations for color formats, and instead just have RGB8 et al. be type aliases. Const generics could also allow for >4-channel images. Perhaps the need to avoid changing the API is unfounded and it's time for a 1.0.0?
@drewcassidy I'm not sure I follow. The reason that we use enums for color formats is because users can load files at runtime that can be in any format. The `PngDecoder::color_type` method returns a `ColorType` to signal what format the PNG is. And that determines which variant of `DynamicImage` the higher-level methods like `image::open` would return.
Further to the excellent suggestions so far, we could separate the pixel trait and types into a separate crate so other crates in the ecosystem can also make use of them.
I'd like to work on writing up this V2 pixel trait in a new crate and then attempt to refactor the `image` crate onto it. @fintelia, are you able to create a new repository in the image GitHub organization for it? At the moment I've started work on my `pixeli` repo, but I'd love to transfer that repo to the image org if possible.
This is certainly a messy part of the API. If you especially want to work on this, I'd encourage you to read through the existing issues on `Pixel` and `GenericImage` and so forth, as well as the discussion on color spaces and on color type conversions. Without factoring in those considerations we'd likely end up with a lot of the same issues as the current API.
TBH I'm kind of burned out trying to fix this part of the API. It is a huge undertaking that doesn't have any clear best solution. Plus even small changes like removing BGR component order can turn out to be quite disruptive for users. (At the time we didn't even know anyone was using that!) Not to mention that there's a big risk of things stalling out before any changes have landed
I have read through a lot of the issues discussed, and while I can't guarantee that my current approach (based on the suggestions given above) will solve all of them, I think it is a definite improvement. And moving all the pixel-related code into a separate crate will also make this crate a lot cleaner.
I'll try to send a PR switching this crate to use `pixeli` tomorrow.
The difficulties mentioned by fintelia are real. I think a good first step might be to come up with a strategy to tackle the difficulties.
Wasn't there a plan to remove the image processing from this library into a separate library, and make this library I/O only? Is this still a goal?
> Wasn't there a plan to remove the image processing from this library into a separate library, and make this library I/O only? Is this still a goal?
At one point there was thought of splitting out `ImageDecoder` and some of the other related traits and enums into an `image-core` crate. The idea was that that portion could rapidly iterate to 1.0 and then at least part of the API would be fully stable. We kinda gave up on that when realizing that every breaking release of `image-core` would require a breaking release of `image`, and thus be even worse than gradually improving those items the way we already were.
More recently, there's been some talk about moving some of the more advanced image processing methods to `imageproc`. It hasn't been a priority because those methods mostly just live off in their own module and don't require much maintenance, but there isn't much harm either. Under this plan, core image processing types/methods like `Pixel` or `crop` would stay in this crate.
> Wasn't there a plan to remove the image processing from this library into a separate library, and make this library I/O only? Is this still a goal?

It's being tracked in #2238; progress is slow but hopefully steady.
From https://github.com/image-rs/image/pull/2255#issuecomment-2156153746 @ripytide
> The two key points missing are awareness of color spaces and single responsibility. Instead, this design encodes even more information into the type system and it will run into the same troubles.
Good point, I didn't tackle the color space issue at the same time, in that regard the four choices I can think of for attaching color-space to pixels are:
1. Enum in the pixel types:

```rust
enum ColorSpace { Unknown, Linear, /* ... */ }

struct Rgba<T> {
    r: T, g: T, b: T, a: T,
    color_space: ColorSpace,
}
```

2. Enum in the image type:

```rust
struct ImageBuffer<P> {
    pixels: Vec<P>,
    color_space: ColorSpace,
}
```

3. Same as 1, but using a `PhantomData<U>` with a generic tag-type like `euclid`'s [`Point`](https://docs.rs/euclid/latest/euclid/struct.Point2D.html) types.
4. Same as 2, but using `PhantomData` tags as well.

As you say, the type-system-heavy approaches in 3 and 4 are harder to reason about and learn (is that similar to how `gfx` did things?). 1 seems terrible for storage space, which leaves option 2 as probably the best given all the tradeoffs, I think, but I don't know if you had something else in mind? With option 2, I think `FromPixelCommon` would change to something like:

```rust
trait FromPixelCommon<P> {
    fn from_pixel_common(
        pixel: P,
        source_color_space: ColorSpace,
        destination_color_space: ColorSpace,
    ) -> Self;
}
```

Am I understanding correctly?
On the single responsibility aspect, which bit of this design did you see going against that principle? Are you suggesting moving the helper functions such as `component_array()`, `color_component_array()`, and the `map_*()` functions to an extension trait? I think I'd probably be fine with that, but I'm not sure as to the advantages/disadvantages of separating them either.
(1.) doesn't work if pixels are to be in-place in a buffer. (2.) implies that the current trait methods for `Pixel` all need this additional context. Not impossible, but definitely requires design work. You might enjoy #1718 as an approach going in this direction. (3.) and (4.) are the rightfully critiqued and unwieldy options that didn't work out in `gfx`.
What do we have the `Pixel` trait and strong typing for? It is mostly a matter of beginner-friendly input-output design. Some person having a design in mind needs a simple way of defining pixels. Another needs a simple way of getting pixels out in some workable format to pass on (to textures). In both these cases the expert users will look for texel byte buffers; I don't think we need to worry about them too much when designing the indexing-based `ImageBuffer`.

However, we don't need to break too much either. If we constrain the roles of the existing generic buffer types to the 'simplified user interface', we shouldn't have too many problems with giving well-defined constructors for the powerful buffers that will need care and runtime representations.
I'd like to propose the following stance, since I think it can be done incrementally:

- `ImageBuffer` and `DynamicImage` are seen as types to give semantics to a pixel matrix, and nothing more. In particular we'd solve metadata, image sequences, and color spaces in a separate type with constructions from each of these.
- `Pixel` strongly implies sRGB as a dynamic invariant for its types, that is, something that operations on it should assume. This allows it to be closed and non-generic, resolving a few of the design conflicts. For `Pixel<f32>` we try not to clamp. Out-of-range components should be treated via identity for all transforms (unless someone knows of a standard with another approach).
- The `Pixel` type stays used for `ImageBuffer`, not necessarily in the interface beyond. The main impact is `imageops`. This is okay, we don't intend `imageops` to be usable for any kind of color matrix. I don't think we need to deprecate anything in a single release here.
- Keep the `Pixel` bound to within `ImageBuffer` (and `flat::` for zero-copy mapping, but we might re-evaluate the need here).

Beyond `Pixel`, I'd like to just demonstrate that the plan of simplifying responsibility creates breathing room. If you want to debate any of these, different issues might be appropriate:

- Rename `ImageBuffer` to `PixelImage` or `PixelBuffer` or `PixelMatrix`, and `DynamicImage` accordingly. This makes room for `ImageBuffer` to be a new type that contains an opaque, non-generic pixel representation. Basically, the expert type, but similar to `PathBuf`/`OsString` we don't need to lie about the complexity to the user here either.
- The `ImageDecoder` trait gets to return the descriptor object, including metadata. It should be constructible from size and `ColorType`, but not necessarily exclusively from these. It should be possible to fallibly allocate both `DynamicImage` and `ImageBuffer` from this descriptor.
- `image-canvas` already follows the needs and just requires a wrapper following the existing PR. In particular, I hope we can thus design a better descriptor including interaction with real-world workable solutions such as 4cc in a separate crate for now, also relieving a bit of pressure here. Also hoping to reduce some pressure from having to deal with IO and operations in `image` itself.
- The new `ImageBuffer` converts to `DynamicImage` but doesn't promise this to fully preserve semantics. Also, it must be constructible from a `DynamicImage`. It should also be constructible from the descriptor type returned from the decoder.

I'd agree with all you've said.
The pixel trait is used mostly in `imageops` and `imageproc`, from what I've seen, for writing functions which are generic over grayscale and color images (like `filter()`). In fact most of the functions in `imageproc` are pixel-based, hence the strong dependency on the `Pixel` trait and `ImageBuffer` types. I don't see the harm in also making the non-pixel-based canvas, which, if I'm understanding correctly, seems to be more heavily used in image encoding/decoding and GPU interaction; that seems like a great use case also.
I'm thinking such a large change as we are discussing might need quite a bit of planning on how best to split it up incrementally, as you suggested. Maybe a tracking issue or project board might be the best approach to organize the ordering of all the required changes, with an issue created for each individual minimal change, like renaming a type.
I've seen the talk on problems in gfx-rs, but I don't think it applies to just pixel types. It seems it tried to do more complex types, and with the harder constraint of not being in charge of its own API due to zero-cost wrapping of existing non-Rust ones.
I've successfully written large image processing pipelines using strongly typed pixels. I don't think they are "for beginners". They ensure correct handling of bitmaps, and help with monomorphising code for various layouts (and there's always casting to bytes for code in the style of `ptr += bpp`).
`[u8]` in particular is a pain to work with for 16-bit data. It has unspecified endianness and wrong alignment that prevents casting to `[u16]`.
I think types like `[RGBA<T>]` work well. There is, however, awkwardness with types like `DynamicImage`: every supported format must be another type, and code has to `match` on all of them. But the alternative of bytes plus some `ColorFormat` enum has the same problems with enums and traits, minus type safety.
So I think pixel types are fine. A good `Pixel` trait could help write correct image processing code, beyond 8-bit.
I'm missing a better `DynamicImage`. The `Canvas` could be a good replacement, although "just upload to a GPU" is a bit too futuristic for me still, and I'd still like proper APIs for good ol' CPU processing.
I usually need a separate notion of image file with all the metadata, color profiles, anim frames, and custom extras like file modification timestamps, original file format, and version of the codec. Then a thinner anim frame that knows its color profile, but isn't burdened by non-pixel data, and then 2D slices of pixels at the lowest level for actual byte pushing. AFAIK this crate lacks the full-fat image type to wrap DynamicImage in, and image decoders serve as a substitute source of metadata/color profiles.
Thanks for your insights @kornelski :)
This again seems to double down on the theory that a large part of the pixel complexity comes from processing. In my personal experience, the value of `image-rs` is the ability to decode many formats and unify their contents, using a single crate dependency. The ability to process pixels could be done in a separate crate for all my personal projects so far, without any downside.
The decoding/encoding library could work with a rather dynamic system (an enum for pixel formats) and the imageproc library could provide a statically typed layer on top of the dynamically decoded data. This would also allow users to load obscure data (such as arbitrary exr channels such as depth and motion vectors) without the need for the decoder library to provide static types for these rare use cases.
I really think the most important step to keeping the library maintainable in the long term would be to pivot and define one single responsibility: load various image files using a unified pixel API. Currently, there seem to be two responsibilities that multiply each other's complexity when combined (the first is de/encoding, the second is pixel processing). I think it will be practically impossible to come to any conclusion in this discussion without separating these two responsibilities. To go even further, I propose we shift the discussion to this topic of separating these two concerns.
The idea would be the following (up for discussion):

- The `image` crate provides a runtime-typed container type for all the possible image data, relying on runtime mechanisms. It uses no static typing and offers no pixel processing (similar to the current byte-buffer-based `ImageBuffer`). Maybe not even a `DynamicImage` enumeration. Maybe it offers conversion from various color formats into a specific requested format?
- The `image-proc` crate provides a completely separate image container type, using static types supporting only the most commonly used color types. It definitely includes a `DynamicImage` type.

While in theory these two container types could also remain in this one crate, separating into two crates will help us maintain a clean boundary, and not be tempted to blur the separation again.
The mingling of imageproc with this crate is, I think, a separate problem. Even when all of the image processing functionality is removed, this crate should still care about pixel types.
If by runtime-typed you mean returning `Vec<u8>` and some color type description, then this is ill-suited for 16-bit and floating point types due to lack of alignment. It's UB to cast these to `Vec<u16>` or `&[f32]`, or variations of these like `&[RGBA<u16>]`.
Yes, we still want some kind of Pixel type, but I still think removing processing would reduce a lot of complexity for the Pixel trait.
> then a strong-typed buffer is allocated (e.g. `[u16]`), and then this is cast to a u8 buffer.
If you do this with `Vec<u8>`, it is UB: https://doc.rust-lang.org/stable/std/alloc/trait.GlobalAlloc.html#tymethod.dealloc

> layout must be the same layout that was used to allocate that block of memory.

`Layout` contains alignment, and `Vec<u8>` will always set it to 1.
So then, it's only a problem if you don't explicitly handle deallocation of the buffer, which would definitely be possible. If I made another mistake in my thinking, I suggest we continue this discussion elsewhere to avoid cluttering up the discussion :)
The `image` crate only casts slices behind a reference, not the `Vec` itself, I believe. This is a sound cast. As teased, one could ensure that the buffer is re-interpreted back to its original type before it is dropped and no mutable reference is leaked out, but this is error-prone and does not provide much beyond the slice cast. In comparison, `image-canvas` normalizes the type to a highly-aligned buffer in all cases; in the future (on a higher Rust version) an optimization with an aware allocator might ensure this normalization is zero-copy in most cases.
Hey! I've been trying to implement f32 support for a while now, but progress always stops when I try to modify `Pixel` in a backwards-compatible and sane way. While we could probably think of some way to add f32 support in a non-breaking way, I see no possibility that doesn't drive us further and further into technical debt. I think we can't continue adding yet another hack to the pixel trait forever. Adding new features will become impossible eventually; it's already hard. This is because the original Pixel trait design is not flexible enough. I have been thinking about the Pixel type for a whole lot of time now. I've seen #1099, #956, #1464, #1212, #1013, #1499, #1516. That's why I want to call for action right now.

With the upcoming 0.24.0 release planned, which allows us to do breaking changes, we finally have the chance to actually improve the design of the trait in a sustainable manner. I propose we redesign the pixel trait right now! The goal of the redesign would be to maximise the flexibility such that it will last for a long time without major modifications.

Limitations of the current design
As far as I understand, these are currently the biggest problems with the pixel trait:
1. It is not generic enough (#1511, #956, #1464, #1212). The purpose of a trait is to enable generic code. Implementing the pixel trait is problematic for multiple reasons. `to_rgb` is not generic. Rust has traits for conversions, the `From` and `Into` traits. These should be used, if at all. Color conversion is more complex than that though, so it should probably be separated.
2. It has too many responsibilities (#1511, #1099, #1013, #1212). Adding multiple responsibilities to a trait is problematic, as the flexibility is reduced further and further. In the `std` library, we can observe multiple traits that are split up atomically. Many times, traits contain only a single method definition.

I identified the following responsibilities in the Pixel trait:

- `as_slice`, `from_slice` and friends exist to access the samples in the pixel. Granting access to the sample data should be a separate responsibility, just like image operations and codecs should be separated.
- `COLOR_TYPE: ColorType` is passed to the image encoders.
- `invert`, `blend`, but no other color operations.

All of these could theoretically be separate traits, especially `Blend`, `Invert`, and `to_xyz` color conversions.

Furthermore, there are other minor problems:

- The `channels4` method of the pixel trait.
- `hue_rotate_in_place`: in reality, the hue algorithm will only make sense for Rgb. These algorithms should probably not use the pixel trait in the first place.
- `horizontal_sample` should probably use the generic slice methods, and work with every possible number of channels. Even more idiomatic would be to actually treat the pixel as a whole, by not adding up each sample of a pixel individually. Instead, the algorithm should use hypothetical operations such as `blend` or `sum` or `average` on the pixel.

How to start?
Of course, a redesign is a huge task. And even settling on a design will be challenging. Splitting up the pixel into atomic traits would probably help reduce bike shedding, as color conversion and similar tasks could be added independently. To speed up the design process, it could also help to list the new requirements of the pixel trait(s).

Maybe we can start with the following list of requirements:

- Support for arbitrary sample types (`u8`, `u16`, `f32`, or even `f16` if the user wants to implement that)
- Generic operations such as `add` and `blend` for those?
- Conversions such as `pixel.map(|f32| (f32 * 255.0) as u8)`?

I propose we remove the color conversion from the pixel trait into a separate module. We could keep the old conversion algorithms until real color spaces are introduced.
Example Design
This is by far not a full design. It should rather be understood as an example. It shows an extreme variant of the trait separation. I just typed this off the top of my head, so it probably doesn't factor in every requirement yet.

The design focuses on separating concerns and providing default implementations for most methods. The code does not make any assumptions about the sample type, except for it being `Primitive`. The pixel is never assumed to be in a specific color space. The only real assumption made is that there exists an underlying array of samples. Color space models can be introduced to the crate by adding more traits, probably even without changing the existing traits.