lilith opened 2 months ago
Hi, apologies for the delay in response.
> Robust and safe for use on web-servers, in process

We align in priorities there. We don't use fallible allocations yet, but that can be added.
> Be correct: especially about color profiles, linear-light image resampling, image scaling weight calculation, border pixels, minimize jpeg decoding/encoding artifacts, etc

Priorities align, but most of these aren't in place yet. E.g. you can already write a pipeline that does linear-light image resampling by stitching the components together (see the sketch just below), and minimizing jpeg decoding artifacts is almost complete; it only needs a switch to a more accurate color upsampler.
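To make "stitching the components together" concrete, here is a hedged sketch of a linear-light resize pipeline; every name in it is illustrative rather than zune-image's actual API, and a trivial 2:1 box filter stands in for a real resampler. The only point is the ordering: transfer-decode, resample in linear light, transfer-encode.

```rust
// Illustrative only; not zune-image's API.
fn srgb_to_linear(v: f32) -> f32 {
    if v <= 0.04045 { v / 12.92 } else { ((v + 0.055) / 1.055).powf(2.4) }
}

fn linear_to_srgb(v: f32) -> f32 {
    if v <= 0.0031308 { v * 12.92 } else { 1.055 * v.powf(1.0 / 2.4) - 0.055 }
}

// Placeholder resampler: a 2:1 box filter on one channel, standing in
// for a real scaler. Averaging happens on linear values, which is the
// whole point of the pipeline.
fn halve(channel: &[f32]) -> Vec<f32> {
    channel
        .chunks(2)
        .map(|c| c.iter().sum::<f32>() / c.len() as f32)
        .collect()
}

fn resize_linear_light(srgb_channel: &[f32]) -> Vec<f32> {
    let linear: Vec<f32> = srgb_channel.iter().map(|&v| srgb_to_linear(v)).collect();
    halve(&linear).into_iter().map(linear_to_srgb).collect()
}
```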
> Optimize image compression for web use

Aligned. The problem is I haven't really written a complex compressor (PNG, maybe, but it's still not ready). I have plans, possibly for the future, for a better jxl encoder, a webp one, and maybe (far-fetched) avif.
> Allow for end-to-end optimizations that aren't possible with an imperative API

Such optimizations increase complexity, so I tend to deal with them on a case-by-case basis.
> Sustainability

Maybe look into ways to get paid for maintenance, e.g. something like https://www.sovereigntechfund.de/ or https://nlnet.nl/project/libvips/. GitHub Sponsors is another option, but that would be more viable if the library gains traction.
> Robust ABI & bindings for multiple languages (including C#, Node, C)

The architecture of zune-image makes ABI bindings easy, since filters are implemented as consumers of an image and not extenders, i.e. `Brighten(new_value).execute(image)?` instead of `image.brighten(new_value)`, which means fewer breaks in the core API. A sketch of the pattern follows.
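Here is a minimal sketch of that consumer-style pattern; the trait and type names are illustrative, not zune-image's exact definitions. Because filters borrow the image rather than being methods on it, new filters can be added without touching the core `Image` type or its ABI.

```rust
// Illustrative types; zune-image's real API differs.
struct Image {
    planes: Vec<Vec<u8>>, // planar layout: one Vec per channel
}

#[derive(Debug)]
struct OpError;

trait Operation {
    fn execute(&self, image: &mut Image) -> Result<(), OpError>;
}

struct Brighten(u8);

impl Operation for Brighten {
    fn execute(&self, image: &mut Image) -> Result<(), OpError> {
        for plane in image.planes.iter_mut() {
            for px in plane.iter_mut() {
                *px = px.saturating_add(self.0);
            }
        }
        Ok(())
    }
}

// Call style matches the comment above:
// Brighten(new_value).execute(&mut image)?;
```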
> Cross-platform, cross-architecture support. I have to support Windows, Linux, and macOS on x86_64 and aarch64, in that order, but also 32-bit x86 on Windows (for now, hopefully 32-bit dies soon). WASM is also a goal, but not yet something Imageflow targets in CI.

My recommendation would be to depend on the core image libraries and write the glue your own way. There are quite large differences in how zune-image does processing: e.g. I chose to process images in planar layout (RRR, GGG, BBB) instead of interleaved (RGB, RGB, RGB) to make operations easier, but this means we have to decode the whole image before processing. If end-to-end latency is of utmost priority, an interleaved architecture may work better, feeding pixels into the pipeline as they are decoded. Operations can easily be ported to support that, since they are written as functions that work on one channel (sketched below), and this also means you can include only the operations Imageflow supports.
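A sketch of that per-channel operation style (the names are illustrative, not zune-image's actual signatures): because one function works on a single channel, the same kernel runs unchanged over each plane of a planar image, and porting to an interleaved pipeline mostly means changing how data is fed to it.

```rust
// Illustrative per-channel kernel; not zune-image's actual signature.
fn invert_channel(channel: &mut [u8]) {
    for px in channel.iter_mut() {
        *px = 255 - *px;
    }
}

// Planar driver: apply the same kernel to each plane independently.
fn invert_planar(planes: &mut [Vec<u8>]) {
    for plane in planes.iter_mut() {
        invert_channel(plane);
    }
}
```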
It's great to hear how much overlap there is in our goals!
Imageflow resolves the operation graph to imperative instructions, so there's no need for underlying functions to be graph-based.
And it's not like I can't use my own encoding logic.
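For context on the graph-to-imperative point above, here is a toy illustration (not Imageflow's internals) of resolving an operation graph into a flat program: topologically sort the nodes, then run the resulting list in order. It assumes an acyclic graph, and all names are made up for the example.

```rust
// Toy sketch; not Imageflow's actual internals.
#[derive(Clone, Debug)]
enum Op {
    Decode,
    Resize { w: u32, h: u32 },
    Encode,
}

/// `deps[i]` lists the node indices that must run before node `i`.
fn lower_to_program(ops: &[Op], deps: &[Vec<usize>]) -> Vec<Op> {
    let mut done = vec![false; ops.len()];
    let mut program = Vec::new();
    // Naive repeated sweep; fine for the tiny graphs in this sketch.
    while program.len() < ops.len() {
        for i in 0..ops.len() {
            if !done[i] && deps[i].iter().all(|&d| done[d]) {
                done[i] = true;
                program.push(ops[i].clone());
            }
        }
    }
    program
}
```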
A couple questions on performance:
Downscaling during IDCT is really useful when you can verify the signal loss is insignificant and you're doing a sufficiently large downscale later anyway. I generated C code for a bunch of kernels and brute-force tested them for DSSIM impact, then injected them into the jpeg decoder. 8x8->nxn SIMD kernels are really fast. Have you looked into doing anything like that?
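For readers unfamiliar with the technique, here is a hedged sketch of the simplest case, an 8x8 -> 1x1 kernel: for the orthonormal 2D IDCT, a DC-only block is constant with value dc / 8, so the 8x-downscaled pixel falls out without running the full transform. Larger kernels (8x8 -> 4x4, 8x8 -> 2x2) keep the top-left coefficients and use shorter IDCTs, which is what makes them so SIMD-friendly. `dequantized_dc` is a hypothetical input assumed to already include the quantization multiplier.

```rust
// 8x8 -> 1x1 "downscale during IDCT": a DC-only block's IDCT is the
// constant dc / 8, so the scaled output pixel needs no full transform.
fn dc_only_pixel(dequantized_dc: i32) -> u8 {
    // +128 undoes JPEG's level shift on the decoded sample.
    let v = ((dequantized_dc as f32) / 8.0).round() as i32 + 128;
    v.clamp(0, 255) as u8
}
```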
Premultiplication of the alpha channel is essential prior to downsampling, or you can get extraordinary artifacts. Have you noticed a penalty for planar memory layout when premultiplying and reversing it? I haven't run benchmarks, and would love to know what the impact is when doing compositing/resampling in planar mode.
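To make the question concrete, a minimal sketch of premultiply/unpremultiply over planar channels (hypothetical helpers, not zune-image's API): in planar layout each color plane is a separate pass that re-reads the alpha plane, versus one pass over a single interleaved buffer, which is exactly the cache-behavior tradeoff being asked about.

```rust
// Hypothetical helpers; not zune-image's API.
fn premultiply_plane(color: &mut [f32], alpha: &[f32]) {
    for (c, &a) in color.iter_mut().zip(alpha) {
        *c *= a;
    }
}

fn unpremultiply_plane(color: &mut [f32], alpha: &[f32]) {
    for (c, &a) in color.iter_mut().zip(alpha) {
        if a > 0.0 {
            *c /= a;
        }
    }
}
```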
Imageflow currently works on entire image frames and doesn't support streaming or region-tiled operations. Admittedly, that limits the upper bound of image dimensions and is really problematic in constrained-memory situations. I initially made this choice for the speed/cache benefits, for predictability (I don't want to stall out indefinitely for I/O reasons with big buffers in play), and because of how complex the ABI/FFI interface gets if you want to support async across language barriers. That said, libvips is proof that you can have both speed and low memory impact, something relevant for serverless function hosting limits.
What I haven't tested is how broadly image encoders are affected by streaming vs whole-image input. Final file size is king, and some optimizations need to see all the data first.
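The two interface shapes under comparison might look like the following illustrative traits (not any real crate's API): a whole-frame encoder can scan every pixel before committing to, say, a palette or filter strategy, while a streaming encoder must make those choices as scanlines arrive.

```rust
// Illustrative trait shapes only; not any real crate's API.
trait WholeFrameEncoder {
    fn encode(&mut self, frame: &[u8], width: usize, height: usize) -> Vec<u8>;
}

trait StreamingEncoder {
    fn push_scanline(&mut self, row: &[u8]);
    fn finish(self) -> Vec<u8>;
}
```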
I'm the author of imageflow. It's a (much older) project with similar goals (secure, correct, fast, in that order), and in an ideal world we would find ways to share code, effort, and collaboration. With the prevalence of gain maps, avif, jxl, and HDR, I'm looking at a rewrite soon.

Along those lines, I'm wondering what the goals and non-goals for this project might be. Here are some criteria I had for Imageflow.

Knowing where our goals align would be pretty great; if there are areas where I can contribute Imageflow's unique advantages to zune-image and then merge a crate or two, it would let me write more useful features. Fast AI 4x upscaling, AI auto-enhancement, salience detection, etc. are all things I'd love to make happen, but I can't allocate that kind of effort while also juggling all the other aspects of my end-to-end solution.