Open awused opened 2 years ago
Adding an optional as_flat_samples{,_mut}
, and using that for optimization, seems very agreeable to me. A very convenient way to speed up a few of other operations as well. The imageproc
crate had measured 30% improvement, in some methods, for just specializing on ImageBuffer
's methods instead of the generic accessors.
Right now SubImage uses get_pixel and put_pixel for every individual pixel, this is likely not easy for the compiler to eliminate bounds checking on due to the arithmetic involved, though I could be wrong. Switching this to copy_from_slice should be possible.
This should be a straight increase to performance, especially for operations like DynamicImage's
crop()
which are unexpectedly slow right now. While I haven't made the changes locally to test them I saw 80-90ms improvements when changing another tool (maim) from setting individual channels to memcpy at the resolutions I am personally dealing with. Though the context and operations were different, that's about the order of magnitude of improvements I'd expect. That was a C++ project operating on arrays without bounds checking, so I'd potentially expect more savings.Draft
If I were implementing it myself I'd want to expose as_flat_samples methods from DynamicImage on GenericImageView, then it's easy to convert the existing nested for loops into single for loops with
copy_from_slice
calls.Personally I would just add methods to wrap DynamicImage's as_flat_samples methods to GenericImageView, give them default implementations returning None, and then in the implementation for DynamicImage just forward them to the existing methods. The SubImage code can fall back to the get_pixel/set_pixel code if the flat samples are None.