image-rs / imageproc

Image processing operations
MIT License

Add reference filter3x3 #658

Closed theotherphil closed 1 month ago

theotherphil commented 1 month ago
test filter::benches::bench_filter_clamped_gray_3x3                                   ... bench:     704,325.00 ns/iter (+/- 27,325.96)
test filter::benches::bench_filter_clamped_gray_3x3_ref                               ... bench:     230,120.83 ns/iter (+/- 4,152.60)

The current implementation of filter_clamped is about 3x slower than a simple hardcoded implementation for a grayscale image and a 3x3 kernel.
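For readers who want a concrete picture of what a "simple hardcoded implementation" might look like, here is a minimal sketch. It is not the code from this PR: the function name `filter3x3_clamped_gray`, the row-major `&[u8]` buffer, and the `[f32; 9]` kernel layout are all assumptions made for illustration. It applies the kernel without flipping it and replicates edge pixels at the border.

```rust
/// Hypothetical reference filter (not imageproc's API): 3x3 kernel applied to
/// a single-channel u8 image stored row-major, with border pixels clamped.
pub fn filter3x3_clamped_gray(
    pixels: &[u8],
    width: usize,
    height: usize,
    kernel: &[f32; 9],
) -> Vec<u8> {
    assert_eq!(pixels.len(), width * height);
    let mut out = vec![0u8; pixels.len()];
    for y in 0..height {
        // Rows above, at, and below y, clamped to the image bounds.
        let rows = [y.saturating_sub(1), y, (y + 1).min(height - 1)];
        for x in 0..width {
            // Columns left of, at, and right of x, clamped likewise.
            let cols = [x.saturating_sub(1), x, (x + 1).min(width - 1)];
            let mut acc = 0.0f32;
            for (ky, &ry) in rows.iter().enumerate() {
                for (kx, &cx) in cols.iter().enumerate() {
                    acc += kernel[ky * 3 + kx] * pixels[ry * width + cx] as f32;
                }
            }
            // Round and saturate back into the u8 range.
            out[y * width + x] = acc.round().clamp(0.0, 255.0) as u8;
        }
    }
    out
}
```

Because the kernel is a fixed-size array and both inner loops run exactly three times, the compiler has the chance to unroll them fully, which is one plausible source of the gap in the benchmark above.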

ripytide commented 1 month ago

I bet the compiler is able to do some fancy optimizations when the kernel size is hard-coded as 3. We could add a check in the generic filter(): if kernel.width() == 3 and kernel.height() == 3, we call filter3x3(); otherwise we use the generic version. Then we don't need to expose two different kernel filter functions to the user.
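The dispatch idea could be sketched roughly as follows. Everything here is hypothetical: the `Kernel` struct with public fields, the slice-based signatures, and the `filter_impl` helper are stand-ins for illustration, not imageproc's actual types. The sketch shares one loop body and relies on `#[inline(always)]` so that the 3x3 call site, which passes literal sizes, gets its own copy that the compiler may specialise; whether that recovers the full speedup would need benchmarking.

```rust
/// Hypothetical kernel type for this sketch (not imageproc's own `Kernel`).
pub struct Kernel<'a> {
    pub data: &'a [f32], // row-major weights, length = width * height
    pub width: usize,
    pub height: usize,
}

/// Shared loop body: clamped filtering of a row-major grayscale buffer.
/// Inlined into each caller so literal kernel sizes can be constant-folded.
#[inline(always)]
fn filter_impl(pixels: &[u8], w: usize, h: usize, k: &[f32], kw: usize, kh: usize) -> Vec<u8> {
    let mut out = vec![0u8; pixels.len()];
    for y in 0..h {
        for x in 0..w {
            let mut acc = 0.0f32;
            for ky in 0..kh {
                // Source row for this kernel row, clamped to the image bounds.
                let sy = (y + ky).saturating_sub(kh / 2).min(h - 1);
                for kx in 0..kw {
                    let sx = (x + kx).saturating_sub(kw / 2).min(w - 1);
                    acc += k[ky * kw + kx] * pixels[sy * w + sx] as f32;
                }
            }
            out[y * w + x] = acc.round().clamp(0.0, 255.0) as u8;
        }
    }
    out
}

/// Specialised path: kernel dimensions are compile-time literals here.
fn filter3x3(pixels: &[u8], w: usize, h: usize, k: &[f32]) -> Vec<u8> {
    filter_impl(pixels, w, h, k, 3, 3)
}

/// Single public entry point; the specialisation stays an internal detail.
pub fn filter(pixels: &[u8], w: usize, h: usize, kernel: &Kernel) -> Vec<u8> {
    assert_eq!(pixels.len(), w * h);
    assert_eq!(kernel.data.len(), kernel.width * kernel.height);
    if kernel.width == 3 && kernel.height == 3 {
        filter3x3(pixels, w, h, kernel.data)
    } else {
        filter_impl(pixels, w, h, kernel.data, kernel.width, kernel.height)
    }
}
```

Callers only ever see `filter`, so further fast paths (for example other common kernel sizes) could be added later without changing the public surface.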

theotherphil commented 1 month ago

Yes, we can hide some manual specialisation behind filter as necessary.