qarmin / img_hash

A Rust library for calculating perceptual hash values of images
Apache License 2.0

Faster resize implementation #4

Open qarmin opened 1 year ago

qarmin commented 1 year ago
    @qarmin Sorry to say something so completely off-topic but, with issues turned off and no apparent contact options enabled on your profile...

I find image::imageops::sample::resize taking up 45% of my flamegraph when using this crate. The fact that you've disabled issues suggests to me that you may be trying to funnel feature patches back to the upstream project, but I actually want something where there's an apparent chance of getting my contributions released on crates.io.

If I make time to add support for faster implementations like resize or fast_image_resize, would you prefer I open a PR here or publish a fork of your fork?

_Originally posted by @ssokolow in https://github.com/qarmin/img_hash/issues/3#issuecomment-1317917894_

qarmin commented 1 year ago

@ssokolow Performance improvements, especially with fast_image_resize, look very promising, so if the crate stays easy to use without major changes to the public API, I will happily merge this.

ssokolow commented 1 year ago

OK. I first need to do my due diligence to audit fast_image_resize (within what I'm capable of), which may take some time, but I'll put it on the horizon.

I'll probably experiment with designs where there's a method you can call to set a custom resize callback since that'd give people freedom to use any resize implementation they want without getting upstream approval while not changing the experience or dependency list for existing use-cases.

(It'd also avoid, or at least pass the buck on, the risk of making it difficult or impossible to play with the feature flags to support situations like my AMD CPU from 2012, which doesn't have either of those SIMD ISA extensions, since the integrating crate, rather than img_hash, would be handling the dependency specification for the resizing crate.)
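
To make the callback idea concrete, here is a minimal sketch of what such a hook could look like. Everything in it is hypothetical: `Gray`, `ResizeFn`, `MeanHasher`, and `with_resize` are names invented for illustration and are not part of img_hash's actual API. The built-in fallback uses plain nearest-neighbour sampling so the example needs no dependencies.

```rust
// Hypothetical sketch: none of these names come from img_hash itself.

/// Minimal 8-bit grayscale buffer standing in for a real image type.
/// Assumed to be non-empty, with pixels.len() == width * height.
#[derive(Clone, Debug, PartialEq)]
pub struct Gray {
    pub width: usize,
    pub height: usize,
    pub pixels: Vec<u8>, // row-major
}

/// Signature a caller-supplied resizer would have to satisfy.
pub type ResizeFn = Box<dyn Fn(&Gray, usize, usize) -> Gray>;

/// Built-in fallback: nearest-neighbour sampling, no extra dependencies.
pub fn nearest_resize(src: &Gray, w: usize, h: usize) -> Gray {
    let mut pixels = Vec::with_capacity(w * h);
    for y in 0..h {
        let sy = y * src.height / h;
        for x in 0..w {
            let sx = x * src.width / w;
            pixels.push(src.pixels[sy * src.width + sx]);
        }
    }
    Gray { width: w, height: h, pixels }
}

pub struct MeanHasher {
    size: usize,
    resize: ResizeFn,
}

impl MeanHasher {
    /// Existing users get the default resizer and see no API change.
    pub fn new(size: usize) -> Self {
        assert!(size > 0, "hash size must be non-zero");
        MeanHasher { size, resize: Box::new(nearest_resize) }
    }

    /// Swap in any resizer, e.g. a thin wrapper around a SIMD crate
    /// that the integrating crate depends on directly.
    pub fn with_resize<F>(mut self, f: F) -> Self
    where
        F: Fn(&Gray, usize, usize) -> Gray + 'static,
    {
        self.resize = Box::new(f);
        self
    }

    /// Mean hash: one bit per downscaled pixel, set when the pixel is
    /// brighter than the (integer) mean of the downscaled image.
    pub fn hash(&self, img: &Gray) -> Vec<bool> {
        let small = (self.resize)(img, self.size, self.size);
        let sum: u64 = small.pixels.iter().map(|&p| p as u64).sum();
        let mean = sum / small.pixels.len() as u64;
        small.pixels.iter().map(|&p| p as u64 > mean).collect()
    }
}
```

With this shape, `MeanHasher::new(8).hash(&img)` behaves as before, while `MeanHasher::new(8).with_resize(my_fast_resize)` routes the downscale through whatever function the caller provides. The resizing crate and its feature flags then live in the caller's Cargo.toml rather than in the hashing crate's.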

agausmann commented 8 months ago

Looking into this myself, because resizing from a 1080p video was taking up ~45% of my runtime.

I tried scaling the images before passing them to image_hasher, and got good results performance-wise: image_hasher dropped to 0.3% of runtime with a 32x32 prescale and 0.1% with 8x8 (I assumed that's the default hasher config), and the program as a whole ran much faster.

In my case, I was reading frames using ffmpeg_next and already using their "scaling context" to change the pixel format to grayscale. All I had to do was change the output resolution too. In theory, this should also apply to any other resize algorithm like fast_image_resize.
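
For readers without an ffmpeg pipeline, the prescaling step can be sketched in plain Rust. This is only an illustration under stated assumptions: it uses a simple box (area-average) filter, which is neither ffmpeg_next's scaler nor fast_image_resize, and `prescale_box` is a hypothetical helper name. The input is assumed to be a row-major 8-bit grayscale buffer.

```rust
/// Hypothetical helper: downscale a row-major 8-bit grayscale frame by
/// averaging the source pixels that fall into each destination cell
/// (a simple box filter). Only downscaling is supported.
pub fn prescale_box(src: &[u8], sw: usize, sh: usize, dw: usize, dh: usize) -> Vec<u8> {
    assert_eq!(src.len(), sw * sh, "buffer size must match dimensions");
    assert!(dw > 0 && dh > 0 && dw <= sw && dh <= sh, "downscale only");

    let mut out = Vec::with_capacity(dw * dh);
    for dy in 0..dh {
        // Range of source rows covered by this destination row.
        let y0 = dy * sh / dh;
        let y1 = ((dy + 1) * sh / dh).max(y0 + 1);
        for dx in 0..dw {
            // Range of source columns covered by this destination column.
            let x0 = dx * sw / dw;
            let x1 = ((dx + 1) * sw / dw).max(x0 + 1);

            let mut sum: u64 = 0;
            for y in y0..y1 {
                for x in x0..x1 {
                    sum += src[y * sw + x] as u64;
                }
            }
            let count = ((y1 - y0) * (x1 - x0)) as u64;
            out.push((sum / count) as u8);
        }
    }
    out
}
```

A 1920x1080 frame would be reduced with `prescale_box(&frame, 1920, 1080, 32, 32)` and the resulting 32x32 buffer handed to the hasher. Because the filter differs from the one the hasher would apply internally, the resulting hash may differ as well.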

It does seem to affect the results of the hash. I'm still very new to image hashing; hopefully I'll have some concrete data to share once I get more experience with this.
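
One way to put a number on how much prescaling changes a hash is to count differing bits between the hash of the original frame and the hash of the prescaled one. The sketch below assumes both hashes are available as equal-length byte slices; `hamming_distance` is a hypothetical helper, not a function from image_hasher.

```rust
/// Hypothetical helper: number of differing bits between two hashes
/// given as byte slices. Returns None if the lengths differ, since
/// hashes of different sizes are not comparable.
pub fn hamming_distance(a: &[u8], b: &[u8]) -> Option<u32> {
    if a.len() != b.len() {
        return None;
    }
    Some(
        a.iter()
            .zip(b)
            .map(|(x, y)| (x ^ y).count_ones())
            .sum(),
    )
}
```

A distance of 0 means the prescale left the hash unchanged; larger values mean more bits flipped.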