worldveil / dejavu

Audio fingerprinting and recognition in Python
MIT License

Do maximum_filter with cupy instead of scipy #285

Open DeltaFlyerW opened 1 year ago

DeltaFlyerW commented 1 year ago

https://github.com/worldveil/dejavu/blob/d2b8761eb39f8e2479503f936e9f9948addea8ea/dejavu/fingerprint.py#L98

Replace the line linked above with the following code:

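    import cupy as cp
    from cupyx.scipy.ndimage import maximum_filter as cp_maximum_filter

    # Copy arr2D to the GPU so the maximum filter runs there
    array = cp.array(arr2D)
    # Compare on the GPU, then copy the boolean peak mask back to a NumPy array
    local_max = cp.asnumpy(cp_maximum_filter(array, footprint=cp.array(neighborhood)) == array)
    # Drop the reference so CuPy can reuse the GPU buffer
    del array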
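If the replacement should still work on machines without CuPy or a usable GPU, one option is to wrap it in a helper that falls back to SciPy. The following is only a sketch under stated assumptions: `local_max_mask` is a hypothetical name that is not part of dejavu, and catching every exception on the GPU path is a simplification.

```python
import numpy as np
from scipy.ndimage import maximum_filter as sp_maximum_filter

try:
    import cupy as cp
    from cupyx.scipy.ndimage import maximum_filter as cp_maximum_filter
except ImportError:
    cp = None  # CuPy is not installed, so only the SciPy path is available


def local_max_mask(arr2D, neighborhood):
    """Return a boolean NumPy mask of local maxima (hypothetical helper)."""
    if cp is not None:
        try:
            gpu_arr = cp.asarray(arr2D)
            gpu_mask = cp_maximum_filter(
                gpu_arr, footprint=cp.asarray(neighborhood)
            ) == gpu_arr
            # Copy the result back to host memory as a NumPy array
            return cp.asnumpy(gpu_mask)
        except Exception:
            # Assumption: any GPU failure (no device, out of memory, ...)
            # should silently fall back to the CPU implementation
            pass
    return sp_maximum_filter(arr2D, footprint=neighborhood) == arr2D


# Tiny usage example: a single peak in the middle of a 3x3 array
demo = np.array([[1.0, 2.0, 1.0], [2.0, 9.0, 2.0], [1.0, 2.0, 1.0]])
demo_mask = local_max_mask(demo, np.ones((3, 3), dtype=bool))
```

With this shape the call site in `fingerprint.py` would become `local_max = local_max_mask(arr2D, neighborhood)`, and the SciPy result is used whenever the GPU path is unavailable.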

| Single-channel length (samples) | Audio duration | cupy | scipy |
| --- | --- | --- | --- |
| 62622441 | 24 min | 45.92s | 49.69s |
| 31311220 | 12 min | 10.6s | 25.68s |
| 15655610 | 6 min | 1.56s | 12.62s |
| 7827805 | 3 min | 0.72s | 6.18s |
| 3913902 | 1.5 min | 0.34s | 3.09s |

Environment: Ryzen 7 5800H, RTX 3060 Mobile (6 GB), Windows 10 21H2, Python 3.7.9, CUDA 11.0, cupy-cuda110

This means CuPy saves roughly 10 seconds when processing a 3-minute dual-channel audio file (6.18s vs. 0.72s per channel). The trade-off is that it uses up to 1.5 GB of video memory for a 24-minute file. CuPy: https://cupy.dev/
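The savings figure can be checked against the benchmark numbers with a few lines of arithmetic. This sketch assumes the two channels are processed one after the other, so the per-channel timings simply add up; the function names are illustrative only.

```python
# (samples per channel, audio minutes, cupy seconds, scipy seconds),
# copied from the benchmark table in this issue
BENCHMARKS = [
    (62622441, 24.0, 45.92, 49.69),
    (31311220, 12.0, 10.6, 25.68),
    (15655610, 6.0, 1.56, 12.62),
    (7827805, 3.0, 0.72, 6.18),
    (3913902, 1.5, 0.34, 3.09),
]


def speedup(cupy_s, scipy_s):
    """How many times faster the CuPy run was than the SciPy run."""
    return scipy_s / cupy_s


def seconds_saved(cupy_s, scipy_s, channels=2):
    """Time saved for `channels` channels, assuming sequential processing."""
    return (scipy_s - cupy_s) * channels


for samples, minutes, cupy_s, scipy_s in BENCHMARKS:
    print(
        f"{minutes:>4} min: {speedup(cupy_s, scipy_s):.1f}x faster, "
        f"{seconds_saved(cupy_s, scipy_s):.2f} s saved for 2 channels"
    )
```

For the 3-minute row this gives about 10.9 seconds saved across two channels. The 24-minute row shows a much smaller gain than the others.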