fakufaku / fast_bss_eval

A fast implementation of bss_eval metrics for blind source separation
https://fast-bss-eval.readthedocs.io/en/latest/
MIT License

Adds pytorch version of Hungarian algorithm #2

Closed · fakufaku closed this 2 years ago

fakufaku commented 2 years ago

This removes the need to bring the loss back to the CPU during PIT.
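
For context, this is the difference in a nutshell. The torch-side function name below is illustrative (the solver this PR adds lives in `fast_bss_eval/torch/hungarian.py`; its exact signature is assumed here):

```python
import torch
from scipy.optimize import linear_sum_assignment as scipy_lsa

# Before: scipy forces a device -> host -> device round-trip.
def pit_permutation_scipy(cost: torch.Tensor) -> torch.Tensor:
    # cost: (n_src, n_src) pairwise loss matrix, possibly on the GPU
    rows, cols = scipy_lsa(cost.detach().cpu().numpy())  # copy to host
    return torch.as_tensor(cols, device=cost.device)     # copy back

# After: a torch-side solver keeps the tensor on its device.
# `linear_sum_assignment` stands in for the solver added by this PR
# (name and signature assumed for illustration).
from fast_bss_eval.torch.hungarian import linear_sum_assignment

def pit_permutation_torch(cost: torch.Tensor) -> torch.Tensor:
    rows, cols = linear_sum_assignment(cost)  # no CPU transfer
    return cols
```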

codecov[bot] commented 2 years ago

Codecov Report

Merging #2 (237f4cc) into main (672dfb7) will increase coverage by 1.42%. The diff coverage is 91.81%.


@@            Coverage Diff             @@
##             main       #2      +/-   ##
==========================================
+ Coverage   82.25%   83.67%   +1.42%     
==========================================
  Files          11       12       +1     
  Lines         772      870      +98     
  Branches      103      117      +14     
==========================================
+ Hits          635      728      +93     
- Misses         90       92       +2     
- Partials       47       50       +3     
Impacted Files                      Coverage Δ
fast_bss_eval/torch/helpers.py      62.29% <60.00%> (-1.20%) ↓
fast_bss_eval/torch/hungarian.py    95.00% <95.00%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 672dfb7...237f4cc.

quancs commented 2 years ago

Great work! I attempted this translation when implementing PIT, but I gave up because my translation was very slow. So I use a naive implementation (sketched below) when the number of speakers is small. How is your speed on GPU compared with the scipy version? ^^
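
For small n, the naive approach can be fully vectorized, something like this (a sketch of the idea, not my exact code):

```python
import itertools
import torch

def naive_pit_assignment(cost: torch.Tensor) -> torch.Tensor:
    # cost: (n_src, n_src) pairwise loss matrix on any device.
    # Enumerate all n! permutations at once; there is no data-dependent
    # branching, so it runs well on the GPU for small n.
    n = cost.shape[-1]
    perms = torch.tensor(
        list(itertools.permutations(range(n))), device=cost.device
    )  # (n!, n)
    rows = torch.arange(n, device=cost.device)
    totals = cost[rows, perms].sum(dim=-1)  # (n!,) total cost per permutation
    return perms[torch.argmin(totals)]      # best column assignment
```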

fakufaku commented 2 years ago

Hey @quancs ! The translation is quite slow compared to scipy '^_^ Scipy is very fast because it is implemented in C rather than Python. Nevertheless, using scipy means we need to bring the tensors back to the CPU for the evaluation, which may cause performance issues, especially for multi-GPU training. Even the "slow" Python version should not be that slow compared to the actual separation and metric computation, I think, although I have not measured this carefully. What is your experience there?

I think the performance issue could be fixed by implementing this as a C++ extension via PyTorch.
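
A minimal sketch of how that could look with `torch.utils.cpp_extension.load_inline` (the C++ body here is only a placeholder, not an actual Hungarian implementation):

```python
from torch.utils.cpp_extension import load_inline

# Placeholder C++ body; a real port would move the branch-heavy
# augmenting-path loop of the Hungarian algorithm here.
cpp_source = r"""
torch::Tensor hungarian_cpp(torch::Tensor cost) {
    // placeholder: return the identity assignment
    return torch::arange(cost.size(0), torch::kLong);
}
"""

ext = load_inline(
    name="hungarian_ext",
    cpp_sources=[cpp_source],
    functions=["hungarian_cpp"],  # auto-generate the pybind11 bindings
)
# usage: ext.hungarian_cpp(cost.cpu())  # the extension runs on CPU tensors
```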

quancs commented 2 years ago

@fakufaku My translation on GPU seems to be slower than the scipy version on my personal computer.

I guess the slowness comes from the algorithm itself: it contains many conditional branches, which is not what GPUs are good at. So, to speed things up, my idea is that we need an algorithm that solves the assignment problem using mostly vector and matrix operations (see the sketch below).
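
For example, the entropy-regularized (Sinkhorn) relaxation of the assignment problem uses only dense matrix and vector operations per iteration. A minimal sketch of the idea (names and defaults are mine, not from this repo):

```python
import torch

def sinkhorn_soft_assignment(
    cost: torch.Tensor, eps: float = 0.1, n_iter: int = 50
) -> torch.Tensor:
    # Entropy-regularized relaxation of the assignment problem
    # (Sinkhorn-Knopp, in the log domain for numerical stability).
    # Every step is a dense reduction, so it maps well to GPUs.
    # The result is a doubly-stochastic matrix that approaches a
    # permutation matrix as eps -> 0.
    log_K = -cost / eps
    log_u = torch.zeros(cost.shape[0], device=cost.device)
    log_v = torch.zeros(cost.shape[1], device=cost.device)
    for _ in range(n_iter):
        # alternately rescale rows and columns to sum to one
        log_u = -torch.logsumexp(log_K + log_v[None, :], dim=1)
        log_v = -torch.logsumexp(log_K + log_u[:, None], dim=0)
    return torch.exp(log_u[:, None] + log_K + log_v[None, :])
```

The output is a soft assignment, so one would still need to round it to a permutation (e.g. with a greedy argmax), which makes it a trade-off rather than a drop-in replacement for the Hungarian algorithm.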

Your idea should work too, as the branching is currently done in Python, not C++.