Closed fakufaku closed 2 years ago
Merging #2 (237f4cc) into main (672dfb7) will increase coverage by 1.42%. The diff coverage is 91.81%.
```
@@            Coverage Diff             @@
##             main       #2      +/-   ##
==========================================
+ Coverage   82.25%   83.67%   +1.42%
==========================================
  Files          11       12       +1
  Lines        772      870      +98
  Branches     103      117      +14
==========================================
+ Hits         635      728      +93
- Misses        90       92       +2
- Partials      47       50       +3
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| fast_bss_eval/torch/helpers.py | 62.29% <60.00%> (-1.20%) | :arrow_down: |
| fast_bss_eval/torch/hungarian.py | 95.00% <95.00%> (ø) | |
Continue to review full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 672dfb7...237f4cc. Read the comment docs.
Great work! I attempted this translation when implementing PIT, but I gave up because my translation was very slow. So I use a naive brute-force implementation when the number of speakers is small. How is your speed on GPU compared with the scipy version? ^^
Hey @quancs ! The translation is quite slow compared to scipy ^_^ Scipy is very fast because it is implemented in C rather than Python. Nevertheless, using scipy means we need to bring tensors back to the CPU for evaluation, which may cause performance issues, especially in multi-GPU training. Even the "slow" Python version should not be that slow compared to the actual separation and metric computation, I think, although I have not measured this carefully. What is your experience there?
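For context, the CPU round-trip mentioned above looks roughly like this (a sketch for illustration, not code from this PR; `scipy_assignment` is a hypothetical helper name):

```python
import torch
from scipy.optimize import linear_sum_assignment


def scipy_assignment(cost: torch.Tensor) -> torch.Tensor:
    # scipy only accepts numpy arrays, so a GPU tensor must first be
    # detached and copied back to the CPU -- a synchronization point
    # that can hurt throughput in multi-GPU training loops
    rows, cols = linear_sum_assignment(cost.detach().cpu().numpy())
    # move the resulting column assignment back to the original device
    return torch.as_tensor(cols, device=cost.device)
```

A pure-PyTorch solver avoids this device transfer entirely, which is the main motivation even if the Python implementation itself is slower.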
I think the performance issue could be fixed by implementing this as a C++ extension via PyTorch.
@fakufaku My translation on GPU seems to be slower than the scipy version on my personal computer.
I guess the slowness originates from the algorithm itself, as it involves a lot of branching, which is not something GPUs are good at. So, to speed things up, my idea is that we need an algorithm that solves the assignment problem using mostly vector and matrix operations.
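The naive small-N approach mentioned earlier can be vectorized over all permutations in exactly this spirit (a minimal sketch, not the code from this PR; `cost` is an assumed N×N pairwise loss matrix):

```python
import itertools
import torch


def brute_force_assignment(cost: torch.Tensor):
    # cost: (n, n) matrix where cost[i, j] is the loss of pairing
    # estimate i with reference j
    n = cost.shape[-1]
    # enumerate all n! permutations once, shape (n!, n)
    perms = torch.tensor(list(itertools.permutations(range(n))))
    idx = torch.arange(n)
    # advanced indexing gathers cost[i, perms[p, i]] for every
    # permutation p in a single batched operation, shape (n!, n)
    total = cost[idx, perms].sum(dim=-1)  # (n!,)
    best = total.argmin()
    return perms[best], total[best]
```

This is a single gather plus a reduction, so it maps well to the GPU, but the n! factor limits it to small speaker counts.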
Your idea should work, as the branching is currently performed in Python, not C++.
This removes the necessity of bringing the loss back to the CPU during PIT.