Audio-WestlakeU / NBSS

The official repo of NBC & SpatialNet for multichannel speech separation, denoising, and dereverberation
MIT License

Will you combine it with speaker embedding? #32

Open wendongj opened 5 months ago

wendongj commented 5 months ago
Most target speaker extraction work is single-channel; multichannel target speaker extraction is less well researched. Many target speaker extraction networks also operate in the time domain and perform poorly on real-world reverberant recordings.

SpatialNet performs well on real-world reverberant recordings, so I wonder whether you plan to combine it with speaker embedding?

I tried combining SpatialNet with speaker embedding, but the results are not good on real multichannel recordings. I simply replaced the bottleneck in PEA-TSE 3.0 with the SpatialNet structure.

quancs commented 5 months ago

> SpatialNet performs well on real-world reverberant recordings, so I wonder whether you plan to combine it with speaker embedding?

We are not currently researching how to combine it with speaker embedding. You are welcome to try it.

> I tried combining SpatialNet with speaker embedding, but the results are not good on real multichannel recordings. I simply replaced the bottleneck in PEA-TSE 3.0 with the SpatialNet structure.

We have no experience with the target speaker extraction task, so we don't know how SpatialNet performs on it.

wendongj commented 5 months ago

I see, thanks for your reply ^_^