mravanelli / SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.
MIT License

Any class detector #72

Closed by CarmiShimon 4 years ago

CarmiShimon commented 4 years ago

Hi Mirco,

Have you considered adding a "noise" or "non-speech" class, i.e. a catch-all "garbage" class that absorbs any input that does not belong to one of the other classes, instead of using a VAD?

Thanks a lot, Carmi

mravanelli commented 4 years ago

Hi Carmi, this is certainly possible. As far as I know, some people have already used SincNet for speech activity detection (though I need to double-check). I thus think it could make sense to add this garbage class.
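
For illustration only, here is a minimal sketch of how such a garbage class could be used at inference time. It is not taken from the SincNet code base: the names `GARBAGE`, `classify_chunk`, and `keep_speech` are hypothetical, and it assumes a classifier with N speaker outputs plus one extra output that was trained on noise and non-speech chunks.

```python
import math

GARBAGE = "garbage"  # hypothetical label for noise / non-speech chunks


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_chunk(logits, speaker_labels):
    """Map the logits of an (N + 1)-way classifier to a label.

    Assumption: the first N outputs are the speaker classes and the
    last output is the extra garbage class.
    """
    if len(logits) != len(speaker_labels) + 1:
        raise ValueError("expected one logit per speaker plus one for garbage")
    labels = list(speaker_labels) + [GARBAGE]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def keep_speech(chunks_logits, speaker_labels):
    """Drop chunks assigned to the garbage class; keep (index, label) pairs."""
    kept = []
    for i, logits in enumerate(chunks_logits):
        label, _ = classify_chunk(logits, speaker_labels)
        if label != GARBAGE:
            kept.append((i, label))
    return kept
```

With this layout, chunks predicted as `GARBAGE` are simply discarded, which is the role a separate VAD would otherwise play. Whether this works as well as a dedicated VAD depends on how representative the noise and non-speech training data is.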

hbredin commented 4 years ago

I've used SincNet successfully for speech activity detection, speaker change detection, and overlapped speech detection. See this paper.

CarmiShimon commented 4 years ago

Thanks a lot!