RicherMans / GPV

Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper
https://arxiv.org/abs/2003.12222
GNU General Public License v3.0

How were the label encoder models created? #10

Closed: Pydataman closed this issue 3 years ago

RicherMans commented 3 years ago

Sorry, what?

Pydataman commented 3 years ago

@RicherMans
There are some models in gpv/label_encoders/. I read the paper, but I cannot find a description of those models.

RicherMans commented 3 years ago

The paper compares 3 models. All models share the exact same back-end as seen here. The VAD-C and GPV-B models use 2 output neurons, while GPV-F uses 527. The training data for VAD-C is Aurora4 (with some additional noise, as described in the paper), while GPV-B/F use the balanced Audioset subset.
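For illustration only, here is a minimal sketch (not the repository's actual code) of how the shared back-end could terminate in differently sized classification heads. The output sizes (2 vs. 527) follow the description above; the module name, hidden dimension, and sigmoid activation are assumptions:

```python
import torch
import torch.nn as nn


class VADHead(nn.Module):
    """Hypothetical classification head placed on top of the shared back-end.

    VAD-C and GPV-B use 2 output neurons (speech / non-speech),
    while GPV-F uses 527 (the AudioSet sound-event classes).
    """

    def __init__(self, hidden_dim: int = 128, num_classes: int = 2):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, hidden_dim) frame-level embeddings from the back-end
        return torch.sigmoid(self.classifier(x))


vad_c_head = VADHead(num_classes=2)    # VAD-C / GPV-B
gpv_f_head = VADHead(num_classes=527)  # GPV-F
```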

Pydataman commented 3 years ago

Oh, I understand now, thank you.

Pydataman commented 3 years ago

@RicherMans I understand now, thank you.