taylorlu / Speaker-Diarization

speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
Apache License 2.0
464 stars 121 forks

Question about using dvector created by VGG to train UISRNN #42

Open mengjie-du opened 3 years ago

mengjie-du commented 3 years ago

After running the whole project, I reviewed the pipeline and both the VGG and UISRNN papers, and noticed that UISRNN, as described by Google, makes a basic assumption that the embeddings are generated by an RNN. VGG, however, generates fixed-length embeddings with a CNN. Could you tell me whether using CNN embeddings breaks the UISRNN assumption? If not, please explain why. Thanks a lot!