google / uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
https://arxiv.org/abs/1810.04719
Apache License 2.0

question about the sigma2 prior loss increasing during the training #60

Closed maxandchen closed 4 years ago

maxandchen commented 4 years ago

Describe the question

A TDNN model was trained to extract embeddings called x-vectors, so I use x-vectors instead of d-vectors. During UIS-RNN training, my sigma2 prior loss keeps increasing even though the training loss is decreasing. I wonder if this is abnormal?

(screenshot of training loss curves)

My background

Have I read the README.md file?

Have I searched for similar questions from closed issues?

Have I tried to find the answers in the paper Fully Supervised Speaker Diarization?

Have I tried to find the answers in the reference Speaker Diarization with LSTM?

Have I tried to find the answers in the reference Generalized End-to-End Loss for Speaker Verification?

wq2012 commented 4 years ago

@maxandchen d-vector and x-vector are basically the same thing, just given different names by Google and JHU respectively, with slightly different network topologies, data augmentation, and loss functions. It shouldn't make much difference.

I think the increasing sigma2 prior loss is very normal. It's just a regularization term. Sacrificing regularization constraints to better fit data is quite common.
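To see why this can happen: the paper places an inverse-gamma prior on sigma2, and the corresponding prior loss (up to an additive constant) is (alpha + 1) * log(sigma2) + beta / sigma2. A minimal NumPy sketch, with alpha and beta chosen arbitrarily for illustration (not the library's actual hyperparameter values), shows that once training pushes sigma2 below the prior's mode to fit the data more tightly, the prior term grows even while the data-fit loss falls:

```python
import numpy as np

def inverse_gamma_neg_log_prior(sigma2, alpha=1.0, beta=1.0):
    """Negative log density of an inverse-gamma prior on sigma2,
    up to an additive constant: (alpha + 1) * log(sigma2) + beta / sigma2.
    alpha and beta here are illustrative, not the library's defaults."""
    return (alpha + 1.0) * np.log(sigma2) + beta / sigma2

# The prior term is minimized at sigma2 = beta / (alpha + 1) = 0.5 here.
# As sigma2 shrinks below that point, the prior loss increases:
for s2 in [0.5, 0.25, 0.1, 0.05]:
    print(f"sigma2={s2:.2f}  prior loss={inverse_gamma_neg_log_prior(s2):.3f}")
```

So a rising sigma2 prior loss alongside a falling total loss just means the optimizer is trading some prior mass for a better data fit.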

wq2012 commented 4 years ago

@maxandchen BTW, does your x-vector model use LSTM/GRU or a purely feed-forward network? Feed-forward networks usually perform much worse than LSTM/GRU in speaker recognition.