winddori2002 / TriAAN-VC

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
MIT License

What is st and ut? #8

Closed Souvic closed 1 year ago

Souvic commented 1 year ago

These are the scores I get on the test set after 500 epochs, without trimming:

--- Set: test ---
CER: | s2s_st: 0.1787 | s2s_ut: 0.1920 | u2u_st: 0.1717 | u2u_ut: 0.1766
WER: | s2s_st: 0.3156 | s2s_ut: 0.3644 | u2u_st: 0.3217 | u2u_ut: 0.2973
ASV ACC: | s2s_st: 0.9500 | s2s_ut: 0.9600 | u2u_st: 0.9150 | u2u_ut: 0.9550
ASV COS: | s2s_st: 0.7993 | s2s_ut: 0.7960 | u2u_st: 0.7859 | u2u_ut: 0.7878

Is this OK, or should I rerun training several times to reach your reported scores, given the run-to-run variance?

How do I calculate the average scores here? Specifically, what are s2s_st and s2s_ut? s2s is already 'seen to seen', right?

Souvic commented 1 year ago

Okay, I guess st and ut stand for seen text and unseen text.

winddori2002 commented 1 year ago

That's right..!
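
The averaging question above is not answered in the thread. As a rough sketch only, assuming the reported average is a plain unweighted mean over the four conditions (s2s_st, s2s_ut, u2u_st, u2u_ut), the log lines posted above could be parsed and averaged as follows. The helper names `parse_scores` and `average_scores` are hypothetical and not part of the TriAAN-VC code.

```python
import re
from statistics import mean

# Log lines copied from the test-set output posted in this issue.
LOG = """\
CER: | s2s_st: 0.1787 | s2s_ut: 0.1920 | u2u_st: 0.1717 | u2u_ut: 0.1766
WER: | s2s_st: 0.3156 | s2s_ut: 0.3644 | u2u_st: 0.3217 | u2u_ut: 0.2973
ASV ACC: | s2s_st: 0.9500 | s2s_ut: 0.9600 | u2u_st: 0.9150 | u2u_ut: 0.9550
ASV COS: | s2s_st: 0.7993 | s2s_ut: 0.7960 | u2u_st: 0.7859 | u2u_ut: 0.7878
"""


def parse_scores(log):
    """Hypothetical helper: map each metric name to {condition: score}."""
    scores = {}
    for line in log.splitlines():
        # Split "CER: | s2s_st: ..." into the metric name and the score list.
        metric, sep, rest = line.partition(": |")
        if not sep:
            continue  # skip lines that do not follow the metric format
        pairs = re.findall(r"(\w+):\s*([0-9.]+)", rest)
        scores[metric.strip()] = {name: float(value) for name, value in pairs}
    return scores


def average_scores(scores):
    """Assumption: unweighted mean over all conditions for each metric."""
    return {metric: mean(values.values()) for metric, values in scores.items()}


scores = parse_scores(LOG)
averages = average_scores(scores)
for metric, value in averages.items():
    print(f"{metric}: {value:.4f}")
```

If the four conditions contain different numbers of utterance pairs, a mean weighted by pair count would be the more faithful choice; the thread does not say which one the authors used.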