mravanelli / SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.
MIT License
1.14k stars 263 forks

Problem on SER % #57

Closed. Sathiyakugan closed this issue 5 years ago

Sathiyakugan commented 5 years ago

The paper reports results as a Sentence Error Rate. Could you please explain what the Sentence Error Rate means for speaker identification?

mravanelli commented 5 years ago

The Sentence Error Rate (SER%) is the classification error rate computed at the sentence level. In practice, we evaluate the network on windows of 200 ms shifted by 10 ms and average the posterior probabilities over all the frames composing the sentence. The class with the highest average probability is the winner, and it is compared with the ground-truth reference. This procedure is repeated for every sentence in the test set, so SER% is the percentage of test sentences whose predicted speaker is wrong.

Best,

Mirco
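The procedure described above can be sketched in a few lines of NumPy. This is only an illustration, not code from the SincNet repository: the function names (`frame_starts`, `sentence_decision`, `sentence_error_rate`) are hypothetical, the 16 kHz sample rate is an assumption, and the per-frame posteriors are assumed to have already been produced by the network.

```python
import numpy as np

def frame_starts(n_samples, sample_rate=16000, win_ms=200, shift_ms=10):
    """Start indices of 200 ms windows shifted by 10 ms.

    The sample rate is an assumption made for this sketch.
    """
    win = int(sample_rate * win_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    if n_samples < win:
        return np.array([], dtype=int)
    return np.arange(0, n_samples - win + 1, shift)

def sentence_decision(frame_posteriors):
    """Average the posteriors over all frames and pick the best class.

    frame_posteriors has shape (n_frames, n_classes).
    """
    return int(np.argmax(frame_posteriors.mean(axis=0)))

def sentence_error_rate(posteriors_per_sentence, labels):
    """Percentage of sentences whose averaged decision is wrong."""
    errors = sum(
        sentence_decision(p) != y
        for p, y in zip(posteriors_per_sentence, labels)
    )
    return 100.0 * errors / len(labels)

# Toy example with two speakers and two test sentences.
toy_posteriors = [
    np.array([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]),  # average favours class 1
    np.array([[0.9, 0.1], [0.4, 0.6]]),              # average favours class 0
]
toy_labels = [1, 1]
toy_ser = sentence_error_rate(toy_posteriors, toy_labels)
```

In the toy example the first sentence is classified correctly and the second is not, so `toy_ser` is 50.0. Note that the decision is taken on the averaged posteriors, not by majority vote over per-frame decisions; the two can disagree.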

