wenet-e2e / WenetSpeech

A 10000+ hours dataset for Chinese speech recognition
Apache License 2.0

The mismatch between the marked duration and the actual audio duration. #33

Closed luomingshuang closed 2 years ago

luomingshuang commented 2 years ago

I am using k2 and Lhotse for WenetSpeech ASR experiments, but an error occurred. The error is as follows: [image]

I then checked the actual duration of this sample (its marked duration is 786.44s):

[image]

I found that the marked duration is 988.89s.

[image]

So can the marked duration be corrected in the original transcripts? Or should I filter this sample out with a filtering function to avoid the error?
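The filtering option mentioned above could look roughly like the following minimal sketch. It is not the Lhotse or k2 API: the entry layout (`path` and `duration` keys), the `actual_duration` callback, and the 0.1 s tolerance are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List


def filter_by_duration(
    entries: Iterable[Dict],
    actual_duration: Callable[[str], float],
    tolerance: float = 0.1,
) -> List[Dict]:
    """Keep only entries whose marked duration matches the measured one.

    `entries` is a hypothetical list of dicts with "path" and "duration"
    keys; `actual_duration` is any function that returns the measured
    duration (in seconds) for a given path.
    """
    kept = []
    for entry in entries:
        measured = actual_duration(entry["path"])
        # Drop the entry when the two durations differ by more than the tolerance.
        if abs(entry["duration"] - measured) <= tolerance:
            kept.append(entry)
    return kept
```

A quick check with made-up values: given entries `a` (marked 10.0 s, measured 10.05 s) and `b` (marked 20.0 s, measured 25.0 s), only `a` survives the filter. The tolerance should be tuned to whatever rounding the transcripts use.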

robin1001 commented 2 years ago

I'm not familiar with k2. If there is a mismatch, I think you can get the real duration from the audio itself.
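As a rough sketch of this suggestion, the duration can be computed from the audio and used in place of the marked value when the two disagree. The example uses Python's standard `wave` module and therefore assumes PCM WAV input; for other formats a different reader would be needed. The function names and the 0.1 s tolerance are hypothetical.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file, computed from its header."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / float(wf.getframerate())


def corrected_duration(path: str, marked: float, tolerance: float = 0.1) -> float:
    """Prefer the measured duration whenever it disagrees with the marked one."""
    actual = wav_duration_seconds(path)
    # Keep the marked value if it is within tolerance, otherwise trust the audio.
    return marked if abs(actual - marked) <= tolerance else actual
```

For example, `corrected_duration("sample.wav", marked=786.44)` would return 786.44 only if the file really is about that long, and the measured length otherwise (`sample.wav` is a placeholder path).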

luomingshuang commented 2 years ago

Thanks.
