Closed shreeshailgan closed 3 months ago
You can see the current version of the script for creating the benchmark dataset here: https://github.com/MontrealCorpusTools/mfa-models/blob/main/scripts/alignment_benchmarks/data_prep/create_buckeye_benchmark.py
In the 2017 Interspeech paper, section 3.1 (Datasets) includes the following sentence:
I am confused here. What is the >150 ms filtering applied to?
1] the length of the chunks (end_time of the last token - start_time of the first token in the chunk), i.e., chunks with duration < 150 ms are discarded
OR
2] the non-speech tokens used to separate the chunks, i.e., a non-speech token with duration < 150 ms does not split the ongoing chunk; instead it is included in the chunk and we continue.
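To make the two readings concrete, here is a minimal sketch of both interpretations. The token format (`(label, start_sec, end_sec)` tuples), the `<sil>` silence label, and the function names are my own assumptions for illustration, not taken from the `create_buckeye_benchmark.py` script:

```python
SIL = "<sil>"       # assumed non-speech label
MIN_DUR = 0.150     # 150 ms threshold

def chunks_reading_1(tokens):
    """Reading 1: split at every non-speech token, then discard any chunk
    whose total duration (last end_time - first start_time) is < 150 ms."""
    chunks, cur = [], []
    for label, start, end in tokens:
        if label == SIL:
            if cur:
                chunks.append(cur)
            cur = []
        else:
            cur.append((label, start, end))
    if cur:
        chunks.append(cur)
    # filter on chunk duration
    return [c for c in chunks if c[-1][2] - c[0][1] >= MIN_DUR]

def chunks_reading_2(tokens):
    """Reading 2: split only at non-speech tokens lasting >= 150 ms;
    shorter silences are kept inside the ongoing chunk."""
    chunks, cur = [], []
    for label, start, end in tokens:
        if label == SIL and (end - start) >= MIN_DUR:
            if cur:
                chunks.append(cur)
            cur = []
        else:
            cur.append((label, start, end))
    if cur:
        chunks.append(cur)
    return chunks

# Example: two silences, one short (80 ms) and one long (200 ms)
tokens = [
    ("hi",    0.00, 0.10),
    (SIL,     0.10, 0.18),  # 80 ms silence
    ("there", 0.18, 0.40),
    (SIL,     0.40, 0.60),  # 200 ms silence
    ("you",   0.60, 0.70),
]
```

On this example the two readings diverge: reading 1 splits at both silences and then drops the "hi" and "you" chunks for being under 150 ms, while reading 2 splits only at the 200 ms silence, keeping the 80 ms silence inside the first chunk.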