Closed danijel3 closed 8 years ago
These labels were generated using a GMM system based on MFCC features, so you should treat them mainly as examples. As you have observed, because different tools perform feature extraction differently, there might also be a mismatch in the number of frames.
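To make the frame-count point concrete, here is a minimal sketch of how one might expand an MLF into per-frame labels and reconcile the length with a feature file. It assumes the standard HTK MLF layout (start and end times in 100 ns units, followed by a label) and a 10 ms frame shift. The function names and the `max_diff` tolerance are hypothetical and are not taken from this repository.

```python
FRAME_SHIFT_HTK = 100_000  # assumed 10 ms frame shift, in HTK 100 ns units


def parse_mlf(text):
    """Return {utterance name: [(start, end, label), ...]} from HTK MLF text."""
    utterances, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "#!MLF!#":
            continue
        if line.startswith('"'):
            # A quoted line opens a new utterance block.
            current = utterances.setdefault(line.strip('"'), [])
        elif line == ".":
            # A single dot closes the current utterance block.
            current = None
        elif current is not None:
            # Assumes every label line carries start and end times.
            start, end, label = line.split()[:3]
            current.append((int(start), int(end), label))
    return utterances


def frame_labels(segments, frame_shift=FRAME_SHIFT_HTK):
    """Expand (start, end, label) segments into one label per frame."""
    labels = []
    for start, end, label in segments:
        labels.extend([label] * round((end - start) / frame_shift))
    return labels


def align_lengths(labels, num_feature_frames, max_diff=3):
    """Pad with the last label or truncate so labels match the feature count.

    Raises ValueError when the gap exceeds max_diff (a hypothetical tolerance),
    since a large gap suggests the labels and features do not belong together.
    """
    diff = num_feature_frames - len(labels)
    if not labels or abs(diff) > max_diff:
        raise ValueError(f"label/feature mismatch of {diff} frames")
    if diff > 0:
        return labels + [labels[-1]] * diff
    return labels[:num_feature_frames]


if __name__ == "__main__":
    sample = '#!MLF!#\n"*/utt1.lab"\n0 300000 sil\n300000 500000 ah\n.\n'
    segments = parse_mlf(sample)["*/utt1.lab"]
    print(align_lengths(frame_labels(segments), 6))
```

Padding or truncating by a frame or two is a common workaround for edge effects in windowing, but it only hides small differences. It does not make GMM-derived labels equivalent to the hand-made alignments.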
We encourage users to push their configs, NDL files, scripts, and models to this repository. In the near future you should see setups that produce state-of-the-art results on different tasks.
I was really thrilled to see some behind-the-scenes data posted for the TIMIT corpus. Many papers publish sequence-level results, but no one posts the actual state-level alignments or bigram models they use during decoding. That's annoying, because it makes it difficult to reproduce the exact results from those papers (e.g. by Hinton and the many others following him).
Nevertheless, I tried using the MLF files from this repo in some of my experiments and couldn't reproduce state-of-the-art results on simple things like MLP framewise classification. It turns out that the MLFs don't actually match the hand-made alignments provided in the corpus. Is this intentional? If so, am I right that I cannot use these labels for framewise classification and still get results that are comparable with others'? Also, do you use the "core test" MLFs anywhere, or do you simply compare the sequences and discard the time information?