Closed tongda closed 7 years ago
TED-LIUM release 1 is not currently supported
If you want to file an enhancement issue, or file an enhancement issue and make the associated pull request, both would be more than welcome. (Creating such a pull request likely will not be too difficult, as one can start with the current TED importer and make the required changes.)
@kdavis-mozilla Thanks. I am working on making DeepSpeech compatible with TED-LIUM release 1. I thought the two releases had the same structure, but it turns out they do not. Do you have any idea what the difference between release 1 and release 2 is, please?
@tongda Actually I don't know how the structures differ.
When I was originally working on the importer I only looked at the TED 2 data set. I assume, though it may not be the case, that they are not too different.
Hello @tongda, I am actually getting the same issue while trying to run DeepSpeech with the TED-LIUM release 1 dataset. Did you find a solution for this by any chance? Thank you.
@ApexNDSU No progress. I switched to the release 2 dataset for lack of time, but I hope to look into it when I am free.
Hi there, author of TEDLIUM 1 & 2 here. TBH, I don't really see the point in wanting to use only release 1 when release 2 contains everything release 1 has to offer, plus even more talks. And to complete the answer: there is no real structural difference between releases 1 and 2 besides the additional talks, so IMHO the root cause of your issue lies elsewhere. The only difference may be the lack of fillers in release 2 compared to release 1, I don't remember exactly. BTW, if you want, you can extract the release 1 talks from release 2 and give that a try, since release 2 works with this repo. As for your issue, the example you pasted smells to me like "n_targets > n_frames".
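For context, "n_targets > n_frames" refers to the CTC constraint that each label sequence must be no longer than the number of time frames in its audio. A minimal sketch of a pre-training sanity check, with made-up frame counts and transcripts for illustration:

```python
def violates_ctc_constraint(n_frames, transcript):
    # CTC needs at least one time step per output label (strictly, repeated
    # labels also need an extra blank step between them, which this simple
    # check ignores), so the transcript must fit within the frame count.
    return len(transcript) > n_frames

# Hypothetical utterances: (number of audio frames, character labels)
batch = [
    (59, list("hello world")),           # fine: 11 labels, 59 frames
    (6, list("a longer transcript")),    # bad: 19 labels, only 6 frames
]

bad = [i for i, (frames, labels) in enumerate(batch)
       if violates_ctc_constraint(frames, labels)]
print(bad)  # → [1]
```

If any utterance in a batch violates this constraint, `ctc_loss` will fail, so such samples are usually filtered out during import.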
Thank you both for your replies. For me, the reason to prefer release 1 was its smaller size. I was getting an OOM error with release 2, and reducing the batch size would make training take even longer. I will try your suggestion @antho-rousseau. Thank you!
@ApexNDSU The OOM can be removed by using a smaller batch size. The OOM is independent of the size of the corpus, TED v1 vs TED v2.
Training time is dependent upon corpus size. However, you can use the command line parameters[1]

- limit_train - maximum number of elements to use from the train set (0 means no limit)
- limit_dev - maximum number of elements to use from the validation set (0 means no limit)
- limit_test - maximum number of elements to use from the test set (0 means no limit)

to use only a subset of TED v2.
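The effect of such a flag is simply to truncate each set. A hypothetical sketch of the same idea applied by hand to a DeepSpeech-style CSV manifest (the file contents and column names here are illustrative, not taken from the actual importer):

```python
import csv
import io

def limit_set(rows, limit):
    """Keep at most `limit` elements; 0 means no limit."""
    return rows if limit == 0 else rows[:limit]

# Illustrative manifest in the wav_filename,wav_filesize,transcript style
manifest = io.StringIO(
    "wav_filename,wav_filesize,transcript\n"
    "a.wav,100,hello\n"
    "b.wav,200,world\n"
    "c.wav,300,again\n"
)
rows = list(csv.DictReader(manifest))
print(len(limit_set(rows, 2)))  # 2
print(len(limit_set(rows, 0)))  # 3
```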
Yes, I can limit the corpus size. Thank you so much @kdavis-mozilla. Just a quick concern: if I do so, I won't have a WER to compare and check against, right? Also, may I ask what WER you got for TED v2?
@ApexNDSU In limiting the corpus size you can compute WER as before.
However, WER comparisons with other full TED v2 runs will unfortunately be an apples vs oranges comparison.
We've done preliminary TED v2 WER work, training on the full TED v2 training data set, and got WERs in the mid-20% range. But we didn't take much time to tune our language model, which needs work, or to use multiple decoder results.
Agree. Thank you so much for the clarity @kdavis-mozilla
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
I changed the corpus from TED-LIUM release 2 to release 1. When the training has just started, I got exception like this:
As I can see, the batch shape is [16, 59], but the exception said the index should be less than [16, 6]. I am not familiar with ctc_loss; can anyone explain a little about what happened here for me, please? BTW, the code runs properly on the TED-LIUM release 2 corpus.
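For readers hitting the same error: ctc_loss takes its labels as a sparse tensor whose dense shape is roughly [batch_size, max_label_length], and an "index must be less than [16, 6]" style error means some label index points outside that shape. A toy sketch of the mismatch, with made-up indices and shapes (this only illustrates the bounds check, not the actual TensorFlow internals):

```python
import numpy as np

def to_dense(indices, values, dense_shape):
    """Rebuild a dense label matrix from sparse (index, value) pairs,
    roughly how a CTC label SparseTensor is laid out."""
    dense = np.zeros(dense_shape, dtype=int)
    for (row, col), val in zip(indices, values):
        if row >= dense_shape[0] or col >= dense_shape[1]:
            raise IndexError("index (%d, %d) is out of bounds for shape %s"
                             % (row, col, dense_shape))
        dense[row, col] = val
    return dense

# Labels were encoded for transcripts up to 59 characters long...
indices = [(0, 0), (0, 58)]
values = [7, 12]

# ...but the dense shape says the longest transcript has 6 characters.
try:
    to_dense(indices, values, (16, 6))
except IndexError as e:
    print("failed:", e)
```

With a consistent shape such as (16, 59) the same indices would be accepted, which is why the mismatch points at the label encoding rather than at ctc_loss itself.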