Does adding --normalize help?
Yes, I suspect that state["cfg"]["model"] is correct, since we recently migrated to the Hydra configuration, which uses "cfg" instead of "args". Can you please submit a PR with the fix?
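For context, a minimal sketch of the kind of fallback this fix implies, assuming fairseq's checkpoint_utils.load_checkpoint_to_cpu is used and with w2v_path as a hypothetical checkpoint path; the actual change in wav2vec2_asr.py may differ:

```python
from fairseq import checkpoint_utils

# Hypothetical path to the pre-trained wav2vec 2.0 checkpoint.
w2v_path = "wav2vec_small.pt"

state = checkpoint_utils.load_checkpoint_to_cpu(w2v_path)

# Checkpoints written after the Hydra migration store the model config under
# "cfg"; older checkpoints keep it under "args". Prefer "cfg" and fall back.
if state.get("cfg") is not None:
    w2v_args = state["cfg"]["model"]
else:
    w2v_args = state["args"]
```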
Hi, @myleott I have submitted the PR with the fix.
@alexeib a small question: in the fine-tuning with an LM, I am getting Error in `python': corrupted double-linked list: 0x000055d8f2c35210. I attached the full traceback, LM_error_traceback.txt. The LM was created using KenLM with bin/lmplz -o 5 <text >text.arpa followed by bin/build_binary trie text.arpa text.binary, and I have also tried one without trie. Can you take a look, tell me what I am doing wrong, and suggest a possible fix?
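One quick way to rule out a broken LM binary is to load it with the KenLM Python bindings before handing it to the decoder. A small sketch, assuming the kenlm Python package is installed and text.binary is the file built above:

```python
import kenlm

# Load the binary LM produced by build_binary; a corrupted or incompatible
# binary usually fails already at this step.
model = kenlm.Model("text.binary")

# Query a sample sentence (returns a log10 probability) to confirm scoring works.
print(model.score("this is a test", bos=True, eos=True))
```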
@amant555 thx for the fix, i added a comment to your pr. re: your crash, i see there is a warning printed at the very top to install the wav2letter bindings. did you do this? if you go to a python repl and type "import wav2letter", does it work?
Yeah, import wav2letter works in the repl. The warning appears because the code is not able to import LexiconFreeDecoder from wav2letter. Since that is not used by the KenLM decoder, I think it won't be the reason for the error, right?
can you comment out the LexiconFreeDecoder import then?
I did that too, but the error didn't change.
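For reference, a quick repl check along the lines suggested above. The wav2letter.decoder module path is an assumption based on the older Python bindings; adjust it to whatever your installation exposes:

```python
import wav2letter

# Shows which wav2letter installation is actually being picked up.
print(wav2letter.__file__)

# The warning discussed above comes from this optional import; it is only
# needed for lexicon-free decoding, not for the KenLM lexicon decoder.
try:
    from wav2letter.decoder import LexiconFreeDecoder  # assumed module path
    print("LexiconFreeDecoder available")
except ImportError as err:
    print("LexiconFreeDecoder not available:", err)
```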
happens to me too...
I already investigated it; it happens when this script executes line 156. That line sits inside an iteration over the lexicon word index, the spelling indices, and the score, and while iterating, the script inserts all of that into the trie with this command: self.trie.insert(spelling_idxs, word_idx, score). The error occurs when that call executes. It is not caused by the words or spellings in lexicon.txt.
Please read this to understand what happens: it is clearly not the words/spellings in the lexicon but something else, because sometimes a word/spelling can be inserted into the trie and at other times it cannot.
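One way to narrow this down is to exercise the trie bindings directly, outside fairseq. A minimal sketch, assuming the older wav2letter Python bindings expose Trie and SmearingMode under wav2letter.decoder; the vocabulary size and indices below are made up for illustration:

```python
from wav2letter.decoder import SmearingMode, Trie

vocab_size = 32   # hypothetical token vocabulary size
silence_idx = 0   # hypothetical index of the silence token

trie = Trie(vocab_size, silence_idx)

# Same call shape as the crashing line: spelling indices, word index, LM score.
trie.insert([5, 7, 9], 1, -1.5)
trie.smear(SmearingMode.MAX)

print("trie insert/smear completed without crashing")
```

If this already segfaults or corrupts memory, the problem is in the wav2letter build itself rather than in the lexicon or the fairseq decoder code.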
I already tried updating fairseq to v0.10.0, but the problem still occurs. Two different errors are reported: sometimes segmentation fault (core dumped), sometimes corrupted double-linked list. Or is it a memory problem?
@wahyubram82 were you able to figure out what the issue was? I'm running into the same problem here
@pkadambi maybe your wav2letter installation was the culprit. See https://github.com/pytorch/fairseq/issues/2493#issuecomment-755035532.
🐛 Bug
I pre-trained the model, and when I used the best checkpoint for fine-tuning, I received the error below.
After further investigation of the code in wav2vec2_asr.py, I found that the state is being fetched from args in the saved checkpoint, which has a None value; it is the same for all the checkpoints that are generated. I also analysed the checkpoint given in the README on the wav2vec page, and its args values are properly populated. In addition, newly generated checkpoints contain an extra key (cfg).
The code starts working again once I change the state being fetched from state["args"] to state["cfg"]["model"] in wav2vec2_asr.py.
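A quick way to see the difference described above (the checkpoint filename is a placeholder):

```python
import torch

# Inspect a checkpoint produced after the Hydra migration.
state = torch.load("checkpoint_best.pt", map_location="cpu")

print(state.keys())           # newer checkpoints include "cfg"; "args" may be None
print(state["args"])          # None in the newly generated checkpoints
print(state["cfg"]["model"])  # the model configuration now lives here
```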
Complete error traceback
Command used for FINE-TUNING
Command used for PRE-TRAINING
Questions