facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

Failed to run prepare_text.py for wav2vec-U #3662

Closed: soroushhashemifar closed this issue 3 years ago

soroushhashemifar commented 3 years ago

❓ Questions and Help

What is your question?

I followed the steps described here, but running prepare_text.py fails with the following errors:

vi
vi
/content/corpus.txt
/content/output_dir
min phone seen threshold is 4
Warning : `load_model` does not return WordVectorModel or SupervisedModel any more, but a `FastText` object which is very similar.
2021-06-29 08:22:14 | INFO | fairseq_cli.preprocess | Namespace(align_suffix=None, alignfile=None, all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, azureml_logging=False, bf16=False, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='/content/output_dir', dict_only=True, empty_cache_freq=0, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_file=None, log_format=None, log_interval=100, lr_scheduler='fixed', memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, model_parallel_size=1, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, on_cpu_convert_precision=False, only_source=True, optimizer=None, padding_factor=1, plasma_path='/tmp/plasma', profile=False, quantization_config_path=None, reset_logging=False, scoring='bleu', seed=1, source_lang=None, srcdict=None, suppress_crashes=False, target_lang=None, task='translation', tensorboard_logdir=None, testpref=None, tgtdict=None, threshold_loss_scale=None, thresholdsrc=2, thresholdtgt=0, tokenizer=None, tpu=False, trainpref='/content/output_dir/lm.upper.lid.txt', use_plasma_view=False, user_dir=None, validpref=None, wandb_project=None, workers=1)
[WARNING] 30 utterances containing language switches on lines 123, 363, 397, 429, 473, 498, 563, 608, 669, 718, 728, 868, 926, 928, 935, 956, 1048, 1050, 1063, 1067, 1109, 1191, 1202, 1222, 1394, 1404, 1416, 1452, 1463, 1607
[WARNING] extra phones may appear in the "vi" phoneset
[WARNING] language switch flags have been removed (applying "remove-flags" policy)
one is m o6 t̪  
2021-06-29 08:22:50 | INFO | fairseq_cli.preprocess | Namespace(align_suffix=None, alignfile=None, all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, azureml_logging=False, bf16=False, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='/content/output_dir/phones', dict_only=True, empty_cache_freq=0, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_file=None, log_format=None, log_interval=100, lr_scheduler='fixed', memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, model_parallel_size=1, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, on_cpu_convert_precision=False, only_source=True, optimizer=None, padding_factor=1, plasma_path='/tmp/plasma', profile=False, quantization_config_path=None, reset_logging=False, scoring='bleu', seed=1, source_lang=None, srcdict=None, suppress_crashes=False, target_lang=None, task='translation', tensorboard_logdir=None, testpref=None, tgtdict=None, threshold_loss_scale=None, thresholdsrc=4, thresholdtgt=0, tokenizer=None, tpu=False, trainpref='/content/output_dir/phones.txt', use_plasma_view=False, user_dir=None, validpref=None, wandb_project=None, workers=1)
2021-06-29 08:22:53 | INFO | fairseq_cli.preprocess | Namespace(align_suffix=None, alignfile=None, all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, azureml_logging=False, bf16=False, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='/content/output_dir/phones', dict_only=False, empty_cache_freq=0, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_file=None, log_format=None, log_interval=100, lr_scheduler='fixed', memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, model_parallel_size=1, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, on_cpu_convert_precision=False, only_source=True, optimizer=None, padding_factor=8, plasma_path='/tmp/plasma', profile=False, quantization_config_path=None, reset_logging=False, scoring='bleu', seed=1, source_lang=None, srcdict='/content/output_dir/phones/dict.phn.txt', suppress_crashes=False, target_lang=None, task='translation', tensorboard_logdir=None, testpref=None, tgtdict=None, threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, tpu=False, trainpref='/content/output_dir/phones/lm.phones.filtered.txt', use_plasma_view=False, user_dir=None, validpref=None, wandb_project=None, workers=70)
2021-06-29 08:22:53 | INFO | fairseq_cli.preprocess | [None] Dictionary: 151 types
2021-06-29 08:22:54 | INFO | fairseq_cli.preprocess | [None] /content/output_dir/phones/lm.phones.filtered.txt: 176 sents, 11472 tokens, 0.0% replaced by <unk>
2021-06-29 08:22:54 | INFO | fairseq_cli.preprocess | Wrote preprocessed data to /content/output_dir/phones
=== 1/5 Counting and sorting n-grams ===
Reading /content/output_dir/lm.upper.lid.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
tcmalloc: large alloc 2174918656 bytes == 0x559529f22000 @  0x7f2cafe271e7 0x5595276b07b2 0x55952764b52e 0x55952762a2fb 0x559527616076 0x7f2caeb9ebf7 0x559527617bba
tcmalloc: large alloc 8699674624 bytes == 0x5595ab94c000 @  0x7f2cafe271e7 0x5595276b07b2 0x55952769f7da 0x5595276a0218 0x55952762a318 0x559527616076 0x7f2caeb9ebf7 0x559527617bba
****************************************************************************************************
Unigram tokens 34613 types 2552
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:30624 2:1855272448 3:3478636032 4:5565817856
tcmalloc: large alloc 5565825024 bytes == 0x559529f22000 @  0x7f2cafe271e7 0x5595276b07b2 0x55952769f7da 0x5595276a0218 0x55952762a8e7 0x559527616076 0x7f2caeb9ebf7 0x559527617bba
tcmalloc: large alloc 1855275008 bytes == 0x559675b2e000 @  0x7f2cafe271e7 0x5595276b07b2 0x55952769f7da 0x5595276a0218 0x55952762aced 0x559527616076 0x7f2caeb9ebf7 0x559527617bba
tcmalloc: large alloc 3478642688 bytes == 0x5597b2a98000 @  0x7f2cafe271e7 0x5595276b07b2 0x55952769f7da 0x5595276a0218 0x55952762aced 0x559527616076 0x7f2caeb9ebf7 0x559527617bba
Statistics:
1 2552 D1=0.600838 D2=1.03014 D3+=1.4173
2 16178 D1=0.785762 D2=1.29175 D3+=1.39604
3 23650 D1=0.880488 D2=1.4018 D3+=1.85431
4 849/26038 D1=0.854156 D2=1.11662 D3+=1.01682
Memory estimate for binary LM:
type      kB
probing 1013 assuming -p 1.5
probing 1256 assuming -r models -p 1.5
trie     487 without quantization
trie     261 assuming -q 8 -b 8 quantization 
trie     449 assuming -a 22 array pointer compression
trie     224 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
=== 3/5 Calculating and sorting initial probabilities ===
Chain sizes: 1:30624 2:258848 3:473000 4:20376
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
***#################################################################################################
=== 4/5 Calculating and writing order-interpolated probabilities ===
Chain sizes: 1:30624 2:258848 3:473000 4:20376
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 5/5 Writing ARPA model ===
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Name:lmplz  VmPeak:14372656 kB  VmRSS:2147092 kB    RSSMax:2151332 kB   user:0.23438    sys:0.985396    CPU:1.21982 real:1.2154
Reading /content/output_dir/kenlm.wrd.o40003.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
SUCCESS
[2021-06-29 08:22:57,374][__main__][INFO] - Creating /content/output_dir/fst/phn_to_words_sil/kaldi_dict.phn.txt
[2021-06-29 08:22:57,374][__main__][INFO] - Creating /content/output_dir/fst/phn_to_words_sil/G_kenlm.wrd.o40003.fst
/content/kaldi/src/lmbin/arpa2fst --disambig-symbol=#0 --write-symbol-table=/content/output_dir/fst/phn_to_words_sil/kaldi_dict.kenlm.wrd.o40003.txt /content/output_dir/kenlm.wrd.o40003.arpa /content/output_dir/fst/phn_to_words_sil/G_kenlm.wrd.o40003.fst 
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \4-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:RemoveRedundantStates():arpa-lm-compiler.cc:359) Reduced num-states from 41301 to 19109
[2021-06-29 08:22:57,918][__main__][INFO] - Creating /content/output_dir/fst/phn_to_words_sil/kaldi_lexicon.phn.kenlm.wrd.o40003.txt (in units file: /content/output_dir/fst/phn_to_words_sil/kaldi_dict.phn.txt)
[2021-06-29 08:22:57,985][__main__][INFO] - Creating /content/output_dir/fst/phn_to_words_sil/H.phn.fst
[2021-06-29 08:22:58,848][__main__][INFO] - Creating /content/output_dir/fst/phn_to_words_sil/L.phn.kenlm.wrd.o40003.fst (in units: /content/output_dir/fst/phn_to_words_sil/kaldi_dict.phn_disambig.txt)
[2021-06-29 08:22:59,014][__main__][INFO] - Creating /content/output_dir/fst/phn_to_words_sil/LG.phn.kenlm.wrd.o40003.fst
[2021-06-29 08:22:59,731][__main__][INFO] - Creating /content/output_dir/fst/phn_to_words_sil/HLGa.phn.kenlm.wrd.o40003.fst
[2021-06-29 08:23:01,075][__main__][INFO] - Creating /content/output_dir/fst/phn_to_words_sil/HLG.phn.kenlm.wrd.o40003.fst
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc: In function ‘int32 {anonymous}::AddSelfLoopsSimple(fst::StdVectorFst*)’:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:31:47: error: no match for ‘operator<<’ (operand types are ‘kaldi::MessageLogger’ and ‘<unresolved overloaded function type>’)
             << num_states_after << " states " << std::endl;

In file included from /content/kaldi/src/base/kaldi-common.h:35:0,
                 from /content/kaldi/src/util/stl-utils.h:33,
                 from /content/kaldi/src/util/const-integer-set.h:28,
                 from /content/kaldi/src/fstext/context-fst.h:61,
                 from /content/kaldi/src/fstext/fstext-lib.h:23,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:9:
/content/kaldi/src/base/kaldi-error.h:119:40: note: candidate: template<class T> kaldi::MessageLogger& kaldi::MessageLogger::operator<<(const T&)
   template <typename T> MessageLogger &operator<<(const T &val) {
                                        ^~~~~~~~
/content/kaldi/src/base/kaldi-error.h:119:40: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:31:55: note:   couldn't deduce template parameter ‘T’
             << num_states_after << " states " << std::endl;
                                                       ^~~~
In file included from /content/kaldi/src/matrix/sp-matrix.h:27:0,
                 from /content/kaldi/src/matrix/matrix-lib.h:28,
                 from /content/kaldi/src/util/table-types.h:26,
                 from /content/kaldi/src/util/common-utils.h:28,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:10:
/content/kaldi/src/matrix/packed-matrix.h:181:16: note: candidate: template<class Real> std::ostream& kaldi::operator<<(std::ostream&, const kaldi::PackedMatrix<Real>&)
 std::ostream & operator << (std::ostream & os, const PackedMatrix<Real>& M) {
                ^~~~~~~~
/content/kaldi/src/matrix/packed-matrix.h:181:16: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:31:55: note:   couldn't deduce template parameter ‘Real’
             << num_states_after << " states " << std::endl;
                                                       ^~~~
In file included from /content/kaldi/src/matrix/kaldi-vector.h:608:0,
                 from /content/kaldi/src/matrix/kaldi-matrix-inl.h:23,
                 from /content/kaldi/src/matrix/kaldi-matrix.h:1119,
                 from /content/kaldi/src/util/kaldi-io.h:31,
                 from /content/kaldi/src/fstext/fstext-utils-inl.h:27,
                 from /content/kaldi/src/fstext/fstext-utils.h:427,
                 from /content/kaldi/src/fstext/deterministic-fst-inl.h:25,
                 from /content/kaldi/src/fstext/deterministic-fst.h:333,
                 from /content/kaldi/src/fstext/context-fst.h:62,
                 from /content/kaldi/src/fstext/fstext-lib.h:23,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:9:
/content/kaldi/src/matrix/kaldi-vector-inl.h:30:16: note: candidate: template<class Real> std::ostream& kaldi::operator<<(std::ostream&, const kaldi::VectorBase<Real>&)
 std::ostream & operator << (std::ostream &os, const VectorBase<Real> &rv) {
                ^~~~~~~~
/content/kaldi/src/matrix/kaldi-vector-inl.h:30:16: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:31:55: note:   couldn't deduce template parameter ‘Real’
             << num_states_after << " states " << std::endl;
                                                       ^~~~
In file included from /content/kaldi/src/matrix/kaldi-matrix.h:1119:0,
                 from /content/kaldi/src/util/kaldi-io.h:31,
                 from /content/kaldi/src/fstext/fstext-utils-inl.h:27,
                 from /content/kaldi/src/fstext/fstext-utils.h:427,
                 from /content/kaldi/src/fstext/deterministic-fst-inl.h:25,
                 from /content/kaldi/src/fstext/deterministic-fst.h:333,
                 from /content/kaldi/src/fstext/context-fst.h:62,
                 from /content/kaldi/src/fstext/fstext-lib.h:23,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:9:
/content/kaldi/src/matrix/kaldi-matrix-inl.h:41:23: note: candidate: template<class Real> std::ostream& kaldi::operator<<(std::ostream&, const kaldi::MatrixBase<Real>&)
 inline std::ostream & operator << (std::ostream & os, const MatrixBase<Real> & M) {
                       ^~~~~~~~
/content/kaldi/src/matrix/kaldi-matrix-inl.h:41:23: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:31:55: note:   couldn't deduce template parameter ‘Real’
             << num_states_after << " states " << std::endl;
                                                       ^~~~
In file included from /content/kaldi/src/util/kaldi-table-inl.h:28:0,
                 from /content/kaldi/src/util/kaldi-table.h:469,
                 from /content/kaldi/src/util/common-utils.h:27,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:10:
/usr/include/c++/6/thread:263:5: note: candidate: template<class _CharT, class _Traits> std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, std::thread::id)
     operator<<(basic_ostream<_CharT, _Traits>& __out, thread::id __id)
     ^~~~~~~~
/usr/include/c++/6/thread:263:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:31:55: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<_CharT, _Traits>’
             << num_states_after << " states " << std::endl;
                                                       ^~~~
In file included from /usr/include/c++/6/random:51:0,
                 from /content/kaldi/tools/openfst-1.6.7/include/fst/randgen.h:14,
                 from /content/kaldi/tools/openfst-1.6.7/include/fst/randequivalent.h:15,
                 from /content/kaldi/tools/openfst-1.6.7/include/fst/fstlib.h:61,
                 from /content/kaldi/src/fstext/fstext-lib.h:22,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:9:
/usr/include/c++/6/bits/random.tcc:3160:5: note: candidate: template<class _RealType, class _CharT, class _Traits> std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, const std::piecewise_linear_distribution<_RealType>&)
     operator<<(std::basic_ostream<_CharT, _Traits>& __os,
     ^~~~~~~~
/usr/include/c++/6/bits/random.tcc:3160:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:31:55: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<_CharT, _Traits>’
             << num_states_after << " states " << std::endl;
                                                       ^~~~
...
/usr/include/c++/6/ostream:569:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<char, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/iostream:39:0,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/ostream:556:5: note: candidate: template<class _Traits> std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, _Traits>&, const char*)
     operator<<(basic_ostream<char, _Traits>& __out, const char* __s)
     ^~~~~~~~
/usr/include/c++/6/ostream:556:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<char, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/ostream:638:0,
                 from /usr/include/c++/6/iostream:39,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/bits/ostream.tcc:321:5: note: candidate: template<class _CharT, class _Traits> std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, const char*)
     operator<<(basic_ostream<_CharT, _Traits>& __out, const char* __s)
     ^~~~~~~~
/usr/include/c++/6/bits/ostream.tcc:321:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<_CharT, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/iostream:39:0,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/ostream:539:5: note: candidate: template<class _CharT, class _Traits> std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, const _CharT*)
     operator<<(basic_ostream<_CharT, _Traits>& __out, const _CharT* __s)
     ^~~~~~~~
/usr/include/c++/6/ostream:539:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<_CharT, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/iostream:39:0,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/ostream:519:5: note: candidate: template<class _Traits> std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, _Traits>&, unsigned char)
     operator<<(basic_ostream<char, _Traits>& __out, unsigned char __c)
     ^~~~~~~~
/usr/include/c++/6/ostream:519:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<char, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/iostream:39:0,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/ostream:514:5: note: candidate: template<class _Traits> std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, _Traits>&, signed char)
     operator<<(basic_ostream<char, _Traits>& __out, signed char __c)
     ^~~~~~~~
/usr/include/c++/6/ostream:514:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<char, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/iostream:39:0,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/ostream:508:5: note: candidate: template<class _Traits> std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, _Traits>&, char)
     operator<<(basic_ostream<char, _Traits>& __out, char __c)
     ^~~~~~~~
/usr/include/c++/6/ostream:508:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<char, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/iostream:39:0,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/ostream:502:5: note: candidate: template<class _CharT, class _Traits> std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, char)
     operator<<(basic_ostream<_CharT, _Traits>& __out, char __c)
     ^~~~~~~~
/usr/include/c++/6/ostream:502:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<_CharT, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/iostream:39:0,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/ostream:497:5: note: candidate: template<class _CharT, class _Traits> std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, _CharT)
     operator<<(basic_ostream<_CharT, _Traits>& __out, _CharT __c)
     ^~~~~~~~
/usr/include/c++/6/ostream:497:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<_CharT, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/bits/ios_base.h:46:0,
                 from /usr/include/c++/6/ios:42,
                 from /usr/include/c++/6/ostream:38,
                 from /usr/include/c++/6/iostream:39,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/system_error:209:5: note: candidate: template<class _CharT, class _Traits> std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, const std::error_code&)
     operator<<(basic_ostream<_CharT, _Traits>& __os, const error_code& __e)
     ^~~~~~~~
/usr/include/c++/6/system_error:209:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<_CharT, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
In file included from /usr/include/c++/6/string:52:0,
                 from /usr/include/c++/6/bits/locale_classes.h:40,
                 from /usr/include/c++/6/bits/ios_base.h:41,
                 from /usr/include/c++/6/ios:42,
                 from /usr/include/c++/6/ostream:38,
                 from /usr/include/c++/6/iostream:39,
                 from /content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:8:
/usr/include/c++/6/bits/basic_string.h:5352:5: note: candidate: template<class _CharT, class _Traits, class _Alloc> std::basic_ostream<_CharT, _Traits>& std::operator<<(std::basic_ostream<_CharT, _Traits>&, const std::__cxx11::basic_string<_CharT, _Traits, _Alloc>&)
     operator<<(basic_ostream<_CharT, _Traits>& __os,
     ^~~~~~~~
/usr/include/c++/6/bits/basic_string.h:5352:5: note:   template argument deduction/substitution failed:
/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc:91:52: note:   ‘kaldi::MessageLogger’ is not derived from ‘std::basic_ostream<_CharT, _Traits>’
   KALDI_LOG << "Writing FST to " << output << std::endl;
                                                    ^~~~
Traceback (most recent call last):
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 589, in create_HLG
    check=True,
  File "/usr/local/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['c++', '-I/content/kaldi/src', '-I/content/kaldi/tools/openfst-1.6.7/include', '-L/content/kaldi/src/lib', PosixPath('/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple.cc'), '-lkaldi-base', '-lkaldi-fstext', '-o', PosixPath('/content/fairseq/examples/speech_recognition/kaldi/add-self-loop-simple')]' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 698, in <module>
    cli_main()
  File "/usr/local/lib/python3.7/site-packages/hydra/main.py", line 37, in decorated_main
    strict=strict,
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 347, in _run_hydra
    lambda: hydra.run(
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 350, in <lambda>
    overrides=args.overrides,
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 112, in run
    configure_logging=with_log_configuration,
  File "/usr/local/lib/python3.7/site-packages/hydra/core/utils.py", line 127, in run_job
    ret.return_value = task_function(task_cfg)
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 677, in cli_main
    initalize_kaldi(cfg)
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 662, in initalize_kaldi
    hlg_graph = create_HLG(kaldi_root, fst_dir, unique_label, hlga_graph)
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 606, in create_HLG
    logger.error(f"cmd: {e.cmd}, err: {e.stderr.decode('utf-8')}")
AttributeError: 'NoneType' object has no attribute 'decode'
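
Note on the two tracebacks above: the final AttributeError looks like a secondary failure. `CalledProcessError.stderr` is `None` when the child's stderr was not captured, and the compiler output appearing directly in this log suggests it was not, so `e.stderr.decode('utf-8')` raises and hides the real problem. The real problem is the C++ compile error, where `std::endl` is streamed into `KALDI_LOG` and `kaldi::MessageLogger::operator<<` cannot deduce its template parameter. Removing `std::endl` from those lines would likely let the file compile. Below is a minimal sketch of error reporting that tolerates a missing stderr. `run_and_report` is a hypothetical helper written for illustration and is not part of fairseq.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (ok, message).

    Hypothetical helper for illustration only; not fairseq code.
    """
    try:
        # Capture both streams so the exception carries the child's output.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except subprocess.CalledProcessError as e:
        # e.stderr is bytes only because stderr=PIPE was passed above.
        # Without capture it is None, so guard before decoding.
        err = e.stderr.decode("utf-8", errors="replace") if e.stderr else ""
        return False, f"cmd: {e.cmd}, exit status: {e.returncode}, err: {err}"
    return True, ""


# Demonstration with a child process that writes to stderr and fails.
ok, message = run_and_report(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
)
print(ok, message)
```

With a guard like this, the log would end with the compiler's own message instead of `'NoneType' object has no attribute 'decode'`.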
=== 1/5 Counting and sorting n-grams ===
Reading /content/output_dir/phones/lm.phones.filtered.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
tcmalloc: large alloc 2174918656 bytes == 0x55a1d1ecc000 @  0x7ffb82ea31e7 0x55a1d0ece7b2 0x55a1d0e6952e 0x55a1d0e482fb 0x55a1d0e34076 0x7ffb81c1abf7 0x55a1d0e35bba
tcmalloc: large alloc 8699674624 bytes == 0x55a2538f6000 @  0x7ffb82ea31e7 0x55a1d0ece7b2 0x55a1d0ebd7da 0x55a1d0ebe218 0x55a1d0e48318 0x55a1d0e34076 0x7ffb81c1abf7 0x55a1d0e35bba
****************************************************************************************************
Unigram tokens 11296 types 147
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:1764 2:1855277568 3:3478645504 4:5565832704
tcmalloc: large alloc 5565833216 bytes == 0x55a1d1ecc000 @  0x7ffb82ea31e7 0x55a1d0ece7b2 0x55a1d0ebd7da 0x55a1d0ebe218 0x55a1d0e488e7 0x55a1d0e34076 0x7ffb81c1abf7 0x55a1d0e35bba
tcmalloc: large alloc 1855283200 bytes == 0x55a31dad6000 @  0x7ffb82ea31e7 0x55a1d0ece7b2 0x55a1d0ebd7da 0x55a1d0ebe218 0x55a1d0e48ced 0x55a1d0e34076 0x7ffb81c1abf7 0x55a1d0e35bba
tcmalloc: large alloc 3478650880 bytes == 0x55a45aa4e000 @  0x7ffb82ea31e7 0x55a1d0ece7b2 0x55a1d0ebd7da 0x55a1d0ebe218 0x55a1d0e48ced 0x55a1d0e34076 0x7ffb81c1abf7 0x55a1d0e35bba
Statistics:
1 147 D1=0.291667 D2=0.816176 D3+=2.28986
2 1665 D1=0.576602 D2=1.0156 D3+=1.7868
3 4856 D1=0.746 D2=1.14335 D3+=1.42097
4 7264 D1=0.746082 D2=1.31869 D3+=1.50271
Memory estimate for binary LM:
type     kB
probing 284 assuming -p 1.5
probing 323 assuming -r models -p 1.5
trie    104 without quantization
trie     52 assuming -q 8 -b 8 quantization 
trie    100 assuming -a 22 array pointer compression
trie     48 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
=== 3/5 Calculating and sorting initial probabilities ===
Chain sizes: 1:1764 2:26640 3:97120 4:174336
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 4/5 Calculating and writing order-interpolated probabilities ===
Chain sizes: 1:1764 2:26640 3:97120 4:174336
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 5/5 Writing ARPA model ===
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Name:lmplz  VmPeak:14372664 kB  VmRSS:2145888 kB    RSSMax:2157908 kB   user:0.185599   sys:1.04275 CPU:1.22838 real:1.22858
Reading /content/output_dir/phones/lm.phones.filtered.04.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
SUCCESS
=== 1/5 Counting and sorting n-grams ===
Reading /content/output_dir/phones/lm.phones.filtered.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
tcmalloc: large alloc 1717043200 bytes == 0x561fb5394000 @  0x7f8c4d4e51e7 0x561fb31bc7b2 0x561fb315752e 0x561fb31362fb 0x561fb3122076 0x7f8c4c25cbf7 0x561fb3123bba
tcmalloc: large alloc 9157550080 bytes == 0x56201b914000 @  0x7f8c4d4e51e7 0x561fb31bc7b2 0x561fb31ab7da 0x561fb31ac218 0x561fb3136318 0x561fb3122076 0x7f8c4c25cbf7 0x561fb3123bba
****************************************************************************************************
Unigram tokens 11296 types 147
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:1764 2:670754240 3:1257664128 4:2012262656 5:2934549760 6:4024525312
tcmalloc: large alloc 4024532992 bytes == 0x561fb5394000 @  0x7f8c4d4e51e7 0x561fb31bc7b2 0x561fb31ab7da 0x561fb31ac218 0x561fb31368e7 0x561fb3122076 0x7f8c4c25cbf7 0x561fb3123bba
tcmalloc: large alloc 2012266496 bytes == 0x5621180cc000 @  0x7f8c4d4e51e7 0x561fb31bc7b2 0x561fb31ab7da 0x561fb31ac218 0x561fb3136ced 0x561fb3122076 0x7f8c4c25cbf7 0x561fb3123bba
tcmalloc: large alloc 2934554624 bytes == 0x56223df12000 @  0x7f8c4d4e51e7 0x561fb31bc7b2 0x561fb31ab7da 0x561fb31ac218 0x561fb3136ced 0x561fb3122076 0x7f8c4c25cbf7 0x561fb3123bba
Statistics:
1 147 D1=0.291667 D2=0.816176 D3+=2.28986
2 1665 D1=0.576602 D2=1.0156 D3+=1.7868
3 4856 D1=0.746 D2=1.14335 D3+=1.42097
4 7264 D1=0.838578 D2=1.32695 D3+=1.83417
5 8506 D1=0.905781 D2=1.51429 D3+=1.43324
6 9104 D1=0.878897 D2=1.43203 D3+=1.59948
Memory estimate for binary LM:
type     kB
probing 686 assuming -p 1.5
probing 817 assuming -r models -p 1.5
trie    277 without quantization
trie    132 assuming -q 8 -b 8 quantization 
trie    260 assuming -a 22 array pointer compression
trie    115 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
=== 3/5 Calculating and sorting initial probabilities ===
Chain sizes: 1:1764 2:26640 3:97120 4:174336 5:238168 6:291328
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 4/5 Calculating and writing order-interpolated probabilities ===
Chain sizes: 1:1764 2:26640 3:97120 4:174336 5:238168 6:291328
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 5/5 Writing ARPA model ===
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Name:lmplz  VmPeak:14005240 kB  VmRSS:1699192 kB    RSSMax:1718788 kB   user:0.16732    sys:0.764892    CPU:0.932259    real:0.933479
Reading /content/output_dir/phones/lm.phones.filtered.06.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
SUCCESS
[2021-06-29 08:23:21,151][__main__][INFO] - Creating /content/output_dir/fst/phn_to_phn_sil/kaldi_dict.phn.txt
[2021-06-29 08:23:21,152][__main__][INFO] - Creating /content/output_dir/fst/phn_to_phn_sil/G_lm.phones.filtered.06.fst
/content/kaldi/src/lmbin/arpa2fst --disambig-symbol=#0 --write-symbol-table=/content/output_dir/fst/phn_to_phn_sil/kaldi_dict.lm.phones.filtered.06.txt /content/output_dir/phones/lm.phones.filtered.06.arpa /content/output_dir/fst/phn_to_phn_sil/G_lm.phones.filtered.06.fst 
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.489~1-effe]:HeaderAvailable():arpa-lm-compiler.cc:300) Reverting to slower state tracking because model is large: 6-gram with symbols up to 150
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \4-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \5-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:Read():arpa-file-parser.cc:149) Reading \6-grams: section.
LOG (arpa2fst[5.5.489~1-effe]:RemoveRedundantStates():arpa-lm-compiler.cc:359) Reduced num-states from 22154 to 22153
[2021-06-29 08:23:21,290][__main__][INFO] - Creating /content/output_dir/fst/phn_to_phn_sil/kaldi_lexicon.phn.lm.phones.filtered.06.txt (in units file: /content/output_dir/fst/phn_to_phn_sil/kaldi_dict.phn.txt)
[2021-06-29 08:23:21,311][__main__][INFO] - Creating /content/output_dir/fst/phn_to_phn_sil/H.phn.fst
[2021-06-29 08:23:21,366][__main__][INFO] - Creating /content/output_dir/fst/phn_to_phn_sil/L.phn.lm.phones.filtered.06.fst (in units: /content/output_dir/fst/phn_to_phn_sil/kaldi_dict.phn_disambig.txt)
Traceback (most recent call last):
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 698, in <module>
    cli_main()
  File "/usr/local/lib/python3.7/site-packages/hydra/main.py", line 37, in decorated_main
    strict=strict,
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 347, in _run_hydra
    lambda: hydra.run(
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 350, in <lambda>
    overrides=args.overrides,
  File "/usr/local/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 112, in run
    configure_logging=with_log_configuration,
  File "/usr/local/lib/python3.7/site-packages/hydra/core/utils.py", line 127, in run_job
    ret.return_value = task_function(task_cfg)
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 677, in cli_main
    initalize_kaldi(cfg)
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 654, in initalize_kaldi
    out_words_file,
  File "/content/fairseq/examples/speech_recognition/kaldi/kaldi_initializer.py", line 205, in create_L
    assert len(res.stderr) == 0, res.stderr.decode("utf-8")
AssertionError: FATAL: FstCompiler: Symbol "iː1" is not mapped to any integer arc olabel, symbol table = /content/output_dir/fst/phn_to_phn_sil/kaldi_dict.lm.phones.filtered.06.txt, source = standard input, line = 112
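
One way to narrow down an error like this is to list the symbols that one table has and the other lacks. The following is only a rough sketch: it assumes both files use the OpenFst text symbol-table layout (one `symbol id` pair per line), and the helper names `read_symbols` and `missing_symbols` as well as the toy table contents are hypothetical, not part of fairseq or Kaldi.

```python
def read_symbols(table_text):
    """Return the set of symbols in an OpenFst-style text symbol table."""
    symbols = set()
    for line in table_text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            symbols.add(fields[0])
    return symbols


def missing_symbols(units_text, lm_table_text):
    """Symbols present in the units table but absent from the LM table."""
    return sorted(read_symbols(units_text) - read_symbols(lm_table_text))


# Toy tables standing in for kaldi_dict.phn.txt and
# kaldi_dict.lm.phones.filtered.06.txt (contents are made up).
units = "<eps> 0\na 1\niː1 2\n"
lm_table = "<eps> 0\na 1\n#0 2\n"
print(missing_symbols(units, lm_table))
```

On real data the two strings would come from reading the files named in the error message. Any symbol reported here is a candidate for the one that `FstCompiler` cannot map.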

Code

zsh ./fairseq/examples/wav2vec/unsupervised/scripts/prepare_text.sh vi ./corpus.txt ./output_dir 4 espeak ./lid.176.bin

I used a threshold of 4 because I have a small Vietnamese corpus.

#### What have you tried?

#### What's your environment?

- fairseq Version (e.g., 1.0 or master): just cloned the repo
- PyTorch Version (e.g., 1.0): 1.9.0+cu102
- OS (e.g., Linux): Ubuntu 18.04
- How you installed fairseq (`pip`, source): both pip and from source
- Build command you used (if compiling from source): pip install --editable ./fairseq
- Python version: 3.7.5
- CUDA/cuDNN version: 10.1
- GPU models and configuration: K80
- Any other relevant information: I also built Kaldi from source

cdleong commented 3 years ago

Just wanted to mention that #3591 may be relevant

roger-tseng commented 1 year ago

For those with similar problems, #3702 mentions a solution. examples/speech_recognition/kaldi/add-self-loop-simple.cc attempts to use std::endl with KALDI_LOG and fails. Removing every std::endl from that file fixes the issue.
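
The removal described above could be scripted roughly as follows. This is a minimal sketch, not the fix from #3702 itself: the function name `strip_std_endl` and the sample line are hypothetical, and it assumes every `std::endl` in the file appears as a stream insertion of the form `<< std::endl`.

```python
import re

# Matches a stream insertion of std::endl, including the leading "<<".
ENDL_PATTERN = re.compile(r"\s*<<\s*std::endl")


def strip_std_endl(cpp_source):
    """Return the C++ source text with every `<< std::endl` removed."""
    return ENDL_PATTERN.sub("", cpp_source)


# Made-up line in the style of a Kaldi log statement.
before = 'KALDI_LOG << "done" << std::endl;'
print(strip_std_endl(before))
```

To apply it, read the contents of add-self-loop-simple.cc, pass them through the function, write the result back, and rebuild. Reviewing the diff before rebuilding is worthwhile, since the pattern is purely textual.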