JoungheeKim / K-wav2vec

Apache License 2.0

TypeError: build_alphabet() got an unexpected keyword argument 'ctc_token_idx' #11

Open sangheonEN opened 2 days ago

sangheonEN commented 2 days ago

(kwav2vec_env_py3_8) root@4aa7addb6281:/home# bash script/inference/evaluate_multimodel.sh
INFO:__main__:Namespace(add_weight=0.5, additional_output=False, autoregressive=False, batch_size=8, batch_size_valid=8, beam=100, bf16=False, checkpoint_path='/home/save_checkpoint/finetune/ksponspeech/multi_model/checkpoint_best.pt', constraints=None, cpu=False, criterion='multi_ctc', data='/home/transcriptions/ksponspeech/grapheme_character_spelling', decoder='beam', decoding_format=None, del_silence=False, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, enable_padding=False, eval_wer=False, eval_wer_post_process='letter', eval_wer_tokenizer=None, experiments_dir='experiments/ksponspeech/multi_model_dev.csv', fp16=False, gen_subset='dev', iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, labels='ltr', lenpen=1, lm_path=None, lm_weight=0.0, log_format='tqdm', log_interval=1, match_source_len=False, max_len_a=0, max_len_b=200, max_sample_size=None, max_tokens=4000000, min_len=1, min_sample_size=None, multi_use_update=0, nbest=1, no_beamable_mm=False, no_early_stop=False, no_repeat_ngram_size=0, no_seed_provided=True, normalize=False, post_process='letter', prefix_size=0, print_alignment=None, print_step=False, replace_unk=None, results_path='eval_log/ksponspeech/multi_model_dev', retain_dropout=False, retain_dropout_modules=None, retain_iter_history=False, sacrebleu=False, sample_rate=16000, sampling=False, sampling_topk=-1, sampling_topp=-1.0, score_reference=False, seed=1, task='audio_multitraining', temperature=1.0, tpu=False, unkpen=0, unnormalized=False, wer_args=None, wer_kenlm_model=None, wer_lexicon=None, wer_lm_weight=2.0, wer_word_score=-1.0, zero_infinity=False)
INFO:__main__:| decoding with criterion multi_ctc
INFO:__main__:| loading model(s) from /home/save_checkpoint/finetune/ksponspeech/multi_model/checkpoint_best.pt
INFO:fairseq.data.audio.raw_audio_dataset:loaded 2545, skipped 0 samples
INFO:__main__:| /home/transcriptions/ksponspeech/grapheme_character_spelling dev 2545 examples
Traceback (most recent call last):
  File "inference/beam_search.py", line 902, in <module>
    wer, cer, swer = cli_main()
  File "inference/beam_search.py", line 836, in cli_main
    task, wer, cer, swer = main(args)
  File "inference/beam_search.py", line 704, in main
    generator = build_generator(args, tgt_dict, add_tgt_dict)
  File "inference/beam_search.py", line 697, in build_generator
    return BeamDecoder(args, tgt_dict, add_tgt_dict)
  File "inference/beam_search.py", line 336, in __init__
    alphabet = Alphabet.build_alphabet(vocab_list, ctc_token_idx=0)
TypeError: build_alphabet() got an unexpected keyword argument 'ctc_token_idx'

The error says that build_alphabet() has no 'ctc_token_idx' parameter. Has the code perhaps changed?

sangheonEN commented 2 days ago

Changing the pyctcdecode library version from 0.5.0 to 0.1.0 makes it work.
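
For anyone who cannot pin the older version, a possible alternative is to pass 'ctc_token_idx' only when the installed build_alphabet() still accepts it. The following is a minimal sketch, not code from this repository: call_with_supported_kwargs, build_alphabet_old, and build_alphabet_new are hypothetical names, and the two stand-in functions only imitate the two call shapes seen in this issue. Whether newer pyctcdecode releases decode correctly without the argument has not been verified here; downgrading is the only fix confirmed above.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments that its signature does not accept."""
    params = inspect.signature(func).parameters
    # A function that declares **kwargs accepts any keyword, so pass everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(*args, **kwargs)
    # Otherwise keep only keywords that name a parameter usable by keyword.
    by_keyword = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    supported = {
        name: value
        for name, value in kwargs.items()
        if name in params and params[name].kind in by_keyword
    }
    return func(*args, **supported)


# Hypothetical stand-ins for the two API shapes (NOT pyctcdecode code).
def build_alphabet_old(labels, ctc_token_idx=None):
    return ("old", list(labels), ctc_token_idx)


def build_alphabet_new(labels):
    return ("new", list(labels))


if __name__ == "__main__":
    vocab = ["a", "b"]
    print(call_with_supported_kwargs(build_alphabet_old, vocab, ctc_token_idx=0))
    print(call_with_supported_kwargs(build_alphabet_new, vocab, ctc_token_idx=0))
```

Under these assumptions, the failing line in inference/beam_search.py could be written as alphabet = call_with_supported_kwargs(Alphabet.build_alphabet, vocab_list, ctc_token_idx=0). The simpler route remains pinning the version that is known to work, for example with pip install pyctcdecode==0.1.0.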