openspeech-team / openspeech

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
https://openspeech-team.github.io/openspeech/
MIT License

How to use hydra_lm_train.py #216

Open apg0001 opened 7 months ago

apg0001 commented 7 months ago

❓ Questions & Help

Thank you for your hard work. I am a student studying speech recognition. I got stuck while using openspeech and have a few questions.

  1. When I run hydra_lm_train.py, I get the error shown below. It says tokenizer is an unexpected keyword argument, yet if I omit the tokenizer option it tells me to specify one... Could the problem be my environment?
  2. What is the role of hydra_lm_train.py? Does it train a language model by itself, or does it attach a language model to an acoustic model to build a new combined model? If neither, is there another way to attach a separate language model to an acoustic model? (See the fusion sketch after this list.)
  3. Could you share example code for using hydra_lm_train.py?
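
For context on question 2: in most end-to-end ASR toolkits, a script like hydra_lm_train.py trains a standalone language model, and the trained LM is then combined with the acoustic model at decode time, for example via shallow fusion (log-linear interpolation of the two models' scores). The sketch below illustrates shallow fusion in general terms; it is not openspeech's actual API, and all names in it are illustrative.

```python
import torch

def shallow_fusion_step(am_log_probs: torch.Tensor,
                        lm_log_probs: torch.Tensor,
                        lm_weight: float = 0.3) -> torch.Tensor:
    """Fuse acoustic-model and language-model scores for one decoding step.

    am_log_probs / lm_log_probs: (vocab_size,) log-probabilities over the
    vocabulary at the current step; lm_weight is a tunable interpolation weight.
    """
    return am_log_probs + lm_weight * lm_log_probs

# Toy usage: greedily pick the next token from the fused scores.
vocab_size = 10
am_scores = torch.log_softmax(torch.randn(vocab_size), dim=-1)
lm_scores = torch.log_softmax(torch.randn(vocab_size), dim=-1)
next_token = shallow_fusion_step(am_scores, lm_scores).argmax().item()
```

In a real beam-search decoder the same fusion is applied to every hypothesis at every step, with lm_weight tuned on a development set.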

Thank you.

Details

Input:

```
python ./openspeech_cli/hydra_lm_train.py dataset=ksponspeech dataset.dataset_path=C:\Users\lab1080\Desktop\openspeech\KsponSpeech dataset.test_dataset_path=C:\Users\lab1080\Desktop\openspeech\KsponSpeech_eval dataset.test_manifest_dir=C:\Users\lab1080\Desktop\openspeech\KsponSpeech_scripts dataset.manifest_file_path=C:\Users\lab1080\Desktop\openspeech\KSPONSPEECH_AUTO_MANIFEST model=listen_attend_spell lr_scheduler=warmup_reduce_lr_on_plateau trainer=gpu criterion=cross_entropy tokenizer=kspon_character
```
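
One thing worth noting: this command selects model=listen_attend_spell and criterion=cross_entropy, i.e. an acoustic-model configuration, even though hydra_lm_train.py is the language-model entry point. A hypothetical LM-oriented invocation might look like the following; the model and criterion group names (lstm_lm, perplexity) are assumptions based on the model list in the openspeech README, so verify them against the configs in your checkout.

```
python ./openspeech_cli/hydra_lm_train.py dataset=ksponspeech ... model=lstm_lm criterion=perplexity tokenizer=kspon_character
```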

Output:

```
[2024-01-25 13:34:14,597][openspeech.utils][INFO] -
dataset:
  dataset: ksponspeech
  dataset_path: C:\Users\lab1080\Desktop\openspeech\KsponSpeech
  test_dataset_path: C:\Users\lab1080\Desktop\openspeech\KsponSpeech_eval
  manifest_file_path: C:\Users\lab1080\Desktop\openspeech\KSPONSPEECH_AUTO_MANIFEST
  test_manifest_dir: C:\Users\lab1080\Desktop\openspeech\KsponSpeech_scripts
  preprocess_mode: phonetic
criterion:
  criterion_name: cross_entropy
  reduction: mean
lr_scheduler:
  lr: 0.0001
  scheduler_name: warmup_reduce_lr_on_plateau
  lr_patience: 1
  lr_factor: 0.3
  peak_lr: 0.0001
  init_lr: 1.0e-10
  warmup_steps: 4000
model:
  model_name: listen_attend_spell
  num_encoder_layers: 3
  num_decoder_layers: 2
  hidden_state_dim: 512
  encoder_dropout_p: 0.3
  encoder_bidirectional: true
  rnn_type: lstm
  joint_ctc_attention: false
  max_length: 128
  num_attention_heads: 1
  decoder_dropout_p: 0.2
  decoder_attn_mechanism: dot
  teacher_forcing_ratio: 1.0
  optimizer: adam
trainer:
  seed: 1
  accelerator: dp
  accumulate_grad_batches: 1
  num_workers: 4
  batch_size: 32
  check_val_every_n_epoch: 1
  gradient_clip_val: 5.0
  logger: wandb
  max_epochs: 20
  save_checkpoint_n_steps: 10000
  auto_scale_batch_size: binsearch
  sampler: else
  name: gpu
  device: gpu
  use_cuda: true
  auto_select_gpus: true
tokenizer:
  sos_token: <sos>
  eos_token: <eos>
  pad_token: <pad>
  blank_token: <blank>
  encoding: utf-8
  unit: kspon_character
  vocab_path: ../../../aihub_labels.csv
```

```
[2024-01-25 13:34:14,606][openspeech.utils][INFO] - Operating System : Windows 10
[2024-01-25 13:34:14,606][openspeech.utils][INFO] - Processor : Intel64 Family 6 Model 85 Stepping 4, GenuineIntel
[2024-01-25 13:34:14,607][openspeech.utils][INFO] - CUDA is available : False
[2024-01-25 13:34:14,607][openspeech.utils][INFO] - PyTorch version : 1.13.1+cpu
wandb: Currently logged in as: apg0001 (dguyanglab). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.16.2 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.15.12
wandb: Run data is saved locally in C:\Users\lab1080\Desktop\openspeech\outputs\2024-01-25\13-34-14\wandb\run-20240125_133417-5gndlwut
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run listen_attend_spell-ksponspeech
wandb: View project at https://wandb.ai/dguyanglab/listen_attend_spell-ksponspeech
wandb: View run at https://wandb.ai/dguyanglab/listen_attend_spell-ksponspeech/runs/5gndlwut
Traceback (most recent call last):
  File "./openspeech_cli/hydra_lm_train.py", line 45, in hydra_main
    data_module.setup(tokenizer=tokenizer)
TypeError: setup() got an unexpected keyword argument 'tokenizer'
```
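
The final TypeError looks like a signature mismatch rather than a configuration problem: the CLI script passes a tokenizer keyword to the data module's setup(), while the data module that actually gets imported apparently exposes the standard pytorch-lightning signature, setup(self, stage=None). A minimal sketch of the mismatch (class and variable names are illustrative, not openspeech's real ones):

```python
# Minimal reproduction of the mismatch; names are illustrative.
class DataModule:
    # The standard pytorch-lightning LightningDataModule signature
    # accepts only `stage`, not `tokenizer`.
    def setup(self, stage=None):
        pass

dm = DataModule()
dm.setup(tokenizer=None)  # mirrors line 45 of hydra_lm_train.py
# -> TypeError: setup() got an unexpected keyword argument 'tokenizer'
```

If that reading is right, the checked-out hydra_lm_train.py and the installed openspeech package are likely from different versions; reinstalling the package from the same checkout (for example with pip install -e .) so that the script and library match would be the first thing to try.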