openspeech-team / openspeech

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
https://openspeech-team.github.io/openspeech/
MIT License
670 stars 112 forks

Sorry for the beginner question #202

Closed youngchannelforyou closed 1 year ago

youngchannelforyou commented 1 year ago

❓ Questions & Help

I get an error saying ../../../aihub_labels.csv cannot be found. I think I saw somewhere that it is generated during preprocessing, but the README doesn't cover preprocessing, and the file doesn't seem to exist anywhere in the directories either. Please help!

Below are the command I ran and the output log.

python3 ./openspeech_cli/hydra_train.py \
    dataset=ksponspeech \
    dataset.dataset_path=../../../../openspeech/KsponSpeech \
    dataset.manifest_file_path=../../../../openspeech/KsponSpeech_scripts \
    dataset.test_dataset_path=../../../../openspeech/KsponSpeech \
    dataset.test_manifest_dir=../../../../openspeech/KsponSpeech_scripts \
    tokenizer=kspon_character \
    model=listen_attend_spell \
    audio=melspectrogram \
    lr_scheduler=warmup_reduce_lr_on_plateau \
    trainer=gpu \
    criterion=cross_entropy

/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.16) or chardet (5.1.0) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
2023-07-20 20:08:55.414514: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-07-20 20:08:55.434672: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-20 20:08:55.687565: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
./openspeech_cli/hydra_train.py:46: UserWarning: The version_base parameter is not specified. Please specify a compatability version level, or None. Will assume defaults for version 1.1
  @hydra.main(
/home/net/.local/lib/python3.8/site-packages/hydra/core/default_element.py:124: UserWarning: In 'train': Usage of deprecated keyword in package header '# @package group'. See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/changes_to_package_header for more information
  deprecation_warning(
/home/net/.local/lib/python3.8/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default. See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
audio:
  name: melspectrogram
  sample_rate: 16000
  frame_length: 20.0
  frame_shift: 10.0
  del_silence: false
  num_mels: 80
  apply_spec_augment: true
  apply_noise_augment: false
  apply_time_stretch_augment: false
  apply_joining_augment: false
augment:
  apply_spec_augment: false
  apply_noise_augment: false
  apply_joining_augment: false
  apply_time_stretch_augment: false
  freq_mask_para: 27
  freq_mask_num: 2
  time_mask_num: 4
  noise_dataset_dir: None
  noise_level: 0.7
  time_stretch_min_rate: 0.7
  time_stretch_max_rate: 1.4
dataset:
  dataset: ksponspeech
  dataset_path: ../../../../openspeech/KsponSpeech
  test_dataset_path: ../../../../openspeech/KsponSpeech
  manifest_file_path: ../../../../openspeech/KsponSpeech_scripts
  test_manifest_dir: ../../../../openspeech/KsponSpeech_scripts
  preprocess_mode: phonetic
criterion:
  criterion_name: cross_entropy
  reduction: mean
lr_scheduler:
  lr: 0.0001
  scheduler_name: warmup_reduce_lr_on_plateau
  lr_patience: 1
  lr_factor: 0.3
  peak_lr: 0.0001
  init_lr: 1.0e-10
  warmup_steps: 4000
model:
  model_name: listen_attend_spell
  num_encoder_layers: 3
  num_decoder_layers: 2
  hidden_state_dim: 512
  encoder_dropout_p: 0.3
  encoder_bidirectional: true
  rnn_type: lstm
  joint_ctc_attention: false
  max_length: 128
  num_attention_heads: 1
  decoder_dropout_p: 0.2
  decoder_attn_mechanism: dot
  teacher_forcing_ratio: 1.0
  optimizer: adam
trainer:
  seed: 1
  accelerator: dp
  accumulate_grad_batches: 1
  num_workers: 4
  batch_size: 32
  check_val_every_n_epoch: 1
  gradient_clip_val: 5.0
  logger: wandb
  max_epochs: 20
  save_checkpoint_n_steps: 10000
  auto_scale_batch_size: binsearch
  sampler: else
  name: gpu
  device: gpu
  use_cuda: true
  auto_select_gpus: true
tokenizer:
  sos_token: <s>
  eos_token: </s>
  pad_token: <pad>
  blank_token: <blank>
  encoding: utf-8
  unit: kspon_character
  vocab_path: ../../../aihub_labels.csv

Global seed set to 1

[2023-07-20 20:08:57,487][openspeech.utils][INFO] - Operating System : Linux 5.15.0-76-generic
[2023-07-20 20:08:57,487][openspeech.utils][INFO] - Processor : x86_64
[2023-07-20 20:08:57,496][openspeech.utils][INFO] - device : NVIDIA GeForce RTX 3090
[2023-07-20 20:08:57,497][openspeech.utils][INFO] - CUDA is available : True
[2023-07-20 20:08:57,497][openspeech.utils][INFO] - CUDA version : 11.8
[2023-07-20 20:08:57,497][openspeech.utils][INFO] - PyTorch version : 2.0.1+cu118
Error executing job with overrides: ['dataset=ksponspeech', 'dataset.dataset_path=../../../../openspeech/KsponSpeech', 'dataset.manifest_file_path=../../../../openspeech/KsponSpeech_scripts', 'dataset.test_dataset_path=../../../../openspeech/KsponSpeech', 'dataset.test_manifest_dir=../../../../openspeech/KsponSpeech_scripts', 'tokenizer=kspon_character', 'model=listen_attend_spell', 'audio=melspectrogram', 'lr_scheduler=warmup_reduce_lr_on_plateau', 'trainer=gpu', 'criterion=cross_entropy']
Traceback (most recent call last):
  File "/home/net/바탕화면/openspeech/openspeech/tokenizers/ksponspeech/character.py", line 123, in load_vocab
    with open(vocab_path, "r", encoding=encoding) as f:
FileNotFoundError: [Errno 2] No such file or directory: '../../../aihub_labels.csv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./openspeech_cli/hydra_train.py", line 57, in hydra_main
    tokenizer = TOKENIZER_REGISTRY[configs.tokenizer.unit](configs)
  File "/home/net/바탕화면/openspeech/openspeech/tokenizers/ksponspeech/character.py", line 50, in __init__
    self.vocab_dict, self.id_dict = self.load_vocab(
  File "/home/net/바탕화면/openspeech/openspeech/tokenizers/ksponspeech/character.py", line 133, in load_vocab
    raise IOError("Character label file (csv format) doesnt exist : {0}".format(vocab_path))
OSError: Character label file (csv format) doesnt exist : ../../../aihub_labels.csv

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
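For reference, Hydra prints the complete stack trace when the same command is run with that variable set (<same overrides as above> stands in for the full list of overrides from the original command):

HYDRA_FULL_ERROR=1 python3 ./openspeech_cli/hydra_train.py <same overrides as above>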

upskyy commented 1 year ago

@youngchannelforyou Could you also pass tokenizer.vocab_path as an argument when you run it?
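For example, your original command with the extra override appended (/path/to/aihub_labels.csv is just a placeholder here; point it at wherever the label file actually lives):

python3 ./openspeech_cli/hydra_train.py \
    dataset=ksponspeech \
    dataset.dataset_path=../../../../openspeech/KsponSpeech \
    dataset.manifest_file_path=../../../../openspeech/KsponSpeech_scripts \
    dataset.test_dataset_path=../../../../openspeech/KsponSpeech \
    dataset.test_manifest_dir=../../../../openspeech/KsponSpeech_scripts \
    tokenizer=kspon_character \
    tokenizer.vocab_path=/path/to/aihub_labels.csv \
    model=listen_attend_spell \
    audio=melspectrogram \
    lr_scheduler=warmup_reduce_lr_on_plateau \
    trainer=gpu \
    criterion=cross_entropy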

youngchannelforyou commented 1 year ago

I'm currently trying this with the KsponSpeech data. Is the tokenizer.vocab_path file already included in the package, or do I need to generate the vocab file in an additional step?

upskyy commented 1 year ago

The label file will be generated during preprocessing.
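Roughly speaking, the preprocessing builds a character-level vocab csv (id/char/freq columns) from the transcripts and writes it out as aihub_labels.csv. Here is a minimal sketch of that idea, using the special tokens from the tokenizer config above; the function name and details are illustrative, not openspeech's exact preprocessing code:

from collections import Counter

import pandas as pd


def generate_character_labels(transcripts, label_path="aihub_labels.csv"):
    # Count every character that appears in the transcripts.
    counter = Counter()
    for transcript in transcripts:
        counter.update(transcript)

    # Special tokens first (matching the tokenizer config above),
    # then characters in descending order of frequency.
    labels = {"id": [], "char": [], "freq": []}
    for token in ["<pad>", "<s>", "</s>", "<blank>"]:
        labels["id"].append(len(labels["id"]))
        labels["char"].append(token)
        labels["freq"].append(0)
    for char, freq in counter.most_common():
        labels["id"].append(len(labels["id"]))
        labels["char"].append(char)
        labels["freq"].append(freq)

    pd.DataFrame(labels).to_csv(label_path, index=False, encoding="utf-8")


# Toy usage; the real transcripts come from the KsponSpeech manifest files.
generate_character_labels(["안녕하세요", "감사합니다"])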

youngchannelforyou commented 1 year ago

Thank you! The error is resolved!