openspeech-team / openspeech

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
https://openspeech-team.github.io/openspeech/
MIT License

min_pct = 1.0 / len(dataloader) error ZeroDivisionError: float division by zero #217

Closed oh-young-data closed 7 months ago

oh-young-data commented 7 months ago

❓ Questions & Help

Hello,

I'm trying to test openspeech with KsponSpeech.

However, there is a problem in the dataloader part.

How can I solve this?

Details

dataset=ksponspeech dataset.dataset_path=T:/KS/KsponSpeech/ dataset.manifest_file_path=T:/KS/ksponspeech_manifest.txt dataset.test_dataset_path=T:/KS/KsponSpeech_eval/ dataset.test_manifest_dir=T:/KS/KsponSpeech_scripts/ tokenizer.vocab_path=T:/KS/aihub_labels.csv tokenizer=kspon_character model=listen_attend_spell audio=melspectrogram lr_scheduler=warmup_reduce_lr_on_plateau trainer=gpu criterion=cross_entropy

C:\Anaconda3\envs\openspeech\python.exe E:/PyProjectss/openspeech/openspeech_cli/hydra_train.py dataset=ksponspeech dataset.dataset_path=T:/KS/KsponSpeech/ dataset.manifest_file_path=T:/KS/ksponspeech_manifest.txt dataset.test_dataset_path=T:/KS/KsponSpeech_eval/ dataset.test_manifest_dir=T:/KS/KsponSpeech_scripts/ tokenizer.vocab_path=T:/KS/aihub_labels.csv tokenizer=kspon_character model=listen_attend_spell audio=melspectrogram lr_scheduler=warmup_reduce_lr_on_plateau trainer=gpu criterion=cross_entropy

E:\PyProjectss\openspeech\openspeech\utils.py:88: FutureWarning: Pass y=[ 1.0289368e-05 1.9799588e-06 2.5269967e-06 ... 4.2585348e-06 -7.8615221e-06 -1.8652894e-05] as keyword args. From version 0.10 passing these as positional arguments will result in an error
  DUMMY_FEATURES = librosa.feature.melspectrogram(DUMMY_SIGNALS, n_mels=80)
E:/PyProjectss/openspeech/openspeech_cli/hydra_train.py:37: UserWarning: The version_base parameter is not specified. Please specify a compatability version level, or None. Will assume defaults for version 1.1
  @hydra.main(config_path=os.path.join("..", "openspeech", "configs"), config_name="train")
C:\Anaconda3\envs\openspeech\lib\site-packages\hydra\core\default_element.py:128: UserWarning: In 'train': Usage of deprecated keyword in package header '# @package group'. See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/changes_to_package_header for more information
  See {url} for more information"""
C:\Anaconda3\envs\openspeech\lib\site-packages\hydra\_internal\hydra.py:127: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default. See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  configure_logging=with_log_configuration,

audio:
  name: melspectrogram
  sample_rate: 16000
  frame_length: 20.0
  frame_shift: 10.0
  del_silence: false
  num_mels: 80
  apply_spec_augment: true
  apply_noise_augment: false
  apply_time_stretch_augment: false
  apply_joining_augment: false
augment:
  apply_spec_augment: false
  apply_noise_augment: false
  apply_joining_augment: false
  apply_time_stretch_augment: false
  freq_mask_para: 27
  freq_mask_num: 2
  time_mask_num: 4
  noise_dataset_dir: None
  noise_level: 0.7
  time_stretch_min_rate: 0.7
  time_stretch_max_rate: 1.4
dataset:
  dataset: ksponspeech
  dataset_path: T:/KS/KsponSpeech/
  test_dataset_path: T:/KS/KsponSpeech_eval/
  manifest_file_path: T:/KS/ksponspeech_manifest.txt
  test_manifest_dir: T:/KS/KsponSpeech_scripts/
  preprocess_mode: phonetic
criterion:
  criterion_name: cross_entropy
  reduction: mean
lr_scheduler:
  lr: 0.0001
  scheduler_name: warmup_reduce_lr_on_plateau
  lr_patience: 1
  lr_factor: 0.3
  peak_lr: 0.0001
  init_lr: 1.0e-10
  warmup_steps: 4000
model:
  model_name: listen_attend_spell
  num_encoder_layers: 3
  num_decoder_layers: 2
  hidden_state_dim: 512
  encoder_dropout_p: 0.3
  encoder_bidirectional: true
  rnn_type: lstm
  joint_ctc_attention: false
  max_length: 128
  num_attention_heads: 1
  decoder_dropout_p: 0.2
  decoder_attn_mechanism: dot
  teacher_forcing_ratio: 1.0
  optimizer: adam
trainer:
  seed: 1
  accelerator: dp
  accumulate_grad_batches: 1
  num_workers: 4
  batch_size: 32
  check_val_every_n_epoch: 1
  gradient_clip_val: 5.0
  logger: wandb
  max_epochs: 20
  save_checkpoint_n_steps: 10000
  auto_scale_batch_size: binsearch
  sampler: else
  name: gpu
  device: gpu
  use_cuda: true
  auto_select_gpus: true
tokenizer:
  sos_token:
  eos_token:
  pad_token:
  blank_token:
  encoding: utf-8
  unit: kspon_character
  vocab_path: T:/KS/aihub_labels.csv

Global seed set to 1
[2024-01-26 15:57:17,559][openspeech.utils][INFO] - (configuration dump identical to the one above)

[2024-01-26 15:57:17,600][openspeech.utils][INFO] - Operating System : Windows 10
[2024-01-26 15:57:17,600][openspeech.utils][INFO] - Processor : Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
[2024-01-26 15:57:17,603][openspeech.utils][INFO] - device : NVIDIA GeForce RTX 4070
[2024-01-26 15:57:17,603][openspeech.utils][INFO] - CUDA is available : True
[2024-01-26 15:57:17,603][openspeech.utils][INFO] - CUDA version : 11.7
[2024-01-26 15:57:17,603][openspeech.utils][INFO] - PyTorch version : 1.13.0+cu117
wandb: WARNING `resume` will be ignored since W&B syncing is set to `offline`. Starting a new run with run id 137iy86e.
wandb: Tracking run with wandb version 0.13.4
wandb: W&B syncing is set to `offline` in this directory.
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py:288: LightningDeprecationWarning: Passing `Trainer(accelerator='dp')` has been deprecated in v1.5 and will be removed in v1.7. Use `Trainer(strategy='dp')` instead.
  f"Passing `Trainer(accelerator={accelerator!r})` has been deprecated"
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:377: LightningDeprecationWarning: The `Callback.on_batch_end` hook was deprecated in v1.6 and will be removed in v1.8. Please use `Callback.on_train_batch_end` instead.
  f"The `Callback.{hook}` hook was deprecated in v1.6 and"
Sanity Checking: 0it [00:00, ?it/s]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name      | Type                 | Params
---------------------------------------------
0 | criterion | CrossEntropyLoss     | 0
1 | encoder   | LSTMEncoder          | 15.0 M
2 | decoder   | LSTMAttentionDecoder | 21.2 M
---------------------------------------------

36.2 M    Trainable params
0         Non-trainable params
36.2 M    Total params
144.974   Total estimated model params size (MB)

C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\utilities\data.py:128: UserWarning: Total length of `AudioDataLoader` across ranks is zero. Please make sure this was your intention.
  f"Total length of {dataloader.__class__.__name__} across ranks is zero."
Error executing job with overrides: ['dataset=ksponspeech', 'dataset.dataset_path=T:/KS/KsponSpeech/', 'dataset.manifest_file_path=T:/KS/ksponspeech_manifest.txt', 'dataset.test_dataset_path=T:/KS/KsponSpeech_eval/', 'dataset.test_manifest_dir=T:/KS/KsponSpeech_scripts/', 'tokenizer.vocab_path=T:/KS/aihub_labels.csv', 'tokenizer=kspon_character', 'model=listen_attend_spell', 'audio=melspectrogram', 'lr_scheduler=warmup_reduce_lr_on_plateau', 'trainer=gpu', 'criterion=cross_entropy']
Traceback (most recent call last):
  File "E:/PyProjectss/openspeech/openspeech_cli/hydra_train.py", line 53, in hydra_main
    trainer.fit(model, data_module)
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 772, in fit
    self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 724, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 812, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1237, in _run
    results = self._run_stage()
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1324, in _run_stage
    return self._run_train()
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1346, in _run_train
    self._run_sanity_check()
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1407, in _run_sanity_check
    val_loop._reload_evaluation_dataloaders()
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 239, in _reload_evaluation_dataloaders
    self.trainer.reset_val_dataloader()
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1960, in reset_val_dataloader
    RunningStage.VALIDATING, model=pl_module
  File "C:\Anaconda3\envs\openspeech\lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py", line 424, in _reset_eval_dataloader
    min_pct = 1.0 / len(dataloader)
ZeroDivisionError: float division by zero

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync E:\PyProjectss\openspeech\openspeech_cli\outputs\2024-01-26\15-57-17\wandb\offline-run-20240126_155718-137iy86e
wandb: Find logs at: .\wandb\offline-run-20240126_155718-137iy86e\logs

Process finished with exit code 1
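The last two frames of the traceback, together with the earlier warning `Total length of AudioDataLoader across ranks is zero`, point at the actual cause: the validation dataloader contains zero batches, and Lightning's sanity check divides by its length. A minimal sketch with a plain PyTorch `DataLoader` (nothing openspeech-specific; `EmptyDataset` is a hypothetical stand-in for a dataset whose manifest yielded no examples) reproduces the same failure:

```python
from torch.utils.data import DataLoader, Dataset

class EmptyDataset(Dataset):
    """Hypothetical stand-in for a dataset whose manifest yielded no examples."""

    def __len__(self) -> int:
        return 0  # no validation examples were loaded

    def __getitem__(self, idx):
        raise IndexError(idx)

loader = DataLoader(EmptyDataset(), batch_size=32)
print(len(loader))           # 0 -> zero batches
min_pct = 1.0 / len(loader)  # ZeroDivisionError: float division by zero
```

So the ZeroDivisionError is a symptom, not the root problem: the dataset or manifest produced no validation examples.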

upskyy commented 7 months ago

@oh-young-data From a quick look, it doesn't seem like the dataset was loaded correctly, so I'd suggest debugging to check whether the dataset is actually being read.
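For example, a quick sanity check on the inputs (a sketch only; the paths are copied from the command above, and it assumes the manifest is a tab-separated file whose first field is an audio path relative to the dataset directory, which may differ from your actual layout):

```python
import os

# Paths copied from the failing command line.
manifest_path = "T:/KS/ksponspeech_manifest.txt"
dataset_path = "T:/KS/KsponSpeech/"

print("manifest exists:", os.path.isfile(manifest_path))
print("dataset dir exists:", os.path.isdir(dataset_path))

with open(manifest_path, encoding="utf-8") as f:
    entries = [line for line in f if line.strip()]
print("manifest entries:", len(entries))  # 0 here would explain the empty dataloader

# Spot-check that the first few referenced audio files actually exist
# (assumes the first tab-separated field is the audio path).
for line in entries[:5]:
    audio = line.split("\t")[0]
    print(audio, "->", os.path.exists(os.path.join(dataset_path, audio)))
```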

oh-young-data commented 7 months ago

Thank you for the answer.

I also thought the data path might be wrong and kept checking it, but I couldn't find the problem.

I'm not sure of the exact cause, but training proceeds after I recreate the project.

Perhaps something went wrong while I was editing the project.