Closed ken2190 closed 5 months ago
I am training a single-speaker model from scratch, following the "Instructions to run" in README.md, but I get this error when training starts:
CUDA_LAUNCH_BLOCKING=1 python pflow/train.py experiment=ljspeech
[2024-02-08 19:28:43,268][pflow.utils.utils][INFO] - Enforcing tags! <cfg.extras.enforce_tags=True>
[2024-02-08 19:28:43,279][pflow.utils.utils][INFO] - Printing config tree with Rich! <cfg.extras.print_config=True>
CONFIG
├── data
│   └── _target_: pflow.data.text_mel_datamodule.TextMelDataModule
│       name: ljspeech
│       train_filelist_path: /home/ubuntu/LJSpeech/LJSpeech-1.1/filelists/ljs_audio_text_train_filelist.txt
│       valid_filelist_path: /home/ubuntu/LJSpeech/LJSpeech-1.1/filelists/ljs_audio_text_val_filelist.txt
│       batch_size: 32
│       num_workers: 4
│       pin_memory: true
│       cleaners:
│       - english_cleaners2
│       add_blank: true
│       n_spks: 1
│       n_fft: 1024
│       n_feats: 80
│       sample_rate: 22050
│       hop_length: 256
│       win_length: 1024
│       f_min: 0
│       f_max: 8000
│       data_statistics:
│         mel_mean: -5.523591995239258
│         mel_std: 2.0658047199249268
│       seed: 1234
│       min_sample_size: 4
│
├── model
│   └── _target_: pflow.models.pflow_tts.pflowTTS
│       n_vocab: 178
│       n_spks: 1
│       spk_emb_dim: 64
│       n_feats: 80
│       data_statistics:
│         mel_mean: -5.523591995239258
│         mel_std: 2.0658047199249268
│       out_size: null
│       prompt_size: 264
│       dur_p_use_log: false
│       encoder:
│         encoder_type: RoPE Encoder
│         encoder_params:
│           n_feats: 80
│           n_channels: 192
│           filter_channels: 768
│           filter_channels_dp: 256
│           n_heads: 2
│           n_layers: 6
│           kernel_size: 3
│           p_dropout: 0.1
│           spk_emb_dim: 64
│           n_spks: 1
│           prenet: true
│         duration_predictor_params:
│           filter_channels_dp: 256
│           kernel_size: 3
│           p_dropout: 0.1
│       decoder:
│         channels:
│         - 256
│         - 256
│         dropout: 0.05
│         attention_head_dim: 64
│         n_blocks: 1
│         num_mid_blocks: 2
│         num_heads: 2
│         act_fn: snakebeta
│       cfm:
│         name: CFM
│         solver: euler
│         sigma_min: 0.0001
│       optimizer:
│         _target_: torch.optim.Adam
│         _partial_: true
│         lr: 0.0001
│         weight_decay: 0.0
│
├── callbacks
│   └── model_checkpoint:
│         _target_: lightning.pytorch.callbacks.ModelCheckpoint
│         dirpath: /mnt/e/pflowtts_pytorch/logs/train/ljspeech/runs/2024-02-08_19-28-43/checkpoints
│         filename: checkpoint_{epoch:03d}
│         monitor: epoch
│         verbose: false
│         save_last: true
│         save_top_k: 10
│         mode: max
│         auto_insert_metric_name: true
│         save_weights_only: false
│         every_n_train_steps: null
│         train_time_interval: null
│         every_n_epochs: 100
│         save_on_train_epoch_end: null
│       model_summary:
│         _target_: lightning.pytorch.callbacks.RichModelSummary
│         max_depth: 3
│       rich_progress_bar:
│         _target_: lightning.pytorch.callbacks.RichProgressBar
│
├── logger
│   └── tensorboard:
│         _target_: lightning.pytorch.loggers.tensorboard.TensorBoardLogger
│         save_dir: /mnt/e/pflowtts_pytorch/logs/train/ljspeech/runs/2024-02-08_19-28-43/tensorboard/
│         name: null
│         log_graph: false
│         default_hp_metric: true
│         prefix: ''
│
├── trainer
│   └── _target_: lightning.pytorch.trainer.Trainer
│       default_root_dir: /mnt/e/pflowtts_pytorch/logs/train/ljspeech/runs/2024-02-08_19-28-43
│       max_epochs: -1
│       accelerator: gpu
│       devices:
│       - 0
│       precision: 16-mixed
│       check_val_every_n_epoch: 1
│       deterministic: false
│       gradient_clip_val: 5.0
│
├── paths
│   └── root_dir: /mnt/e/pflowtts_pytorch
│       data_dir: /mnt/e/pflowtts_pytorch/data/
│       log_dir: /mnt/e/pflowtts_pytorch/logs/
│       output_dir: /mnt/e/pflowtts_pytorch/logs/train/ljspeech/runs/2024-02-08_19-28-43
│       work_dir: /mnt/e/pflowtts_pytorch
│
├── extras
│   └── ignore_warnings: false
│       enforce_tags: true
│       print_config: true
│
├── task_name
│   └── train
├── run_name
│   └── ljspeech
├── tags
│   └── ['ljspeech']
├── train
│   └── True
├── test
│   └── True
├── ckpt_path
│   └── None
├── transfer_ckpt_path
│   └── None
└── seed
    └── 1234
Seed set to 1234
[2024-02-08 19:28:43,371][__main__][INFO] - Instantiating datamodule <pflow.data.text_mel_datamodule.TextMelDataModule>
[2024-02-08 19:28:44,798][__main__][INFO] - Instantiating model <pflow.models.pflow_tts.pflowTTS>
[2024-02-08 19:28:45,572][__main__][INFO] - Instantiating callbacks...
[2024-02-08 19:28:45,572][pflow.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.ModelCheckpoint>
[2024-02-08 19:28:45,578][pflow.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.RichModelSummary>
[2024-02-08 19:28:45,579][pflow.utils.instantiators][INFO] - Instantiating callback <lightning.pytorch.callbacks.RichProgressBar>
[2024-02-08 19:28:45,580][__main__][INFO] - Instantiating loggers...
[2024-02-08 19:28:45,580][pflow.utils.instantiators][INFO] - Instantiating logger <lightning.pytorch.loggers.tensorboard.TensorBoardLogger>
[2024-02-08 19:28:45,717][__main__][INFO] - Instantiating trainer <lightning.pytorch.trainer.Trainer>
Using 16bit Automatic Mixed Precision (AMP)
Trainer already configured with model summary callbacks: [<class 'lightning.pytorch.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
[2024-02-08 19:28:46,661][__main__][INFO] - Instantiating transfer learning...
[2024-02-08 19:28:46,661][__main__][INFO] - Logging hyperparameters!
[2024-02-08 19:28:46,854][__main__][INFO] - Starting training!
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃    ┃ Name                                        ┃ Type                  ┃ Params ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ 0  │ encoder                                     │ TextEncoder           │ 21.1 M │
│ 1  │ encoder.emb                                 │ Embedding             │ 34.2 K │
│ 2  │ encoder.speech_prompt_proj                  │ Conv1d                │ 15.6 K │
│ 3  │ encoder.prenet                              │ ConvReluNorm          │  591 K │
│ 4  │ encoder.prenet.conv_layers                  │ ModuleList            │  553 K │
│ 5  │ encoder.prenet.norm_layers                  │ ModuleList            │  1.2 K │
│ 6  │ encoder.prenet.relu_drop                    │ Sequential            │      0 │
│ 7  │ encoder.prenet.proj                         │ Conv1d                │ 37.1 K │
│ 8  │ encoder.speech_prompt_encoder               │ Encoder               │  6.2 M │
│ 9  │ encoder.speech_prompt_encoder.drop          │ Dropout               │      0 │
│ 10 │ encoder.speech_prompt_encoder.attn_layers   │ ModuleList            │  889 K │
│ 11 │ encoder.speech_prompt_encoder.norm_layers_1 │ ModuleList            │  2.3 K │
│ 12 │ encoder.speech_prompt_encoder.ffn_layers    │ ModuleList            │  5.3 M │
│ 13 │ encoder.speech_prompt_encoder.norm_layers_2 │ ModuleList            │  2.3 K │
│ 14 │ encoder.text_base_encoder                   │ Encoder               │  6.2 M │
│ 15 │ encoder.text_base_encoder.drop              │ Dropout               │      0 │
│ 16 │ encoder.text_base_encoder.attn_layers       │ ModuleList            │  889 K │
│ 17 │ encoder.text_base_encoder.norm_layers_1     │ ModuleList            │  2.3 K │
│ 18 │ encoder.text_base_encoder.ffn_layers        │ ModuleList            │  5.3 M │
│ 19 │ encoder.text_base_encoder.norm_layers_2     │ ModuleList            │  2.3 K │
│ 20 │ encoder.decoder                             │ Decoder               │  7.1 M │
│ 21 │ encoder.decoder.drop                        │ Dropout               │      0 │
│ 22 │ encoder.decoder.self_attn_layers            │ ModuleList            │  889 K │
│ 23 │ encoder.decoder.norm_layers_0               │ ModuleList            │  2.3 K │
│ 24 │ encoder.decoder.encdec_attn_layers          │ ModuleList            │  889 K │
│ 25 │ encoder.decoder.norm_layers_1               │ ModuleList            │  2.3 K │
│ 26 │ encoder.decoder.ffn_layers                  │ ModuleList            │  5.3 M │
│ 27 │ encoder.decoder.norm_layers_2               │ ModuleList            │  2.3 K │
│ 28 │ encoder.transformerblock                    │ BasicTransformerBlock │  592 K │
│ 29 │ encoder.transformerblock.norm1              │ LayerNorm             │    384 │
│ 30 │ encoder.transformerblock.attn1              │ Attention             │  147 K │
│ 31 │ encoder.transformerblock.norm2              │ LayerNorm             │    384 │
│ 32 │ encoder.transformerblock.attn2              │ Attention             │  147 K │
│ 33 │ encoder.transformerblock.norm3              │ LayerNorm             │    384 │
│ 34 │ encoder.transformerblock.ff                 │ FeedForward           │  295 K │
│ 35 │ encoder.proj_m                              │ Conv1d                │ 15.4 K │
│ 36 │ encoder.proj_w                              │ DurationPredictor     │  345 K │
│ 37 │ encoder.proj_w.drop                         │ Dropout               │      0 │
│ 38 │ encoder.proj_w.conv_1                       │ Conv1d                │  147 K │
│ 39 │ encoder.proj_w.norm_1                       │ LayerNorm             │    512 │
│ 40 │ encoder.proj_w.conv_2                       │ Conv1d                │  196 K │
│ 41 │ encoder.proj_w.norm_2                       │ LayerNorm             │    512 │
│ 42 │ encoder.proj_w.proj                         │ Conv1d                │    257 │
│ 43 │ decoder                                     │ CFM                   │ 11.0 M │
│ 44 │ decoder.estimator                           │ Decoder               │ 11.0 M │
│ 45 │ decoder.estimator.time_embeddings           │ SinusoidalPosEmb      │      0 │
│ 46 │ decoder.estimator.time_mlp                  │ TimestepEmbedding     │  1.2 M │
│ 47 │ decoder.estimator.down_blocks               │ ModuleList            │  3.1 M │
│ 48 │ decoder.estimator.mid_blocks                │ ModuleList            │  2.8 M │
│ 49 │ decoder.estimator.up_blocks                 │ ModuleList            │  3.7 M │
│ 50 │ decoder.estimator.final_block               │ Block1D               │  197 K │
│ 51 │ decoder.estimator.final_proj                │ Conv1d                │ 20.6 K │
│ 52 │ proj_prompt                                 │ Conv1d                │ 15.4 K │
└────┴─────────────────────────────────────────────┴───────────────────────┴────────┘
Trainable params: 32.1 M
Non-trainable params: 0
Total params: 32.1 M
Total estimated model params size (MB): 128
/opt/pytorch/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1289: indexSelectLargeIndex: block: [270,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` 
failed.
/opt/pytorch/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1289: indexSelectLargeIndex: block: [270,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/pytorch/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1289: indexSelectLargeIndex: block: [270,0,0], thread: [34,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
...
/opt/pytorch/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1289: indexSelectLargeIndex: block: [261,0,0], thread: [41,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/pytorch/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1289: indexSelectLargeIndex: block: [261,0,0], thread: [42,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/pytorch/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1289: indexSelectLargeIndex: block: [261,0,0], thread: [61,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/pytorch/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1289: indexSelectLargeIndex: block: [261,0,0], thread: [62,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/pytorch/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1289: indexSelectLargeIndex: block: [261,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
[2024-02-08 19:28:50,087][pflow.utils.utils][ERROR] -
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 579, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 986, in _run
    results = self._run_stage()
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 1030, in _run_stage
    self._run_sanity_check()
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 1059, in _run_sanity_check
    val_loop.run()
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/utilities.py", line 182, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/evaluation_loop.py", line 135, in run
    self._evaluation_step(batch, batch_idx, dataloader_idx, dataloader_iter)
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/evaluation_loop.py", line 396, in _evaluation_step
    output = call._call_strategy_hook(trainer, hook_name, *step_args)
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/call.py", line 309, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/strategies/strategy.py", line 412, in validation_step
    return self.lightning_module.validation_step(*args, **kwargs)
  File "/mnt/e/pflowtts_pytorch/pflow/models/baselightningmodule.py", line 150, in validation_step
    loss_dict, attn_dict = self.get_losses(batch)
  File "/mnt/e/pflowtts_pytorch/pflow/models/baselightningmodule.py", line 73, in get_losses
    dur_loss, prior_loss, diff_loss, attn = self(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/e/pflowtts_pytorch/pflow/models/pflow_tts.py", line 125, in forward
    mu_x, logw, x_mask = self.encoder(x, x_lengths, prompt_slice)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/e/pflowtts_pytorch/pflow/models/components/speech_prompt_encoder.py", line 596, in forward
    x_emb = self.emb(x_input) * math.sqrt(self.n_channels)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2238, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/e/pflowtts_pytorch/pflow/utils/utils.py", line 76, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "/mnt/e/pflowtts_pytorch/pflow/train.py", line 97, in train
    trainer.fit(model=model, datamodule=datamodule)
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 543, in fit
    call._call_and_handle_interrupt(
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/call.py", line 68, in _call_and_handle_interrupt
    trainer._teardown()
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 1009, in _teardown
    self.strategy.teardown()
  File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/strategies/strategy.py", line 537, in teardown
    self.lightning_module.cpu()
  File "/usr/local/lib/python3.10/dist-packages/lightning/fabric/utilities/device_dtype_mixin.py", line 82, in cpu
    return super().cpu()
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 959, in cpu
    return self._apply(lambda t: t.cpu())
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 825, in _apply
    param_applied = fn(param)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 959, in <lambda>
    return self._apply(lambda t: t.cpu())
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[[36m2024-02-08 19:28:50,102[0m][[34mpflow.utils.utils[0m][[32mINFO[0m] - Output dir: /mnt/e/pflowtts_pytorch/logs/train/ljspeech/runs/2024-02-08_19-28-43[0m Error executing job with overrides: ['experiment=ljspeech'] Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/call.py", line 44, in _call_and_handle_interrupt return trainer_fn(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 579, in _fit_impl self._run(model, ckpt_path=ckpt_path) File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 986, in _run results = self._run_stage() File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 1030, in _run_stage self._run_sanity_check() File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 1059, in _run_sanity_check val_loop.run() File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/utilities.py", line 182, in _decorator return loop_run(self, *args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/evaluation_loop.py", line 135, in run self._evaluation_step(batch, batch_idx, dataloader_idx, dataloader_iter) File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/loops/evaluation_loop.py", line 396, in _evaluation_step output = call._call_strategy_hook(trainer, hook_name, *step_args) File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/call.py", line 309, in _call_strategy_hook output = fn(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/strategies/strategy.py", line 412, in validation_step return self.lightning_module.validation_step(*args, **kwargs) File "/mnt/e/pflowtts_pytorch/pflow/models/baselightningmodule.py", line 150, in validation_step loss_dict, attn_dict = self.get_losses(batch) File 
"/mnt/e/pflowtts_pytorch/pflow/models/baselightningmodule.py", line 73, in get_losses dur_loss, prior_loss, diff_loss, attn = self( File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl return forward_call(*args, **kwargs) File "/mnt/e/pflowtts_pytorch/pflow/models/pflow_tts.py", line 125, in forward mu_x, logw, x_mask = self.encoder(x, x_lengths, prompt_slice) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl return forward_call(*args, **kwargs) File "/mnt/e/pflowtts_pytorch/pflow/models/components/speech_prompt_encoder.py", line 596, in forward x_emb = self.emb(x_input) * math.sqrt(self.n_channels) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl return forward_call(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 162, in forward return F.embedding( File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2238, in embedding return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse) RuntimeError: CUDA error: device-side assert triggered Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. 
During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/mnt/e/pflowtts_pytorch/pflow/train.py", line 130, in main metric_dict, _ = train(cfg) File "/mnt/e/pflowtts_pytorch/pflow/utils/utils.py", line 86, in wrap raise ex File "/mnt/e/pflowtts_pytorch/pflow/utils/utils.py", line 76, in wrap metric_dict, object_dict = task_func(cfg=cfg) File "/mnt/e/pflowtts_pytorch/pflow/train.py", line 97, in train trainer.fit(model=model, datamodule=datamodule) File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 543, in fit call._call_and_handle_interrupt( File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/call.py", line 68, in _call_and_handle_interrupt trainer._teardown() File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/trainer/trainer.py", line 1009, in _teardown self.strategy.teardown() File "/usr/local/lib/python3.10/dist-packages/lightning/pytorch/strategies/strategy.py", line 537, in teardown self.lightning_module.cpu() File "/usr/local/lib/python3.10/dist-packages/lightning/fabric/utilities/device_dtype_mixin.py", line 82, in cpu return super().cpu() File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 959, in cpu return self._apply(lambda t: t.cpu()) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 802, in _apply module._apply(fn) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 802, in _apply module._apply(fn) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 825, in _apply param_applied = fn(param) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 959, in <lambda> return self._apply(lambda t: t.cpu()) RuntimeError: CUDA error: device-side assert triggered Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace. 
terminate called after throwing an instance of 'c10::Error' what(): CUDA error: device-side assert triggered Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. Exception raised from c10_cuda_check_implementation at /opt/pytorch/pytorch/c10/cuda/CUDAException.cpp:44 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xae (0x7f0f2d1b12ce in /usr/local/lib/python3.10/dist-packages/torch/lib/libc10.so) frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xf3 (0x7f0f2d16798b in /usr/local/lib/python3.10/dist-packages/torch/lib/libc10.so) frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x3f2 (0x7f0f36fd5e72 in /usr/local/lib/python3.10/dist-packages/torch/lib/libc10_cuda.so) frame #3: <unknown function> + 0xd56951 (0x7f0ea92f8951 in /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_cuda.so) frame #4: <unknown function> + 0xd58c00 (0x7f0ea92fac00 in /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_cuda.so) frame #5: <unknown function> + 0x48333a (0x7f0eef3ed33a in /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_python.so) frame #6: c10::TensorImpl::~TensorImpl() + 0xd (0x7f0f2d18cedd in /usr/local/lib/python3.10/dist-packages/torch/lib/libc10.so) frame #7: <unknown function> + 0x742d68 (0x7f0eef6acd68 in /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_python.so) frame #8: THPVariable_subclass_dealloc(_object*) + 0x2e6 (0x7f0eef6ad0c6 in /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_python.so) <omitting python frames> frame #34: <unknown function> + 0x29d90 (0x7f0f39b81d90 in /lib/x86_64-linux-gnu/libc.so.6) frame #35: __libc_start_main + 0x80 (0x7f0f39b81e40 in /lib/x86_64-linux-gnu/libc.so.6)
Additional info: GPU V100S 32GB
Most probably some samples in your dataset are shorter than 270 mel frames? Try reducing the prompt size for training and let me know how it goes.
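One way to test this hypothesis is to count how many clips produce fewer mel frames than the prompt size. The sketch below is only an illustration: it assumes the frame count is roughly `num_samples // hop_length + 1` (the exact value depends on the STFT padding in use), takes `hop_length: 256` and `prompt_size: 264` from the config printed above, and uses hypothetical helper names that are not part of the repository.

```python
def mel_frames(num_samples, hop_length=256):
    # Approximate number of mel frames for a clip (assumes a centered STFT).
    return num_samples // hop_length + 1


def short_clips(sample_counts, prompt_size=264, hop_length=256):
    # Return the indices of clips that have fewer frames than the prompt size.
    return [
        i
        for i, n in enumerate(sample_counts)
        if mel_frames(n, hop_length) < prompt_size
    ]


if __name__ == "__main__":
    # Clips of 1 s, 3 s and 5 s at a 22050 Hz sample rate.
    print(short_clips([22050, 66150, 110250]))  # prints [0, 1]
```

Under these assumptions a 3-second clip at 22050 Hz gives 259 frames, so any clip around that length or shorter would be flagged with a prompt size of 264.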
I set the prompt size to 259, as in the experiments from the paper, and the error is gone. Thank you.
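For anyone who hits the same `srcIndex < srcSelectDimSize` assertion: the traceback ends in `torch.embedding`, and that assertion fires when a lookup index falls outside the embedding table. Separately from the prompt-size change, a CPU-side range check on the token ids can help narrow things down. This is a minimal sketch, assuming the ids are available as plain Python lists and using `n_vocab: 178` from the config above; `find_bad_ids` is a hypothetical helper, not code from the repository.

```python
def find_bad_ids(sequences, n_vocab=178):
    # Collect (sequence index, position, id) for every id outside [0, n_vocab).
    bad = []
    for seq_idx, seq in enumerate(sequences):
        for pos, token_id in enumerate(seq):
            if not 0 <= token_id < n_vocab:
                bad.append((seq_idx, pos, token_id))
    return bad


if __name__ == "__main__":
    # Hypothetical batches: the second one contains two out-of-range ids.
    print(find_bad_ids([[0, 5, 177], [3, 178, -1]]))
```

An empty result would suggest the token ids themselves are fine and the problem lies elsewhere in the input to the encoder.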