RuntimeError: CUDA error: device-side assert triggered

VladC12 commented 4 years ago

I'm using "train-clean-100" from LibriTTS for training. I also changed the sampling rate to 24000.

` ❯ python train.py -c config.json -p train_config.output_directory=outdir
train_config.output_directory=outdir
output_directory=outdir
{'train_config': {'output_directory': 'outdir', 'epochs': 10000000, 'learning_rate': 0.0001, 'weight_decay': 1e-06, 'sigma': 1.0, 'iters_per_checkpoint': 5000, 'batch_size': 1, 'seed': 1234, 'checkpoint_path': '', 'ignore_layers': [], 'include_layers': ['speaker', 'encoder', 'embedding'], 'warmstart_checkpoint_path': '', 'with_tensorboard': True, 'fp16_run': False},
 'data_config': {'training_files': 'filelists/libritts_train_clean_100_audiopath_text_sid_shorterthan10s_atleast5min_train_filelist.txt', 'validation_files': 'filelists/libritts_train_clean_100_audiopath_text_sid_atleast5min_val_filelist.txt', 'text_cleaners': ['flowtron_cleaners'], 'p_arpabet': 0.5, 'cmudict_path': 'data/cmudict_dictionary', 'sampling_rate': 24000, 'filter_length': 1024, 'hop_length': 256, 'win_length': 1024, 'mel_fmin': 0.0, 'mel_fmax': 8000.0, 'max_wav_value': 32768.0},
 'dist_config': {'dist_backend': 'nccl', 'dist_url': 'tcp://localhost:54321'},
 'model_config': {'n_speakers': 1, 'n_speaker_dim': 128, 'n_text': 185, 'n_text_dim': 512, 'n_flows': 2, 'n_mel_channels': 80, 'n_attn_channels': 640, 'n_hidden': 1024, 'n_lstm_layers': 2, 'mel_encoder_n_hidden': 512, 'n_components': 0, 'mean_scale': 0.0, 'fixed_gaussian': True, 'dummy_speaker_embedding': False, 'use_gate_layer': True}}

got rank 0 and world size 1 ... Flowtron( (speaker_embedding): Embedding(1, 128) (embedding): Embedding(185, 512) (flows): ModuleList( (0): AR_Step( (conv): Conv1d(1024, 160, kernel_size=(1,), stride=(1,)) (lstm): LSTM(1664, 1024, num_layers=2) (attention_lstm): LSTM(80, 1024) (attention_layer): Attention( (softmax): Softmax(dim=2) (query): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=640, bias=False) ) (key): LinearNorm( (linear_layer): Linear(in_features=640, out_features=640, bias=False) ) (value): LinearNorm( (linear_layer): Linear(in_features=640, out_features=640, bias=False) ) (v): LinearNorm( (linear_layer): Linear(in_features=640, out_features=1, bias=False) ) ) (dense_layer): DenseLayer( (layers): ModuleList( (0): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=1024, bias=True) ) (1): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=1024, bias=True) ) ) ) ) (1): AR_Back_Step( (ar_step): AR_Step( (conv): Conv1d(1024, 160, kernel_size=(1,), stride=(1,)) (lstm): LSTM(1664, 1024, num_layers=2) (attention_lstm): LSTM(80, 1024) (attention_layer): Attention( (softmax): Softmax(dim=2) (query): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=640, bias=False) ) (key): LinearNorm( (linear_layer): Linear(in_features=640, out_features=640, bias=False) ) (value): LinearNorm( (linear_layer): Linear(in_features=640, out_features=640, bias=False) ) (v): LinearNorm( (linear_layer): Linear(in_features=640, out_features=1, bias=False) ) ) (dense_layer): DenseLayer( (layers): ModuleList( (0): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=1024, bias=True) ) (1): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=1024, bias=True) ) ) ) (gate_layer): LinearNorm( (linear_layer): Linear(in_features=1664, out_features=1, bias=True) ) ) ) ) (encoder): Encoder( (convolutions): ModuleList( (0): Sequential( (0): ConvNorm( (conv): Conv1d(512, 512, kernel_size=(5,), stride=(1,), 
padding=(2,)) ) (1): InstanceNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) ) (1): Sequential( (0): ConvNorm( (conv): Conv1d(512, 512, kernel_size=(5,), stride=(1,), padding=(2,)) ) (1): InstanceNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) ) (2): Sequential( (0): ConvNorm( (conv): Conv1d(512, 512, kernel_size=(5,), stride=(1,), padding=(2,)) ) (1): InstanceNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) ) ) (lstm): LSTM(512, 256, batch_first=True, bidirectional=True) ) )
Number of speakers : 123
output directory outdir
Epoch: 0
C:\AI_Research_Project\flowtron\data.py:40: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:141.)
  return torch.from_numpy(data).float(), sampling_rate
C:/cb/pytorch_1000000000000/work/aten/src/THC/THCTensorIndex.cu:218: block: [0,0,0], thread: [0,0,0] Assertion srcIndex < srcSelectDimSize failed.
[the same assertion line is printed for every thread from [1,0,0] through [127,0,0]; in the original paste these lines are interleaved with the traceback]
Traceback (most recent call last):
  File "train.py", line 300, in <module>
    train(n_gpus, rank, **train_config)
  File "train.py", line 225, in train
    mel, speaker_vecs, text, in_lens, out_lens)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\AI_Research_Project\flowtron\flowtron.py", line 577, in forward
    text = self.encoder(text, in_lens)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\AI_Research_Project\flowtron\flowtron.py", line 322, in forward
    x = F.dropout(F.relu(conv(x)), 0.5, self.training)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\nn\modules\container.py", line 117, in forward
    input = module(input)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\nn\modules\instancenorm.py", line 57, in forward
    self.training or not self.track_running_stats, self.momentum, self.eps)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\nn\functional.py", line 2038, in instance_norm
    use_input_stats, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA error: device-side assert triggered`

adrianastan commented 4 years ago

I would recommend sticking to the 22 kHz sampling frequency so that you can use the pretrained WaveGlow model at inference.

As for your error, the higher sampling rate may mean that some samples no longer fit into GPU memory, so you could first try removing some of the longer ones from the training list.
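Separately from memory, the log itself hints at another possible cause. The assertion `srcIndex < srcSelectDimSize` is what an index lookup (for example an embedding) typically reports when it receives an index outside its table, and the dump above shows `'n_speakers': 1` and `Embedding(1, 128)` next to `Number of speakers : 123`. A minimal sketch of a sanity check, assuming the filelist has the form `audiopath|text|speaker_id` with `|` as separator (a guess based on the filelist name); `count_speakers` and `check_embedding_size` are hypothetical helpers, not part of the repository:

```python
def count_speakers(filelist_lines, sep="|"):
    """Return the set of distinct speaker ids found in the last column."""
    speakers = set()
    for line in filelist_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        speakers.add(line.split(sep)[-1])
    return speakers


def check_embedding_size(filelist_lines, n_speakers):
    """Raise ValueError if the filelist holds more speakers than embedding rows."""
    found = count_speakers(filelist_lines)
    if len(found) > n_speakers:
        raise ValueError(
            f"{len(found)} speakers in filelist but n_speakers={n_speakers}"
        )
    return len(found)


# Toy filelist: the paths, text and speaker ids are made up for illustration.
lines = [
    "a.wav|hello|103",
    "b.wav|world|1034",
    "c.wav|again|103",
]
print(check_embedding_size(lines, n_speakers=2))
```

If the check fails on the real filelist, raising `n_speakers` in `model_config` to at least the number of speakers would be the first thing to try. Running once on CPU can also help, since out-of-range indices usually produce a readable Python exception there rather than a device-side assert.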

VladC12 commented 4 years ago

> I would recommend sticking to the 22 kHz sampling frequency so that you can use the pretrained WaveGlow model at inference.
>
> As for your error, the higher sampling rate may mean that some samples no longer fit into GPU memory, so you could first try removing some of the longer ones from the training list.

I get this error instead now:

`python train.py -c config.json -p train_config.output_directory=outdir
train_config.output_directory=outdir
output_directory=outdir
{'train_config': {'output_directory': 'outdir', 'epochs': 10000000, 'learning_rate': 0.0001, 'weight_decay': 1e-06, 'sigma': 1.0, 'iters_per_checkpoint': 5000, 'batch_size': 1, 'seed': 1234, 'checkpoint_path': '', 'ignore_layers': [], 'include_layers': ['speaker', 'encoder', 'embedding'], 'warmstart_checkpoint_path': '', 'with_tensorboard': True, 'fp16_run': False},
 'data_config': {'training_files': 'filelists/libritts_train_clean_100_audiopath_text_sid_shorterthan10s_atleast5min_train_filelist.txt', 'validation_files': 'filelists/libritts_train_clean_100_audiopath_text_sid_atleast5min_val_filelist.txt', 'text_cleaners': ['flowtron_cleaners'], 'p_arpabet': 0.5, 'cmudict_path': 'data/cmudict_dictionary', 'sampling_rate': 22000, 'filter_length': 1024, 'hop_length': 256, 'win_length': 1024, 'mel_fmin': 0.0, 'mel_fmax': 8000.0, 'max_wav_value': 32768.0},
 'dist_config': {'dist_backend': 'nccl', 'dist_url': 'tcp://localhost:54321'},
 'model_config': {'n_speakers': 1, 'n_speaker_dim': 128, 'n_text': 185, 'n_text_dim': 512, 'n_flows': 2, 'n_mel_channels': 80, 'n_attn_channels': 640, 'n_hidden': 1024, 'n_lstm_layers': 2, 'mel_encoder_n_hidden': 512, 'n_components': 0, 'mean_scale': 0.0, 'fixed_gaussian': True, 'dummy_speaker_embedding': False, 'use_gate_layer': True}}

got rank 0 and world size 1 ... Flowtron( (speaker_embedding): Embedding(1, 128) (embedding): Embedding(185, 512) (flows): ModuleList( (0): AR_Step( (conv): Conv1d(1024, 160, kernel_size=(1,), stride=(1,)) (lstm): LSTM(1664, 1024, num_layers=2) (attention_lstm): LSTM(80, 1024) (attention_layer): Attention( (softmax): Softmax(dim=2) (query): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=640, bias=False) ) (key): LinearNorm( (linear_layer): Linear(in_features=640, out_features=640, bias=False) ) (value): LinearNorm( (linear_layer): Linear(in_features=640, out_features=640, bias=False) ) (v): LinearNorm( (linear_layer): Linear(in_features=640, out_features=1, bias=False) ) ) (dense_layer): DenseLayer( (layers): ModuleList( (0): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=1024, bias=True) ) (1): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=1024, bias=True) ) ) ) ) (1): AR_Back_Step( (ar_step): AR_Step( (conv): Conv1d(1024, 160, kernel_size=(1,), stride=(1,)) (lstm): LSTM(1664, 1024, num_layers=2) (attention_lstm): LSTM(80, 1024) (attention_layer): Attention( (softmax): Softmax(dim=2) (query): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=640, bias=False) ) (key): LinearNorm( (linear_layer): Linear(in_features=640, out_features=640, bias=False) ) (value): LinearNorm( (linear_layer): Linear(in_features=640, out_features=640, bias=False) ) (v): LinearNorm( (linear_layer): Linear(in_features=640, out_features=1, bias=False) ) ) (dense_layer): DenseLayer( (layers): ModuleList( (0): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=1024, bias=True) ) (1): LinearNorm( (linear_layer): Linear(in_features=1024, out_features=1024, bias=True) ) ) ) (gate_layer): LinearNorm( (linear_layer): Linear(in_features=1664, out_features=1, bias=True) ) ) ) ) (encoder): Encoder( (convolutions): ModuleList( (0): Sequential( (0): ConvNorm( (conv): Conv1d(512, 512, kernel_size=(5,), stride=(1,), 
padding=(2,)) ) (1): InstanceNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) ) (1): Sequential( (0): ConvNorm( (conv): Conv1d(512, 512, kernel_size=(5,), stride=(1,), padding=(2,)) ) (1): InstanceNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) ) (2): Sequential( (0): ConvNorm( (conv): Conv1d(512, 512, kernel_size=(5,), stride=(1,), padding=(2,)) ) (1): InstanceNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False) ) ) (lstm): LSTM(512, 256, batch_first=True, bidirectional=True) ) )
Number of speakers : 123
output directory outdir
Epoch: 0
C:\AI_Research_Project\flowtron\data.py:40: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:141.)
  return torch.from_numpy(data).float(), sampling_rate
Traceback (most recent call last):
  File "train.py", line 300, in <module>
    train(n_gpus, rank, **train_config)
  File "train.py", line 217, in train
    for batch in train_loader:
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 363, in __next__
    data = self._next_data()
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1014, in _process_data
    data.reraise()
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\_utils.py", line 395, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\utils\data\_utils\worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\vladc\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\AI_Research_Project\flowtron\data.py", line 100, in __getitem__
    sampling_rate, self.sampling_rate))
ValueError: 24000 SR doesn't match target 22000 SR`

I should mention that memory should not be an issue: I have a 1060 with 6 GB.

adrianastan commented 4 years ago

It's not enough to change the sampling frequency in the config file; you also need to resample the LibriTTS data itself.
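Before resampling, it can help to confirm which files actually disagree with the configured rate, since that mismatch is what triggers the `ValueError` above. A minimal sketch using only the standard-library `wave` module, assuming the audio is plain PCM WAV; the helper names and the 22050 target are illustrative assumptions, so use whatever rate your config and vocoder expect:

```python
import os
import tempfile
import wave


def wav_sample_rate(path):
    """Read the sample rate from a PCM WAV header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def find_mismatched(paths, target_sr):
    """Return (path, rate) pairs whose rate differs from target_sr."""
    mismatched = []
    for path in paths:
        rate = wav_sample_rate(path)
        if rate != target_sr:
            mismatched.append((path, rate))
    return mismatched


def write_silence(path, sample_rate, n_frames=100):
    """Write a short 16-bit mono WAV of silence (demo data only)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * n_frames)


# Demo on two throwaway files: one at the target rate, one at 24000 Hz.
with tempfile.TemporaryDirectory() as tmp:
    ok = os.path.join(tmp, "ok.wav")
    bad = os.path.join(tmp, "bad.wav")
    write_silence(ok, 22050)
    write_silence(bad, 24000)
    demo_result = [
        (os.path.basename(p), r) for p, r in find_mismatched([ok, bad], 22050)
    ]

print(demo_result)
```

The files it reports then need a real resampler (an external tool or an audio library); the sketch only detects the mismatch and does not convert anything.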

One flow with a batch size of 1 takes around 9 GB of VRAM, depending on the maximum length of the input files, hence the padding.
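To get a feel for how clip length and sampling rate drive the sequence length, here is a rough back-of-the-envelope sketch. It assumes the mel frame count is about samples divided by `hop_length` (256 in the config above) and that every clip in a batch is padded to the longest one; the exact count depends on STFT padding, and the function names are hypothetical:

```python
import math


def mel_frames(duration_s, sampling_rate, hop_length=256):
    """Approximate number of mel frames for a clip: ceil(samples / hop)."""
    return math.ceil(duration_s * sampling_rate / hop_length)


def padded_batch_frames(durations_s, sampling_rate, hop_length=256):
    """Frames held in memory when every clip is padded to the longest one."""
    longest = max(mel_frames(d, sampling_rate, hop_length) for d in durations_s)
    return longest * len(durations_s)


# The same 10 s clip at the two sampling rates discussed in this thread.
print(mel_frames(10.0, 22050))  # 862
print(mel_frames(10.0, 24000))  # 938

# A 2 s and a 10 s clip batched together are both stored at the 10 s length.
print(padded_batch_frames([2.0, 10.0], 22050))  # 1724
```

Under these assumptions, a higher sampling rate or a single long file in the list lengthens every padded sequence, which is why trimming the longest utterances is a common first step when memory is tight.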

VladC12 commented 4 years ago

Thank you. I switched back to LJS and changed the sampling rate, and now I get a different error, so I will open a separate issue for it.

WARNING:root:NaN or Inf found in input tensor.

NVIDIA / flowtron

RuntimeError: CUDA error: device-side assert triggered #52