Closed · prows12 closed this issue 4 years ago
Hmm... seeing the loss come out as NaN, did you run it with run_transformer.sh?
The Transformer is currently not training properly.
seq2seq throws the same error. Which PyTorch version did you use?
I used 1.4.0, 1.5.0, and 1.6.0, and there were no errors. My guess is that the labels path is misconfigured or something along those lines.
Sorry for the late reply. I finally solved it. The cause: preprocessing generates a new aihub_label.csv file, and num_classes is computed from it to build the embedding. As you said, that part was not linked correctly on my side. There is a place that wires it up, either in data or in model_builder.
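For anyone hitting the same thing, the mismatch looks roughly like the sketch below: num_classes has to be derived from the same label file the dataset used to encode the targets. This is a minimal sketch, not the exact kospeech code; the file name comes from this thread, and the "id" column name is an assumption about the csv schema.

```python
import pandas as pd
import torch.nn as nn

# Hypothetical sketch, not the exact kospeech code. The file name comes from
# this thread; the column name "id" is an assumption about the csv schema.
labels = pd.read_csv("aihub_label.csv", encoding="utf-8")

# num_classes must be computed from the SAME label file the dataset used to
# encode the targets; a stale or wrong path leaves the embedding too small.
num_classes = len(labels)
embedding = nn.Embedding(num_embeddings=num_classes, embedding_dim=512)

# Sanity check: every target id must fit inside the embedding table,
# otherwise nn.Embedding raises "IndexError: index out of range in self".
assert labels["id"].max() < num_classes, "label ids exceed embedding size"
```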
```
[2020-08-27 20:20:14,206 utils.py:21 - info()] timestep: 10/70530, loss: nan, cer: 3.32, elapsed: 41.31s 0.69m 0.01h, lr: 0.00030
[2020-08-27 20:20:49,702 utils.py:21 - info()] timestep: 20/70530, loss: nan, cer: 2.88, elapsed: 35.50s 1.28m 0.02h, lr: 0.00030
[2020-08-27 20:21:18,650 utils.py:21 - info()] timestep: 30/70530, loss: nan, cer: 2.81, elapsed: 28.95s 1.76m 0.03h, lr: 0.00030
[2020-08-27 20:22:01,191 utils.py:21 - info()] timestep: 40/70530, loss: nan, cer: 2.96, elapsed: 42.54s 2.47m 0.04h, lr: 0.00030
[2020-08-27 20:22:39,461 utils.py:21 - info()] timestep: 50/70530, loss: nan, cer: 2.98, elapsed: 38.27s 3.11m 0.05h, lr: 0.00030
[2020-08-27 20:23:21,102 utils.py:21 - info()] timestep: 60/70530, loss: nan, cer: 3.09, elapsed: 41.64s 3.80m 0.06h, lr: 0.00030
[2020-08-27 20:23:53,312 utils.py:21 - info()] timestep: 70/70530, loss: nan, cer: 3.10, elapsed: 32.21s 4.34m 0.07h, lr: 0.00030
[2020-08-27 20:24:25,110 utils.py:21 - info()] timestep: 80/70530, loss: nan, cer: 3.08, elapsed: 31.80s 4.87m 0.08h, lr: 0.00030
[2020-08-27 20:25:10,588 utils.py:21 - info()] timestep: 90/70530, loss: nan, cer: 3.16, elapsed: 45.48s 5.63m 0.09h, lr: 0.00030
[2020-08-27 20:25:44,441 utils.py:21 - info()] timestep: 100/70530, loss: nan, cer: 3.13, elapsed: 33.85s 6.19m 0.10h, lr: 0.00030

Traceback (most recent call last):
  File "./main.py", line 111, in <module>
    main()
  File "./main.py", line 107, in main
    train(opt)
  File "./main.py", line 86, in train
    num_epochs=opt.num_epochs, teacher_forcing_ratio=opt.teacher_forcing_ratio, resume=opt.resume)
  File "../kospeech/trainer/supervised_trainer.py", line 146, in train
    train_queue, teacher_forcing_ratio)
  File "../kospeech/trainer/supervised_trainer.py", line 231, in __train_epoches
    logit = model(inputs, input_lengths, targets, return_attns=False)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/data_parallel.py", line 143, in forward
    return self.module(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "../kospeech/models/acoustic/transformer/transformer.py", line 160, in forward
    output, decoder_self_attns, memory_attns = self.decoder(targets, input_lengths, memory)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "../kospeech/models/acoustic/transformer/transformer.py", line 283, in forward
    output = self.input_dropout(self.embedding(inputs) + self.positional_encoding(inputs.size(1)))
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "../kospeech/models/acoustic/transformer/embeddings.py", line 43, in forward
    return self.embedding(inputs) * self.sqrt_dim
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 1724, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
```
Attachment: aihub_labels.zip
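For anyone debugging the same trace: "IndexError: index out of range in self" is what nn.Embedding raises whenever an input id is greater than or equal to num_embeddings, which matches the num_classes mismatch described above. A minimal repro:

```python
import torch
import torch.nn as nn

# Vocabulary of 10 entries: valid input ids are 0..9.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)

ok = torch.tensor([0, 5, 9])
print(embedding(ok).shape)  # torch.Size([3, 4])

bad = torch.tensor([0, 5, 10])  # 10 >= num_embeddings
embedding(bad)  # raises: IndexError: index out of range in self
```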