cindyxinyiwang / deep-latent-sequence-model

PyTorch implementation of "A Probabilistic Formulation of Unsupervised Text Style Transfer" by He et al., ICLR 2020

Can't use beam size other than 1 while training #3

Open drumilT opened 4 years ago

drumilT commented 4 years ago

I get the following error during the eval steps when training the model with a beam size higher than 1:

TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

at line https://github.com/cindyxinyiwang/deep-latent-sequence-model/blob/9d55aa02207a028b24439ee73ad60e339f376fda/src/model.py#L678
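For context, this is the usual shape of the failure: something calls `.numpy()` (or implicitly converts to NumPy) on a tensor that is still on the GPU. A minimal illustration of the pattern, not the actual code at model.py line 678:

```python
import torch

# Hypothetical illustration of the error pattern, not the code at model.py:678.
device = "cuda" if torch.cuda.is_available() else "cpu"
scores = torch.randn(5, device=device)

# scores.numpy() raises this TypeError when scores lives on the GPU;
# moving the tensor to host memory first avoids it.
scores_np = scores.cpu().numpy()
```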

jxhe commented 4 years ago

Fixed by https://github.com/cindyxinyiwang/deep-latent-sequence-model/commit/8bbc71ff2bf856cafcf97a3174cf4869a21a9eff

drumilT commented 4 years ago

I tried this, but it creates a subsequent error because a list has no attribute .cpu(). Furthermore, if you iterate over the list and set each element to its .cpu() return value, it leads to another error related to max pooling.
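A sketch of the workaround described above, with stand-in tensors in place of the real hypothesis list (an assumption about what the list contains):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
# Stand-ins for the list of hypothesis tensors returned during beam search (assumption).
hyps = [torch.randn(3, device=device), torch.randn(4, device=device)]

# The list itself has no .cpu(), so each element is moved to the CPU individually.
hyps = [h.cpu() for h in hyps]
```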

jxhe commented 4 years ago

I have tested scripts/yelp/train_yelp.sh with beam_size=2 without errors in the eval steps. Can you post your running log?

drumilT commented 4 years ago

Traceback (most recent call last):
  File "src/main.py", line 787, in <module>
    main()
  File "src/main.py", line 784, in main
    train()
  File "src/main.py", line 718, in train
    val_ppl, val_bleu = eval(model, classifier, data, crit, step, hparams, eval_bleu=args.eval_bleu, valid_batch_size=args.valid_batch_size)
  File "src/main.py", line 415, in eval
    x_valid, x_mask, x_len, y_neg, y_mask, y_len, beam_size=args.beam_size, max_len=args.max_trans_len, poly_norm_m=args.poly_norm_m)
  File "src/model.py", line 569, in translate
    hyp = self.translate_sent(x, mask, [x_len[i]], y_i, y_i_mask, y_i_len, max_len=max_len, beam_size=beam_size, poly_norm_m=poly_norm_m)[0]
  File "src/model.py", line 637, in translate_sent
    x_enc, dec_init = self.encoder(x_train, x_len)
  File "/home/drumil/.conda/envs/dtenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "src/model.py", line 94, in forward
    enc_output = F.max_pool1d(enc_output.permute(0, 2, 1), kernel_size=self.hparams.max_pool_k_size, padding=(self.hparams.max_pool_k_size // 2)).permute(0, 2, 1)
  File "/home/drumil/.conda/envs/dtenv/lib/python3.6/site-packages/torch/_jit_internal.py", line 181, in fn
    return if_false(*args, **kwargs)
  File "/home/drumil/.conda/envs/dtenv/lib/python3.6/site-packages/torch/nn/functional.py", line 457, in _max_pool1d
    input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: max_pool2d_with_indices_out_cuda_frame failed with error code 0
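One way to narrow this down is to run the max_pool1d call from src/model.py line 94 in isolation, using the same device and the shapes the encoder sees during beam search. The hidden size and kernel size below are assumptions, not values taken from the repo:

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
batch, seq_len, hidden = 1, 7, 512      # assumed shapes for a single beam-search sentence
max_pool_k_size = 5                     # assumed value of hparams.max_pool_k_size

enc_output = torch.randn(batch, seq_len, hidden, device=device)

# Same call as src/model.py line 94: pool over the time dimension.
pooled = F.max_pool1d(enc_output.permute(0, 2, 1),
                      kernel_size=max_pool_k_size,
                      padding=max_pool_k_size // 2).permute(0, 2, 1)
print(pooled.shape)
```

If this standalone call succeeds but the eval step still fails, the problem is likely in the shapes or device of the tensors that translate_sent passes to the encoder rather than in the pooling call itself.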