cindyxinyiwang / deep-latent-sequence-model

PyTorch implementation of "A Probabilistic Formulation of Unsupervised Text Style Transfer" by He et al., ICLR 2020

Can't use beam size other than 1 while training #3

Open drumilT opened 4 years ago

drumilT commented 4 years ago

I get the following error during the eval steps when training the model with a beam size higher than 1:

TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

at line https://github.com/cindyxinyiwang/deep-latent-sequence-model/blob/9d55aa02207a028b24439ee73ad60e339f376fda/src/model.py#L678
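For context, this is the usual shape of the failure: something calls `.numpy()` (or implicitly converts to NumPy) on a tensor that is still on the GPU. A minimal illustration of the pattern, not the actual code at model.py line 678:

```python
import torch

# Hypothetical illustration of the error pattern, not the code at model.py:678.
device = "cuda" if torch.cuda.is_available() else "cpu"
scores = torch.randn(5, device=device)

# scores.numpy() raises this TypeError when scores lives on the GPU;
# moving the tensor to host memory first avoids it.
scores_np = scores.cpu().numpy()
```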

jxhe commented 4 years ago

Fixed by https://github.com/cindyxinyiwang/deep-latent-sequence-model/commit/8bbc71ff2bf856cafcf97a3174cf4869a21a9eff

drumilT commented 4 years ago

I tried this, but it creates a subsequent error because a list has no attribute .cpu(). Furthermore, if you iterate over the list and set each element to its .cpu() return value, it leads to another error related to max pooling.
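A sketch of the workaround described above, with stand-in tensors in place of the real hypothesis list (an assumption about what the list contains):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
# Stand-ins for the list of hypothesis tensors returned during beam search (assumption).
hyps = [torch.randn(3, device=device), torch.randn(4, device=device)]

# The list itself has no .cpu(), so each element is moved to the CPU individually.
hyps = [h.cpu() for h in hyps]
```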

jxhe commented 4 years ago

I have tested scripts/yelp/train_yelp.sh with beam_size=2 without errors in the eval steps. Can you post your running log?

drumilT commented 4 years ago

Traceback (most recent call last):
  File "src/main.py", line 787, in <module>
    main()
  File "src/main.py", line 784, in main
    train()
  File "src/main.py", line 718, in train
    val_ppl, val_bleu = eval(model, classifier, data, crit, step, hparams, eval_bleu=args.eval_bleu, valid_batch_size=args.valid_batch_size)
  File "src/main.py", line 415, in eval
    x_valid, x_mask, x_len, y_neg, y_mask, y_len, beam_size=args.beam_size, max_len=args.max_trans_len, poly_norm_m=args.poly_norm_m)
  File "src/model.py", line 569, in translate
    hyp = self.translate_sent(x, mask, [x_len[i]], y_i, y_i_mask, y_i_len, max_len=max_len, beam_size=beam_size, poly_norm_m=poly_norm_m)[0]
  File "src/model.py", line 637, in translate_sent
    x_enc, dec_init = self.encoder(x_train, x_len)
  File "/home/drumil/.conda/envs/dtenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "src/model.py", line 94, in forward
    enc_output = F.max_pool1d(enc_output.permute(0, 2, 1), kernel_size=self.hparams.max_pool_k_size, padding=(self.hparams.max_pool_k_size // 2)).permute(0, 2, 1)
  File "/home/drumil/.conda/envs/dtenv/lib/python3.6/site-packages/torch/_jit_internal.py", line 181, in fn
    return if_false(*args, **kwargs)
  File "/home/drumil/.conda/envs/dtenv/lib/python3.6/site-packages/torch/nn/functional.py", line 457, in _max_pool1d
    input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: max_pool2d_with_indices_out_cuda_frame failed with error code 0
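One way to narrow this down is to run the max_pool1d call from src/model.py line 94 in isolation, using the same device and the shapes the encoder sees during beam search. The hidden size and kernel size below are assumptions, not values taken from the repo:

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
batch, seq_len, hidden = 1, 7, 512      # assumed shapes for a single beam-search sentence
max_pool_k_size = 5                     # assumed value of hparams.max_pool_k_size

enc_output = torch.randn(batch, seq_len, hidden, device=device)

# Same call as src/model.py line 94: pool over the time dimension.
pooled = F.max_pool1d(enc_output.permute(0, 2, 1),
                      kernel_size=max_pool_k_size,
                      padding=max_pool_k_size // 2).permute(0, 2, 1)
print(pooled.shape)
```

If this standalone call succeeds but the eval step still fails, the problem is likely in the shapes or device of the tensors that translate_sent passes to the encoder rather than in the pooling call itself.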