spro / practical-pytorch

Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained
MIT License

[seq2seq translate] - cannot unsqueeze empty tensor #108

Open frankShih opened 6 years ago

frankShih commented 6 years ago

Hi all,

I'm trying to run the seq2seq model (seq2seq-translation-batched.ipynb). My environment is Python 3.6.4, torch 0.4.0.

I made some modifications:

  1. Disabled anything related to CUDA (CPU-only).
  2. In class `Attn`, `forward` function: changed `return F.softmax(attn_energies).unsqueeze(1)` to `return F.softmax(attn_energies, dim=1).unsqueeze(1)` (I could not run the code without adding the `dim` parameter).
  3. In class `Attn`, `score` function: changed every `energy = hidden.dot(encoder_output)` (dot product) to `energy = hidden.mm(encoder_output.t())` (matrix multiplication). Again, the code would not run without this change.
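
For readers following along, modifications 2 and 3 can be sketched in isolation as below. This is only an illustration: the tensor shapes (batch of 2, sequence length 5, hidden size 8) are assumptions chosen for the example, not values taken from the notebook.

```python
import torch
import torch.nn.functional as F

# Assumed shapes for illustration only: batch of 2, sequence length 5.
attn_energies = torch.randn(2, 5)

# Modification 2: name the dimension to normalize over explicitly.
# Each row (dim=1) now sums to 1 before the extra axis is added.
attn_weights = F.softmax(attn_energies, dim=1).unsqueeze(1)  # shape (2, 1, 5)

# Modification 3: dot() only accepts 1-D tensors, so two 2-D row
# vectors have to be combined with mm() and a transpose instead.
hidden = torch.randn(1, 8)          # assumed shape (1, hidden_size)
encoder_output = torch.randn(1, 8)  # assumed shape (1, hidden_size)
energy = hidden.mm(encoder_output.t())  # shape (1, 1)

# Equivalent dot product after squeezing both operands down to 1-D.
energy_dot = hidden.squeeze(0).dot(encoder_output.squeeze(0))  # 0-dim tensor
```

Both forms of the energy computation give the same number; `mm` just keeps the result as a 1x1 matrix rather than a 0-dim tensor.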

Then it failed in the "Putting it all together" part. The log follows:

C:\ProgramData\Miniconda3\lib\site-packages\ipykernel_launcher.py:32: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
C:\Users\han_shih.ASUS\Documents\projects\practical-pytorch\seq2seq-translation\masked_cross_entropy.py:40: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  log_probs_flat = functional.log_softmax(logits_flat)
C:\ProgramData\Miniconda3\lib\site-packages\ipykernel_launcher.py:41: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
C:\ProgramData\Miniconda3\lib\site-packages\ipykernel_launcher.py:42: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
C:\ProgramData\Miniconda3\lib\site-packages\ipykernel_launcher.py:48: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
2m 7s (- 1057m 39s) (1 0%) 5.4052
2m 7s (- 527m 55s) (2 0%) 5.3642
2m 7s (- 351m 21s) (3 0%) 5.4186
2m 7s (- 263m 4s) (4 0%) 5.2963
2m 7s (- 210m 6s) (5 1%) 5.3928
2m 7s (- 174m 48s) (6 1%) 5.3735
2m 7s (- 149m 33s) (7 1%) 5.4015
2m 7s (- 130m 39s) (8 1%) 5.3151
2m 7s (- 115m 55s) (9 1%) 5.3790
2m 7s (- 104m 9s) (10 2%) 5.3289
C:\ProgramData\Miniconda3\lib\site-packages\ipykernel_launcher.py:4: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  after removing the cwd from sys.path.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-29-c2ca81d0dbe4> in <module>()
     33 
     34     if epoch % evaluate_every == 0:
---> 35         evaluate_randomly()
     36 
     37     if epoch % plot_every == 0:

<ipython-input-26-676896b10e4f> in evaluate_randomly()
      1 def evaluate_randomly():
      2     [input_sentence, target_sentence] = random.choice(pairs)
----> 3     evaluate_and_show_attention(input_sentence, target_sentence)

<ipython-input-28-47a4358672f3> in evaluate_and_show_attention(input_sentence, target_sentence)
      1 def evaluate_and_show_attention(input_sentence, target_sentence=None):
----> 2     output_words, attentions = evaluate(input_sentence)
      3     output_sentence = ' '.join(output_words)
      4     print('>', input_sentence)
      5     if target_sentence is not None:

<ipython-input-25-52047cca3292> in evaluate(input_seq, max_length)
     12 
     13     # Run through encoder
---> 14     encoder_outputs, encoder_hidden = encoder(input_batches, input_lengths, None)
     15 
     16     # Create starting vectors for decoder

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    489             result = self._slow_forward(*input, **kwargs)
    490         else:
--> 491             result = self.forward(*input, **kwargs)
    492         for hook in self._forward_hooks.values():
    493             hook_result = hook(self, input, result)

<ipython-input-14-798698fdab6b> in forward(self, input_seqs, input_lengths, hidden)
     15         embedded = self.embedding(input_seqs)
     16         packed = torch.nn.utils.rnn.pack_padded_sequence(embedded, input_lengths)
---> 17         outputs, hidden = self.gru(packed, hidden)
     18         outputs, output_lengths = torch.nn.utils.rnn.pad_packed_sequence(outputs) # unpack (back to padded)
     19         outputs = outputs[:, :, :self.hidden_size] + outputs[:, : ,self.hidden_size:] # Sum bidirectional outputs

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    489             result = self._slow_forward(*input, **kwargs)
    490         else:
--> 491             result = self.forward(*input, **kwargs)
    492         for hook in self._forward_hooks.values():
    493             hook_result = hook(self, input, result)

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\modules\rnn.py in forward(self, input, hx)
    190             flat_weight=flat_weight
    191         )
--> 192         output, hidden = func(input, self.all_weights, hx, batch_sizes)
    193         if is_packed:
    194             output = PackedSequence(output, batch_sizes)

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\_functions\rnn.py in forward(input, *fargs, **fkwargs)
    321             func = decorator(func)
    322 
--> 323         return func(input, *fargs, **fkwargs)
    324 
    325     return forward

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\_functions\rnn.py in forward(input, weight, hidden, batch_sizes)
    241             input = input.transpose(0, 1)
    242 
--> 243         nexth, output = func(input, hidden, weight, batch_sizes)
    244 
    245         if batch_first and not variable_length:

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\_functions\rnn.py in forward(input, hidden, weight, batch_sizes)
     84                 l = i * num_directions + j
     85 
---> 86                 hy, output = inner(input, hidden[l], weight[l], batch_sizes)
     87                 next_hidden.append(hy)
     88                 all_output.append(output)

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\_functions\rnn.py in forward(input, hidden, weight, batch_sizes)
    154 
    155             if flat_hidden:
--> 156                 hidden = (inner(step_input, hidden[0], *weight),)
    157             else:
    158                 hidden = inner(step_input, hidden, *weight)

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\_functions\rnn.py in GRUCell(input, hidden, w_ih, w_hh, b_ih, b_hh)
     54         return state(gi, gh, hidden) if b_ih is None else state(gi, gh, hidden, b_ih, b_hh)
     55 
---> 56     gi = F.linear(input, w_ih, b_ih)
     57     gh = F.linear(hidden, w_hh, b_hh)
     58     i_r, i_i, i_n = gi.chunk(3, 1)

C:\ProgramData\Miniconda3\lib\site-packages\torch\nn\functional.py in linear(input, weight, bias)
    992         return torch.addmm(bias, input, weight.t())
    993 
--> 994     output = input.matmul(weight.t())
    995     if bias is not None:
    996         output += bias

RuntimeError: cannot unsqueeze empty tensor

Please give me some suggestions, thanks.
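
Separately from the RuntimeError, the UserWarning lines in the log name three replacements: `clip_grad_norm_` for `clip_grad_norm`, `tensor.item()` for indexing a 0-dim tensor, and `with torch.no_grad():` for `volatile`. A minimal sketch of those three calls follows. The `nn.Linear` model and the clipping threshold of 50.0 are stand-ins assumed for illustration, not the notebook's encoder, decoder, or settings.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the encoder/decoder used in the notebook.
model = nn.Linear(4, 2)
loss = model(torch.randn(3, 4)).pow(2).mean()
loss.backward()

# clip_grad_norm_ (trailing underscore) replaces the deprecated clip_grad_norm.
# The threshold 50.0 is an assumed value for this sketch.
torch.nn.utils.clip_grad_norm_(model.parameters(), 50.0)

# .item() converts a 0-dim tensor to a Python number,
# replacing index-style access on the loss.
loss_value = loss.item()

# torch.no_grad() disables gradient tracking during evaluation,
# replacing the removed volatile flag.
with torch.no_grad():
    prediction = model(torch.randn(1, 4))
```

Addressing these warnings would likely quiet the log, but nothing in the traceback suggests they cause the `cannot unsqueeze empty tensor` error.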

frankShih commented 6 years ago

Does it have something to do with my Python version? (I'm using miniconda3.)

aayushee commented 6 years ago

Hi

Were you able to find a solution for the unsqueeze error during evaluation?

frankShih commented 6 years ago

Hi @aayushee

I modified my code based on https://github.com/yanwii/seq2seq. Please take a look (it is a demo of a chatbot based on a seq2seq model with beam search).

KshitizLohia commented 5 years ago

Hey. I need help here. I'm facing the same issue.

Andreasksalk commented 5 years ago

The problem is that your data is not sorted correctly when you initialize it. However, torchtext can do this quite easily for you.

I have done it (with the SST dataset) in the LSTM RNN model in my repo here. Look at how I initialize the datasets and at my forward pass.

GitHub repo (the project is ongoing, so there may be quite a lot of updates, and small bugs in some places that will be fixed in the next couple of days): https://github.com/s124265/NLP-DL-Project
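
If sorting is indeed the cause, as this comment suggests, the idea can also be sketched without torchtext. In torch 0.4.0, `pack_padded_sequence` expects the batch to be ordered by decreasing length. The helper below is a hypothetical illustration in plain Python (`sort_and_pad` and `PAD_token = 0` are assumed names and values, not code from either repository); its two return values are what would then be turned into a tensor and passed along with the lengths.

```python
PAD_token = 0  # assumed padding index for this sketch


def sort_and_pad(seqs, pad=PAD_token):
    """Sort token-index sequences longest-first and pad them to equal length."""
    ordered = sorted(seqs, key=len, reverse=True)
    lengths = [len(s) for s in ordered]
    max_len = lengths[0] if lengths else 0
    # Right-pad every sequence up to the length of the longest one.
    padded = [s + [pad] * (max_len - len(s)) for s in ordered]
    return padded, lengths


# Three token-index sequences of different lengths (made-up values).
batch = [[4, 9], [7, 3, 5, 2], [8]]
padded, lengths = sort_and_pad(batch)
```

The `lengths` list must describe the actual number of tokens in each row of `padded`; a mismatch between the two is worth ruling out when packing fails.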

Shandilya21 commented 5 years ago

Has anyone found an answer to this issue? I have a similar issue. Can anyone resolve it? [Screenshot from 2019-06-18 16-06-37]