hungpthanh opened this issue 7 years ago
I think transpose was used because PyTorch expects the batch_size in the second dimension; it's been a while since I coded this, but I did check all the dimensions from start to end when I developed it. :)
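For context, this matches the default `nn.GRU` layout: with `batch_first=False` (the default), inputs are `(seq_len, batch, input_size)`, so the batch does sit in the second dimension. A minimal, self-contained check (illustrative sizes, not code from this repo):

```python
import torch
import torch.nn as nn

seq_len, batch_size, input_size, hidden_size = 7, 4, 10, 16

# Default batch_first=False: nn.GRU expects (seq_len, batch, input_size),
# i.e. the batch sits in the second dimension.
gru = nn.GRU(input_size, hidden_size)
x = torch.randn(seq_len, batch_size, input_size)
out, h = gru(x)
print(out.shape)  # torch.Size([7, 4, 16])
print(h.shape)    # torch.Size([1, 4, 16])
```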
Thank you so much :+1:
@Sandeep42 @hungthanhpham94 I wonder whether there is an error due to what PyTorch is expecting.
In the function `train_data()`, it's written:

for i in xrange(max_sents):
    _s, state_word, _ = word_attn_model(mini_batch[i,:,:].transpose(0,1), state_word)

In this way, after the `.transpose(0,1)`, the matrix passed to `word_attn_model` has size `(max_tokens, batch_size)`.
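If I read the layout right (assuming `mini_batch` is shaped `(max_sents, batch_size, max_tokens)` of word indices, which is my assumption rather than something stated above), a quick shape check confirms this:

```python
import torch

# Assumed layout: (max_sents, batch_size, max_tokens) of word indices.
max_sents, batch_size, max_tokens = 3, 4, 7
mini_batch = torch.randint(0, 100, (max_sents, batch_size, max_tokens))

i = 0
slice_i = mini_batch[i, :, :]             # (batch_size, max_tokens)
print(slice_i.transpose(0, 1).shape)      # torch.Size([7, 4]) -> (max_tokens, batch_size)
```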
However, the first function to be called on it is `self.lookup(embed)`, which is expecting a `(batch_size, list_of_indices)` input. If this is correct, all of the following code would need to be fixed.
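That said, a plain `nn.Embedding` lookup accepts index tensors of any shape and just appends the embedding dimension, so a time-major `(max_tokens, batch_size)` input still works as long as the downstream GRU also uses the default time-major layout. A minimal sketch (illustrative modules, not the repo's actual `word_attn_model`):

```python
import torch
import torch.nn as nn

batch_size, max_tokens, embed_dim, hidden_dim = 4, 7, 10, 16

# nn.Embedding indexes with tensors of any shape and appends embed_dim, so a
# (max_tokens, batch_size) input simply yields (max_tokens, batch_size, embed_dim).
lookup = nn.Embedding(num_embeddings=100, embedding_dim=embed_dim)
gru = nn.GRU(embed_dim, hidden_dim)  # default batch_first=False: time-major input

tokens = torch.randint(0, 100, (batch_size, max_tokens))
embedded = lookup(tokens.transpose(0, 1))   # (max_tokens, batch_size, embed_dim)
out, h = gru(embedded)                      # out: (max_tokens, batch_size, hidden_dim)
print(embedded.shape, out.shape)
```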
Why do you need the transpose here: `_s, state_word, _ = word_attn_model(mini_batch[i,:,:].transpose(0,1), state_word)`
and here: `torch.from_numpy(main_matrix).transpose(0,1)` in `def pad_batch`?
Thanks :)
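In case it helps, here is a minimal sketch of what a `pad_batch`-style helper could look like (my own illustrative version, assuming it pads lists of word indices to a common length; not the repository's exact code). It shows why the final transpose yields the time-major `(max_tokens, batch_size)` layout:

```python
import numpy as np
import torch

def pad_batch(mini_batch, pad_idx=0):
    """Illustrative sketch: pad token-index lists to a common length,
    then transpose so the token (sequence) dimension comes first."""
    batch_size = len(mini_batch)
    max_tokens = max(len(sent) for sent in mini_batch)
    main_matrix = np.full((batch_size, max_tokens), pad_idx, dtype=np.int64)
    for i, sent in enumerate(mini_batch):
        main_matrix[i, :len(sent)] = sent
    # (batch_size, max_tokens) -> (max_tokens, batch_size): the time-major
    # layout a default (batch_first=False) GRU expects.
    return torch.from_numpy(main_matrix).transpose(0, 1)

print(pad_batch([[5, 2, 9], [7, 1]]).shape)  # torch.Size([3, 2])
```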