torch version: '2.0.1+cu118'
Hi there, thanks for posting this issue. I think this is probably a PyTorch 2.1 issue -- it recently worked with 1.10 (~3 months ago when it was last updated): https://github.com/rasbt/machine-learning-book/blob/main/ch15/ch15_part2.ipynb
Okay, it seems I was mistaken here. I assumed that the entire batch of sequences gets fed to the model at once, rather than the model being trained character by character over the sequence. I tried to make the model consume the whole batch, but of course that didn't work, because it was designed to take a batch of one character at a time. It drove me crazy, and I hadn't read through the entire chapter...
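In case anyone else runs into the same confusion, here is a minimal sketch of what I mean (my own reconstruction with made-up layer sizes, not the exact notebook code): the forward pass only ever sees one character column per call, so the `unsqueeze(1)` produces a valid 3-D input for the LSTM.

```python
import torch
import torch.nn as nn

# Illustrative sizes only -- embed_dim / hidden_size / vocab_size are my own guesses
batch_size, seq_length, embed_dim, hidden_size, vocab_size = 64, 40, 256, 512, 80

embedding = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, hidden_size, batch_first=True)
fc = nn.Linear(hidden_size, vocab_size)

x_batch = torch.randint(0, vocab_size, (batch_size, seq_length))   # (64, 40)
hidden = torch.zeros(1, batch_size, hidden_size)
cell = torch.zeros(1, batch_size, hidden_size)

# The training loop feeds ONE character column per step, not the whole sequence:
for c in range(seq_length):
    x_char = x_batch[:, c]                      # (64,)   one character per example
    emb = embedding(x_char).unsqueeze(1)        # (64, 1, embed_dim)  -> valid 3-D LSTM input
    out, (hidden, cell) = lstm(emb, (hidden, cell))
    logits = fc(out).reshape(out.size(0), -1)   # (64, vocab_size)  logits for the next character
```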
Glad this got resolved :)
The code is here: https://github.com/rasbt/machine-learning-book/blob/main/ch15/ch15_part3.ipynb
The shape of x is (batch_size, seq_length), i.e. torch.Size([64, 40]).
The output shape of the embedding is [batch_size, seq_length, embed_dim].
It is then unsqueezed at dim=1, giving [batch_size, 1, seq_length, embed_dim].
The RNN/LSTM doesn't accept a 4-dimensional input tensor. Is this supposed to be [batch_size, seq_length, embed_dim] or [batch_size, seq_length, 1, embed_dim]?
The fc layer is fed with the sequence output, so it will produce a [batch_size, seq_length, ..., vocab_size] tensor? A small reproduction sketch is below.
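To make the shape problem concrete, here is a small standalone sketch (the sizes are my own assumptions for illustration) that reproduces what I'm describing:

```python
import torch
import torch.nn as nn

# Same illustrative sizes as above -- not taken from the book
batch_size, seq_length, embed_dim, hidden_size, vocab_size = 64, 40, 256, 512, 80
embedding = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, hidden_size, batch_first=True)
fc = nn.Linear(hidden_size, vocab_size)

x_batch = torch.randint(0, vocab_size, (batch_size, seq_length))   # (64, 40)

emb_full = embedding(x_batch)          # (64, 40, embed_dim)
emb_4d = emb_full.unsqueeze(1)         # (64, 1, 40, embed_dim) -- the 4-D tensor described above

try:
    lstm(emb_4d)                       # the LSTM only accepts 2-D or 3-D input
except Exception as err:
    print(type(err).__name__, err)

# Without the extra unsqueeze, the 3-D tensor works, and the fc layer then
# produces per-timestep logits of shape (64, 40, vocab_size):
out_full, _ = lstm(emb_full)           # (64, 40, hidden_size)
logits_full = fc(out_full)             # (64, 40, vocab_size)
```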
What is happening here? I guess this part of the chapter was either overlooked, or it worked with an older version of PyTorch; I don't know. I would appreciate an explanation. Until then, I'll keep trying to figure out what is going on.
P.S. To the author: amazing book!!! Thank you so much for writing it <3 <3