Closed richardburleigh closed 8 months ago
FWIW, I'm seeing the same error over here, using pytorch 2.2.1
@richardburleigh, actually I'm seeing something slightly different from what you describe. With SEQ_LEN = 1024, the error is on line 86 as noted. With SEQ_LEN = 512, the error is actually on line 106: sample = model.generate(inp[None, ...], GENERATE_LENGTH)
I'm at a bit of a loss here though, as inp[None, ...] definitely has the shape [1, 512], and GENERATE_LENGTH is definitely 512... so I'm not sure why the error is RuntimeError: The size of tensor a (513) must match the size of tensor b (512) at non-singleton dimension 1
Changing inp to have the shape [1, 511] does not help, as the error becomes RuntimeError: The size of tensor a (511) must match the size of tensor b (512) at non-singleton dimension 1
Okay, I've got it (I think). In addition to changing SEQ_LEN = 512, in at.py the line out = torch.cat((out, sample), dim=-1) needs to be changed to out = torch.cat((out[:, :-1], sample), dim=-1).
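To make the shape arithmetic behind that one-line change concrete, here is a minimal sketch. The tensors out and sample are hypothetical stand-ins (a running sequence of length SEQ_LEN and a single newly sampled token), not the repository's actual variables, so treat it as an illustration of the off-by-one only.

```python
import torch

SEQ_LEN = 512  # value used in this thread

# Hypothetical stand-ins, not the repository's tensors:
# `out` is the running sequence, `sample` is one newly sampled token.
out = torch.zeros(1, SEQ_LEN, dtype=torch.long)
sample = torch.zeros(1, 1, dtype=torch.long)

# Original line: the last dimension grows by one on every step.
grown = torch.cat((out, sample), dim=-1)

# Proposed line: drop the last element first, so the length is unchanged.
trimmed = torch.cat((out[:, :-1], sample), dim=-1)

print(tuple(grown.shape))    # (1, 513)
print(tuple(trimmed.shape))  # (1, 512)
```

Note that out[:, :-1] discards the last element of out before appending. This sketch only shows that the length stays at 512; it cannot confirm that dropping that element is the right behaviour for generation.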
I got the same error. So for me, editing the out = line lets the training loop run, but there is no generation being output at all...
@nathanielhudson @MichelNivard Thanks for the issue. The error was with RMSNorm; it works now. Just git clone and/or git pull.
I seem to have got it running, thanks!
Thank you for sharing this incredible work!
I speculate that it's an issue of library versions, but I'm receiving the following error when attempting to run unmodified train.py:
RuntimeError: The size of tensor a (1024) must match the size of tensor b (512) at non-singleton dimension 1
Changing the default SEQ_LEN = 1024 to 512 gives the following:
RuntimeError: The size of tensor a (513) must match the size of tensor b (512) at non-singleton dimension 1
While a sequence length of 511 says:
RuntimeError: The size of tensor a (511) must match the size of tensor b (512) at non-singleton dimension 1
Full error log:
Any help would be appreciated!
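For anyone reading these messages for the first time: this is the error PyTorch raises when two tensors cannot be broadcast together because a dimension differs and neither side has size 1 there. The sketch below reproduces the same class of error with an elementwise add. Which operation actually raises it inside the model is not shown in this thread, so the add and the helper try_add are assumptions made purely for illustration.

```python
import torch

def try_add(a_len, b_len=512):
    """Hypothetical helper: add a [1, a_len] tensor to a [1, b_len] tensor.

    Returns "ok" on success, otherwise the RuntimeError message.
    """
    a = torch.zeros(1, a_len)
    b = torch.zeros(1, b_len)
    try:
        _ = a + b  # broadcasting needs equal sizes, or size 1, in each dim
        return "ok"
    except RuntimeError as err:
        return str(err)

# The three failing lengths reported in this issue, plus a matching one.
for n in (1024, 513, 511, 512):
    print(n, "->", try_add(n))
```

Only the matching length (512) succeeds. That is consistent with some buffer in the model being fixed at 512 along dimension 1, although the thread itself does not show where.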