Infatoshi / fcc-intro-to-llms


RuntimeError: The size of tensor a (64) must match the size of tensor b (65) at non-singleton dimension 2 #12


michael554466 commented 2 months ago

When I try to run chatbot.py, it spits out this error while trying to generate a response:

Traceback (most recent call last):
  File "L:\Projects\Python\GitHub-Repos\fcc-intro-to-llms\chatbot.py", line 199, in <module>
    generated_chars = decode(m.generate(context.unsqueeze(0), max_new_tokens=150)[0].tolist())
  File "L:\Projects\Python\GitHub-Repos\fcc-intro-to-llms\chatbot.py", line 176, in generate
    logits, loss = self.forward(index_cond)
  File "L:\Projects\Python\GitHub-Repos\fcc-intro-to-llms\chatbot.py", line 156, in forward
    x = self.blocks(x) # (B,T,C)
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
    input = module(input)
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "L:\Projects\Python\GitHub-Repos\fcc-intro-to-llms\chatbot.py", line 121, in forward
    y = self.sa(x)
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "L:\Projects\Python\GitHub-Repos\fcc-intro-to-llms\chatbot.py", line 88, in forward
    out = torch.cat([h(x) for h in self.heads], dim=-1) # (B, T, F) -> (B, T, [h1, h1, h1, h1, h2, h2, h2, h2, h3, h3, h3, h3])
  File "L:\Projects\Python\GitHub-Repos\fcc-intro-to-llms\chatbot.py", line 88, in <listcomp>
    out = torch.cat([h(x) for h in self.heads], dim=-1) # (B, T, F) -> (B, T, [h1, h1, h1, h1, h2, h2, h2, h2, h3, h3, h3, h3])
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "L:\Projects\Python\cuda\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "L:\Projects\Python\GitHub-Repos\fcc-intro-to-llms\chatbot.py", line 67, in forward
    wei = wei.masked_fill(self.tril[:T, :T] == 0, float('-inf')) # (B, T, T)
RuntimeError: The size of tensor a (64) must match the size of tensor b (65) at non-singleton dimension 2

Any idea how to fix this, please? GPU: 4070 12 GB, torch: 2.3.0+cu121
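
For context, the failing line slices a causal mask that was registered with shape (block_size, block_size), i.e. 64×64 here, while the attention scores are (B, T, T) with T equal to the length of the context passed into forward. The 64-vs-65 mismatch therefore means the context has grown one token past block_size, which in this kind of GPT-from-scratch code usually happens when the running sequence in generate is not cropped to the last block_size tokens before each forward call. The snippet below is only a minimal, self-contained sketch of that cropping step with a stand-in model; the names (index, block_size, generate) follow the traceback but are not verified against this repo's chatbot.py.

```python
# Hypothetical sketch, not the repo's exact code: a generate loop that crops the
# running context to the last block_size tokens before every forward call, so the
# time dimension T never exceeds the side length of the registered tril mask.
import torch
import torch.nn as nn
import torch.nn.functional as F

block_size = 64   # must match the block_size the saved model was trained with
vocab_size = 81   # placeholder vocabulary size for this toy example

class ToyModel(nn.Module):
    """Stand-in for the course model; only the tensor shapes matter here."""
    def __init__(self):
        super().__init__()
        # maps each token id straight to a vector of vocab_size "logits"
        self.lm_head = nn.Embedding(vocab_size, vocab_size)

    def forward(self, index):
        # index: (B, T) token ids; the caller guarantees T <= block_size
        assert index.shape[1] <= block_size, "context longer than block_size"
        return self.lm_head(index)                        # (B, T, vocab_size)

    @torch.no_grad()
    def generate(self, index, max_new_tokens):
        for _ in range(max_new_tokens):
            index_cond = index[:, -block_size:]           # crop: T never exceeds block_size
            logits = self.forward(index_cond)             # (B, T, vocab_size)
            logits = logits[:, -1, :]                     # keep only the last time step
            probs = F.softmax(logits, dim=-1)             # (B, vocab_size)
            index_next = torch.multinomial(probs, num_samples=1)
            index = torch.cat((index, index_next), dim=1) # append the sampled token
        return index

m = ToyModel()
context = torch.zeros((1, 1), dtype=torch.long)
out = m.generate(context, max_new_tokens=150)  # runs well past 64 tokens without a size error
print(out.shape)                               # torch.Size([1, 151])
```

If generate already crops, the other thing worth checking is that block_size in chatbot.py matches the value used when the model was trained and pickled, since the tril buffers inside the saved model keep their original 64×64 shape.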

ahmadsm1 commented 2 months ago

Did you get this working by any chance?