labmlai / annotated_deep_learning_paper_implementations

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
https://nn.labml.ai
MIT License
56.46k stars · 5.79k forks

fix model error #243

Closed f-hy closed 5 months ago

f-hy commented 9 months ago

Fix the following error:

Traceback (most recent call last):
  File "e:\data\frid\python\codes\adlpi\labml_nn\transformers\rope\__init__.py", line 231, in <module>
    _test_rotary()
  File "e:\data\frid\python\codes\adlpi\labml_nn\transformers\rope\__init__.py", line 227, in _test_rotary
    inspect(rotary_pe(x))
  File "E:\data\frid\python\p121\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
  File "e:\data\frid\python\codes\adlpi\labml_nn\transformers\rope\__init__.py", line 188, in forward
    x_rope = (x_rope * self.cos_cached[:x.shape[0]]) + (neg_half_x * self.sin_cached[:x.shape[0]])
RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 3

The cause is that the shapes of the two multiplied tensors do not match: the cached cos/sin tables do not broadcast against `x_rope`.

vpj commented 8 months ago

In the cached sin and cos tensors, the first dimension is the sequence length, so it should be `cos_cached[:x.shape[0]]`.
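The shapes involved can be sketched as follows. This is a minimal, hypothetical reconstruction (variable names and the `[seq_len, batch, heads, d]` layout are assumptions based on the repo's RoPE module, not the exact code under discussion): the cached tables are built with the sequence length as their first dimension, so slicing them with `[:x.shape[0]]` keeps that dimension aligned with `x`, and the singleton batch/head dimensions broadcast.

```python
import torch

seq_len, batch, heads, d = 8, 2, 4, 16

# Input in sequence-first layout: [seq_len, batch, heads, d]
x = torch.randn(seq_len, batch, heads, d)

# Standard RoPE frequencies: theta_i = 10000^(-2i/d), for i in [0, d/2)
theta = 1.0 / (10000 ** (torch.arange(0, d, 2).float() / d))      # [d/2]
pos = torch.arange(seq_len).float()                                # [seq_len]
idx_theta = torch.einsum('s,t->st', pos, theta)                    # [seq_len, d/2]
angles = torch.cat([idx_theta, idx_theta], dim=-1)                 # [seq_len, d]

# First dimension is the sequence length; singleton dims broadcast
# over batch and heads.
cos_cached = angles.cos()[:, None, None, :]                        # [seq_len, 1, 1, d]
sin_cached = angles.sin()[:, None, None, :]                        # [seq_len, 1, 1, d]

# The "rotated" half used by RoPE: (-x2, x1)
neg_half_x = torch.cat([-x[..., d // 2:], x[..., :d // 2]], dim=-1)

# Slicing by x.shape[0] (the sequence length) keeps dim 0 aligned;
# a mismatch in the last (feature) dimension would raise the
# RuntimeError shown in the traceback above.
x_rope = (x * cos_cached[:x.shape[0]]) + (neg_half_x * sin_cached[:x.shape[0]])
assert x_rope.shape == x.shape
```

At position 0 all angles are zero (`cos = 1`, `sin = 0`), so the first sequence step passes through unrotated, which is a quick sanity check for the cache layout.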

chen-xin-94 commented 8 months ago

> In cached sin and cos the first dimension is the sequence length. So it should be cos_cached[:x.shape[0]]

That's right. I think the fix should be like this: #249

vpj commented 5 months ago

Fixed it here https://github.com/labmlai/annotated_deep_learning_paper_implementations/commit/2236f6383ce66bb25f1880512a4ad0ec8f37514a