NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

[BUG] RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead. #875

Open janelu9 opened 1 week ago

janelu9 commented 1 week ago
 File "/mnt/e/nlp/Megatron-LM/megatron/legacy/model/transformer.py", line 697, in forward
[rank0]:     query_layer = query_layer.view(query_layer.size(0), query_layer.size(1), -1, self.hidden_size_per_attention_head)
[rank0]:                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
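For anyone landing here with the same error: `.view()` never copies data, so it can only merge two adjacent dimensions when `stride[i] == size[i+1] * stride[i+1]`. The sketch below models that rule in plain Python, without PyTorch. It assumes, and the traceback does not confirm this, that `query_layer` is a slice of a larger fused tensor and therefore still carries the parent's strides. The shapes and the helper names `contiguous_strides` and `can_merge` are made up for illustration.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) of a freshly allocated tensor."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def can_merge(shape, strides, i):
    """True if dims i and i+1 can be flattened into one without a copy.

    This is the condition a view has to satisfy:
    stride[i] == size[i+1] * stride[i+1].
    """
    return strides[i] == shape[i + 1] * strides[i + 1]


# Hypothetical fused tensor of shape (2, 3, 4), laid out contiguously.
fused_shape = (2, 3, 4)
fused_strides = contiguous_strides(fused_shape)

# Slicing the last dim down to 2 elements (like fused[..., :2]) shrinks
# the shape but keeps the parent's strides.
slice_shape = (2, 3, 2)
slice_strides = fused_strides

# Merging dims 1 and 2 of the slice would need stride[1] == 2 * 1,
# but stride[1] is still 4, so a view is not possible.
view_ok_on_slice = can_merge(slice_shape, slice_strides, 1)

# A contiguous copy gets fresh strides, and the same merge becomes legal.
copy_strides = contiguous_strides(slice_shape)
view_ok_on_copy = can_merge(slice_shape, copy_strides, 1)

print(fused_strides, view_ok_on_slice, copy_strides, view_ok_on_copy)
```

If that assumption holds, the two usual workarounds are the one the error message itself names, replacing `.view(...)` with `.reshape(...)`, or calling `.contiguous()` on the tensor before `.view(...)`. Either may copy the data, which is the cost of getting a layout the requested shape can be mapped onto.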
