kyegomez / MoE-Mamba

Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Zeta
MIT License

[BUG] I tried to run example.py as is but it fails #4

Closed arelkeselbri closed 7 months ago

arelkeselbri commented 7 months ago

I installed the package with pip install moe-mamba.

I then ran poetry install, followed by poetry run python example.py.

Why do I get RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x4 and 512x512), as shown in the traceback below?

Traceback (most recent call last):
  File "/home/marcelo/MoE-Mamba/example.py", line 2, in <module>
    from moe_mamba.model import MoEMamba
  File "/home/marcelo/MoE-Mamba/moe_mamba/__init__.py", line 1, in <module>
    from moe_mamba.model import MoEMambaBlock, MoEMamba
  File "/home/marcelo/MoE-Mamba/moe_mamba/model.py", line 4, in <module>
    from zeta.nn import FeedForward, MambaBlock, RMSNorm
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/__init__.py", line 28, in <module>
    from zeta.nn import *
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/nn/__init__.py", line 1, in <module>
    from zeta.nn.attention import *
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/nn/attention/__init__.py", line 14, in <module>
    from zeta.nn.attention.mixture_attention import (
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/nn/attention/mixture_attention.py", line 8, in <module>
    from zeta.models.vit import exists
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/models/__init__.py", line 3, in <module>
    from zeta.models.andromeda import Andromeda
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/models/andromeda.py", line 4, in <module>
    from zeta.structs.auto_regressive_wrapper import AutoregressiveWrapper
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/structs/__init__.py", line 4, in <module>
    from zeta.structs.local_transformer import LocalTransformer
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/structs/local_transformer.py", line 8, in <module>
    from zeta.nn.modules import feedforward_network
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/nn/modules/__init__.py", line 47, in <module>
    from zeta.nn.modules.mlp_mixer import MLPMixer
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/nn/modules/mlp_mixer.py", line 145, in <module>
    output = mlp_mixer(example_input)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/nn/modules/mlp_mixer.py", line 125, in forward
    x = mixer_block(x)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/nn/modules/mlp_mixer.py", line 63, in forward
    y = self.tokens_mlp(y)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/zeta/nn/modules/mlp_mixer.py", line 30, in forward
    y = self.dense1(x)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/marcelo/.cache/pypoetry/virtualenvs/moe-mamba-ehhCoYub-py3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 118, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x4 and 512x512)

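For readers hitting the same message: the RuntimeError reflects the usual matrix-multiplication rule that the column count of mat1 must equal the row count of mat2. Here mat1 is 512x4 and mat2 is 512x512, so 4 is compared against 512. The following is a minimal pure-Python sketch of that rule only; matmul_result_shape is a hypothetical helper written for illustration and is not part of PyTorch or zeta.

```python
def matmul_result_shape(mat1_shape, mat2_shape):
    """Return the shape of mat1 @ mat2, or raise if the inner sizes differ.

    Hypothetical helper for illustration; it only mirrors the shape rule
    behind the error message, not any PyTorch internals.
    """
    rows1, cols1 = mat1_shape
    rows2, cols2 = mat2_shape
    if cols1 != rows2:
        raise ValueError(
            "mat1 and mat2 shapes cannot be multiplied "
            f"({rows1}x{cols1} and {rows2}x{cols2})"
        )
    return (rows1, cols2)


if __name__ == "__main__":
    # Compatible: inner sizes are both 512.
    print(matmul_result_shape((4, 512), (512, 512)))

    # The shapes from the traceback: inner sizes are 4 and 512.
    try:
        matmul_result_shape((512, 4), (512, 512))
    except ValueError as err:
        print(err)
```

Note also that the failing frame is mlp_mixer.py line 145 at module level, reached through a chain of imports, so the error is raised while importing zeta and does not depend on the contents of example.py beyond its import line.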

github-actions[bot] commented 7 months ago

Hello there, thank you for opening an issue! 🙏🏻 The team has been notified and will get back to you as soon as possible.

kyegomez commented 7 months ago

@arelkeselbri, please update zetascale:

pip3 install -U zetascale
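To confirm that the upgrade landed in the environment that actually runs example.py, one option is to query the installed distribution versions with the standard library. This is a sketch under assumptions: installed_version is a hypothetical helper, and the thread does not say which zetascale release contains the fix, so the script only reports what is installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration, built on importlib.metadata.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as they appear in the pip commands in this thread.
    for name in ("zetascale", "moe-mamba"):
        print(f"{name}: {installed_version(name)}")
```

Because the traceback paths point into a Poetry virtualenv, the check is most meaningful when run through the same environment, for example with poetry run python, and the upgrade itself may need to be run there as well (for example poetry run pip install -U zetascale).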