Zyphra / transformers_zamba2

Apache License 2.0

`use_mamba_kernels` has no effect? #2

Open RuABraun opened 3 days ago

RuABraun commented 3 days ago

System Info

After training Zyphra/Zamba2-1.2B, I tried to run inference on CPU but got an error:

  File "virtual_envs/neural_asr_training/lib/python3.10/site-packages/causal_conv1d/causal_conv1d_interface.py", line 57, in forward
    out = causal_conv1d_cuda.causal_conv1d_fwd(
RuntimeError: Expected x.is_cuda() to be true, but got false.  (Could this error message be improved?  If so, please report an enhancement request to PyTorch.)

I am passing `use_mamba_kernels=False`. However, checking the code here, I can't find any usage of `use_mamba_kernels` other than its declaration in the config?

Quentin-Anthony commented 3 days ago

This is because, at the time of the 1.2B and 2.7B release, a CPU port of the Mamba2 forward pass had not yet been contributed to HF.

For now, Zamba2 models are GPU-only until we rebase transformers_zamba2 onto upstream transformers and add the CPU mamba2 forward to modeling_zamba2.py. Current target for that is next week.
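For reference, the kind of fallback described above usually looks like a flag-plus-device check in front of the fused kernel, with a pure-PyTorch path behind it. The sketch below is illustrative only: the function name `causal_conv1d_forward` and the depthwise-conv fallback are assumptions modeled on the upstream transformers Mamba pattern, not Zamba2's actual code; only `use_mamba_kernels` and the `causal_conv1d` package come from this thread.

```python
import torch
import torch.nn.functional as F

def causal_conv1d_forward(x, weight, bias, use_mamba_kernels=True):
    """Causal depthwise conv with a fused-kernel fast path.

    x: (batch, channels, seqlen); weight: (channels, kernel_size).
    Hypothetical sketch: the fused path and the flag handling mirror
    the pattern used in upstream transformers Mamba layers.
    """
    if use_mamba_kernels and x.is_cuda:
        # fused CUDA kernel from the causal-conv1d package (GPU only)
        from causal_conv1d import causal_conv1d_fn
        return causal_conv1d_fn(x, weight, bias, activation="silu")
    # slow path: left-pad via conv padding, then truncate to seqlen.
    # Runs on CPU, so use_mamba_kernels=False avoids the is_cuda error.
    seqlen = x.shape[-1]
    kernel_size = weight.shape[-1]
    out = F.conv1d(x, weight.unsqueeze(1), bias,
                   padding=kernel_size - 1, groups=x.shape[1])
    return F.silu(out[..., :seqlen])
```

With a dispatch like this in `modeling_zamba2.py`, `use_mamba_kernels=False` (or simply a CPU tensor) would route through the slow path instead of raising inside `causal_conv1d_cuda`.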