Closed gdebayan closed 1 year ago
Also, for what it's worth, this is the error I get in 'cpu' mode:
(h3) debayan@lambda-femtosense-2:~/h3$ PYTHONPATH=$(pwd)/H3 python3 -i H3/examples/generate_text_h3.py --ckpt H3-125M/model.pt --prompt "Hungry Hungry Hippos: Towards Language Modeling With State" --dmodel 768 --nlayer 12 --attn-layer-idx 6 --nheads=12
args.ckpt H3-125M/model.pt
Traceback (most recent call last):
File "/home/debayan/h3/H3/examples/generate_text_h3.py", line 60, in <module>
output_ids = model.generate(input_ids=input_ids, max_length=max_length,
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/flash_attn-0.2.8-py3.10-linux-x86_64.egg/flash_attn/utils/generation.py", line 150, in generate
output = decode(input_ids, self, max_length, top_k=top_k, top_p=top_p,
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/flash_attn-0.2.8-py3.10-linux-x86_64.egg/flash_attn/utils/generation.py", line 107, in decode
logits = model(input_ids, inference_params=inference_params).logits[:, -1]
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/debayan/h3/H3/src/models/ssm_seq.py", line 187, in forward
hidden_states = self.backbone(input_ids, position_ids=position_ids,
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/debayan/h3/H3/src/models/ssm_seq.py", line 142, in forward
hidden_states, residual = layer(hidden_states, residual, mixer_kwargs=mixer_kwargs)
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/flash_attn-0.2.8-py3.10-linux-x86_64.egg/flash_attn/modules/block.py", line 106, in forward
hidden_states = self.norm1(residual.to(dtype=self.norm1.weight.dtype))
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 190, in forward
return F.layer_norm(
File "/home/debayan/miniconda3/envs/h3/lib/python3.10/site-packages/torch/nn/functional.py", line 2515, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
>>>
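The "LayerNormKernelImpl" not implemented for 'Half' error means the model's weights are in fp16, which CPU LayerNorm kernels in this PyTorch version don't support. A minimal sketch of the usual workaround (using a stand-in two-layer module, not the actual H3 model) is to cast the model back to float32 before running CPU inference:

```python
import torch
import torch.nn as nn

# Stand-in for a checkpoint loaded in half precision (hypothetical model,
# not the real H3 architecture).
model = nn.Sequential(nn.Linear(8, 8), nn.LayerNorm(8)).half()

# Cast parameters and buffers to float32 for CPU inference, since
# half-precision LayerNorm is not implemented on CPU here.
model = model.float()

x = torch.randn(1, 8)  # float32 input to match the cast model
with torch.no_grad():
    y = model(x)
print(y.dtype)
```

In the actual script this would mean calling `model.float()` (and keeping `input_ids` and any float inputs in full precision) before `model.generate(...)` when no GPU is available.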
Thanks for the error reports! What GPU are you running on? Can you share the output of nvidia-smi and your CUDA toolkit version (e.g., nvcc --version)?
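A quick way to collect the requested diagnostics in one go (assuming the NVIDIA driver and CUDA toolkit are on PATH; the fallbacks cover CPU-only machines):

```shell
# GPU, driver, and memory usage
nvidia-smi 2>/dev/null || echo "nvidia-smi not found"
# CUDA toolkit version
nvcc --version 2>/dev/null || echo "nvcc not found"
# PyTorch build and the CUDA version it was compiled against
python3 -c "import torch; print(torch.__version__, torch.version.cuda)" \
    2>/dev/null || echo "torch not importable"
```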
Hey @DanFu09, apologies for the delayed response (GitHub did not notify me of your comment).
Looks like it was a simple issue: I did not have enough GPU memory. Works fine now, thanks! Reference: https://discuss.pytorch.org/t/cuda-error-cublas-status-not-initialized-when-calling-cublascreate-handle/125450/2
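As the linked thread notes, running out of GPU memory at startup can surface as the misleading CUBLAS_STATUS_NOT_INITIALIZED error rather than a plain OOM. A small sketch for checking free memory before loading a checkpoint (using PyTorch's `torch.cuda.mem_get_info`):

```python
import torch

# Report free vs. total memory on the current CUDA device, falling back
# gracefully on CPU-only machines.
if torch.cuda.is_available():
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    report = f"free {free_bytes / 1e9:.1f} GB of {total_bytes / 1e9:.1f} GB"
else:
    report = "no CUDA device visible; running on CPU"
print(report)
```

If free memory is close to zero, freeing other processes shown by nvidia-smi (or using a smaller model) avoids the cryptic cuBLAS failure.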
Hey there!
I followed the steps mentioned in the README.md, and when I try running generate_text_h3.py I get the following error.
Some notes:
1) I installed https://github.com/HazyResearch/flash-attention from source.
2) As I got import errors, I installed the libraries one by one. NOTE: I installed the default versions.
3) I'm using a Linux Ubuntu machine.
I'll be happy to share any other info (regarding versioning, etc.) you might need.
Here is the error trace for your reference: