turboderp / exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License

test_inference.py : AttributeError: module 'exllamav2_ext' has no attribute 'rms_norm' #302

Closed. DFuller134 closed this issue 9 months ago.

DFuller134 commented 9 months ago
root@0d4e7b58dda1:/exllamav2# PYTHONPATH=exllamav2 python test_inference.py -m /model -p "Once upon a time,"
 -- Model: /model
 -- Options: ['rope_scale 1.0', 'rope_alpha 1.0']
 -- Loading model...
 -- Loading tokenizer...
 -- Warmup...
Traceback (most recent call last):
  File "/exllamav2/test_inference.py", line 62, in <module>
    generator.warmup()
  File "/exllamav2/exllamav2/generator/base.py", line 36, in warmup
    self.model.forward(input_ids, cache = None, input_mask = None, preprocess_only = True)
  File "/exllamav2/exllamav2/model.py", line 330, in forward
    return self._forward(input_ids = input_ids,
  File "/exllamav2/exllamav2/model.py", line 422, in _forward
    x = module.forward(x, cache = cache, attn_mask = attn_mask, past_len = past_len)
  File "/exllamav2/exllamav2/attn.py", line 213, in forward
    return self.forward_torch(hidden_states, cache, attn_mask, past_len, intermediates)
  File "/exllamav2/exllamav2/attn.py", line 433, in forward_torch
    post_norm = self.input_layernorm.forward(hidden_states)
  File "/exllamav2/exllamav2/rmsnorm.py", line 58, in forward
    ext_c.rms_norm(hidden_states, self.weight, norm, self.variance_epsilon)
AttributeError: module 'exllamav2_ext' has no attribute 'rms_norm'
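
The traceback shows the Python wrapper in rmsnorm.py calling ext_c.rms_norm, while the exllamav2_ext module that actually got imported does not export that symbol, i.e. the loaded compiled extension does not match the Python source being run. A minimal sketch of how one might confirm which extension build is being picked up (assuming it is run in the same environment that produced the traceback above):

import exllamav2_ext
print(exllamav2_ext.__file__)              # path of the compiled extension actually loaded
print(hasattr(exllamav2_ext, "rms_norm"))  # False here, so this build lacks the rms_norm kernel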
DFuller134 commented 9 months ago

Wrong repo.