turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

AttributeError: 'float' object has no attribute 'item' during convert.py measurement pass #281

Closed by sophosympatheia 5 months ago

sophosympatheia commented 5 months ago

Sometimes, when using the conversion script to produce a measurement file for a model I merged together from other models, I encounter the following error at the very end of the measurement pass: after the last model layer is measured, but before the measurement JSON file is saved.

Traceback (most recent call last):
  File "/home/llm/mergequant/convert.py", line 220, in <module>
    measure_quant(job, save_job, model)
  File "/home/llm/.miniconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/llm/mergequant/exllamav2/conversion/measure.py", line 389, in measure_quant
    m = measure_mlp(module, hidden_states, target_states, quantizers, cache, attn_params)
  File "/home/llm/mergequant/exllamav2/conversion/measure.py", line 190, in measure_mlp
    accuracy = test_error(module, hidden_states, target_states, cache, attn_params)
  File "/home/llm/mergequant/exllamav2/conversion/measure.py", line 106, in test_error
    return max(1e-6, 1 - (rfn_sum / rfn_count)).item()
AttributeError: 'float' object has no attribute 'item'
Error: The previous command failed. Exiting.

Any idea what I should be checking? I can provide more context if someone can steer me towards what you need.

turboderp commented 5 months ago

It doesn't make a lot of sense that this would happen sometimes. I've pushed a commit that might fix it, though.
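For reference, one tensor-safe way to write that guard is to clamp on the tensor side, so Python's built-in max() never gets a chance to return a bare float. This is only a sketch of the general approach, with rfn_sum assumed to be a torch tensor as the traceback implies; it is not necessarily what the pushed commit does:

```python
import torch

def capped_accuracy(rfn_sum: torch.Tensor, rfn_count: int) -> float:
    # clamp() keeps the result a tensor even when the accuracy falls
    # below the 1e-6 floor, so calling .item() on it is always valid.
    return (1 - rfn_sum / rfn_count).clamp(min = 1e-6).item()
```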

sophosympatheia commented 5 months ago

Thanks, Turbo! I looked into it on my end too, and as you said, it seems like a situation that shouldn't come up when things are working properly.

When I look at some of my past measurements taken from 70b models that didn't produce the error, the accuracy for the last layer's MLP is > 0.9 even at the very low bpw settings, whereas for this model the accuracy was < 1e-6 for every bpw measured on model.layers.79.mlp. That's what triggered the error: max(1e-6, 1 - (rfn_sum / rfn_count)) returned 1e-6, which is a plain Python float and has no item() method, and I can see why that result isn't expected under normal circumstances.
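The failure mode can be reproduced in isolation. Here is a minimal sketch, again assuming rfn_sum is a torch tensor and using made-up values that push the accuracy below the 1e-6 floor:

```python
import torch

# Hypothetical values: per-layer accuracy comes out around 1e-8, below the floor.
rfn_sum, rfn_count = torch.tensor(0.99999999, dtype = torch.float64), 1

acc = max(1e-6, 1 - (rfn_sum / rfn_count))
# Python's built-in max() compares the float floor against the 0-dim tensor;
# since the tensor's value is smaller, it returns the plain float 1e-6.
print(type(acc))  # <class 'float'>
acc.item()        # AttributeError: 'float' object has no attribute 'item'
```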

I proceeded with quantizing the model anyway and the final result appears to be okay. ¯\_(ツ)_/¯

Thank you for committing a fix to this so quickly! I greatly appreciate all your work on the exllama project.