turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

[REQUEST] Convert.py: Option to skip measurement when setting 8.0/8.0 #673

Open Originalimoc opened 1 week ago

Originalimoc commented 1 week ago

Problem

Convert.py still runs the measurement pass when the target is set to 8.0 bpw.

Solution

Skip the measurement, or generate a dummy measurement file instead.
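A rough sketch of the skip logic being requested, purely for illustration. It assumes, as this request does, that measurement adds nothing when the target is already the maximum bitrate. Every name here (`MAX_BPW`, `should_skip_measurement`, `plan_conversion`, and the step labels) is hypothetical and not taken from exllamav2's Convert.py.

```python
# Hypothetical sketch: none of these names come from exllamav2's Convert.py.
MAX_BPW = 8.0  # assumed ceiling on bits per weight


def should_skip_measurement(target_bpw: float, max_bpw: float = MAX_BPW) -> bool:
    """Return True when the target is already at the assumed maximum bitrate."""
    return target_bpw >= max_bpw


def plan_conversion(target_bpw: float) -> list[str]:
    """List the conversion steps, dropping the measurement pass when it is redundant."""
    steps = ["measure", "quantize", "compile"]  # illustrative step labels only
    if should_skip_measurement(target_bpw):
        steps.remove("measure")
    return steps


print(plan_conversion(8.0))  # ['quantize', 'compile']
print(plan_conversion(4.5))  # ['measure', 'quantize', 'compile']
```

Whether skipping is actually safe depends on the question raised under Explanation: if the optimizer can still choose lower-bitrate settings at 8.0, the measurement would still be needed.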

Alternatives

No response

Explanation

What's the point of measurement if you're using 8.0 on all layers anyway? Or is there some ignored/acceptable loss threshold that would cause a lower bpw, like 5~6, to be used even when 8 is set?

Examples

No response

Additional context

No response

Acknowledgements