turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License
3.2k stars 236 forks

Typo in conversion/qparams.py #306

Closed. 4PiR2 closed this issue 5 months ago

4PiR2 commented 5 months ago

Hello! On line 92 of conversion/qparams.py, += is mistyped as ++. This error has been there since the initial commit.

        total_bits += groups * 16                           # q_scale_max
        total_bits += groups * (16 + 16)                    # q_groups
        total_bits += groups * columns * self.scale_bits    # q_scale
        total_bits ++ rows * 32                             # q_invperm        <--  your typo

turboderp commented 5 months ago

This would have caused a slight miscalculation of the total bitrate, since the q_invperm bits were left out of the total. It's very minor, so there's likely no significant impact, but it should definitely be fixed. Thanks for noticing.
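
For anyone wondering why this never raised an error: Python has no ++ operator, so `total_bits ++ rows * 32` parses as a binary + followed by a unary +. The result is computed and then thrown away, and total_bits is left unchanged. Below is a minimal sketch of that behavior. It assumes a simplified stand-in for the four lines quoted above; the function names `estimate_bits_buggy` and `estimate_bits_fixed`, the plain `scale_bits` argument, and the sample sizes are hypothetical and not taken from exllamav2.

```python
import ast


def estimate_bits_buggy(rows, columns, groups, scale_bits):
    """Hypothetical stand-in for the quoted lines, with the ++ typo kept."""
    total_bits = 0
    total_bits += groups * 16                     # q_scale_max
    total_bits += groups * (16 + 16)              # q_groups
    total_bits += groups * columns * scale_bits   # q_scale
    total_bits ++ rows * 32                       # q_invperm: evaluated, then discarded
    return total_bits


def estimate_bits_fixed(rows, columns, groups, scale_bits):
    """Same stand-in with += restored on the last term."""
    total_bits = 0
    total_bits += groups * 16                     # q_scale_max
    total_bits += groups * (16 + 16)              # q_groups
    total_bits += groups * columns * scale_bits   # q_scale
    total_bits += rows * 32                       # q_invperm
    return total_bits


# Arbitrary sample sizes, chosen only for illustration.
rows, columns, groups, scale_bits = 4096, 4096, 32, 4

buggy = estimate_bits_buggy(rows, columns, groups, scale_bits)
fixed = estimate_bits_fixed(rows, columns, groups, scale_bits)
missing_bits = fixed - buggy
print("bits missing from the buggy total:", missing_bits)

# The parser confirms it: the typo line is a bare expression statement
# (ast.Expr), not an assignment, so nothing is stored.
stmt = ast.parse("total_bits ++ rows * 32").body[0]
print(type(stmt).__name__, type(stmt.value.op).__name__)
```

The gap between the two totals is exactly rows * 32 bits, which matches the "slight miscalculation" described above. Some linters flag a bare expression statement like this as having no effect, which is one way a typo of this kind can be caught early.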