mit-han-lab / deepcompressor

Model Compression Toolbox for Large Language Models and Diffusion Models
Apache License 2.0

Questions about rotation #12

Open Kyeong-Joong opened 4 months ago

Kyeong-Joong commented 4 months ago

In the rotation process, why not apply the rotations in float64 first and then convert the result to float16, instead of applying them in float16 only?

synxlin commented 1 week ago

Hi, we did not load the weights directly in float64 because that would require 4x more GPU memory.
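
To make the 4x figure concrete, here is a minimal sketch of the arithmetic. It only compares storage per element (2 bytes for float16, 8 bytes for float64); the 4096 x 4096 layer shape and the `weight_bytes` helper are assumptions chosen for illustration and are not taken from deepcompressor.

```python
import struct

# Bytes per element for IEEE 754 half ("e") and double ("d") precision.
BYTES_FP16 = struct.calcsize("e")  # 2
BYTES_FP64 = struct.calcsize("d")  # 8


def weight_bytes(rows: int, cols: int, bytes_per_element: int) -> int:
    """Memory needed to hold one dense rows x cols weight matrix."""
    return rows * cols * bytes_per_element


# Hypothetical layer shape, used only for illustration.
rows, cols = 4096, 4096

fp16_bytes = weight_bytes(rows, cols, BYTES_FP16)
fp64_bytes = weight_bytes(rows, cols, BYTES_FP64)
ratio = fp64_bytes / fp16_bytes

print(f"float16: {fp16_bytes / 2**20:.0f} MiB")
print(f"float64: {fp64_bytes / 2**20:.0f} MiB")
print(f"ratio:   {ratio:.0f}x")
```

The same ratio applies to every weight tensor, so holding a whole model in float64 would take roughly four times the memory of the float16 copy, before counting the rotation matrices themselves.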