dvmazur / mixtral-offloading

Run Mixtral-8x7B models in Colab or consumer desktops
MIT License
2.28k stars 223 forks

Having issue loading my HQQ quantized model #35

Open BeichenHuang opened 3 months ago

BeichenHuang commented 3 months ago

Hi! I am trying to load my HQQ-quantized model using the offloading strategy, but I am having a problem with the model safetensors files. I notice that in your HQQ-quantized model's safetensor files, the weights (taking layer 0, expert 1 as an example) are saved as separate components: [screenshot of the per-expert weight keys]. But when I use the code from the official HQQ website, the saved model is only a single .pt file. How do I split the weights into all those components?
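
For reference, a general way to break a single .pt checkpoint into individual safetensors entries is to load the state dict, flatten any nested sub-dicts into dot-separated keys, and write the tensors out with `safetensors.torch.save_file`. The sketch below is not the repo's conversion script; it is a minimal illustration assuming the .pt file contains a (possibly nested) dict of tensors, and the file names `hqq_quantized_model.pt` / `model.safetensors` are hypothetical.

```python
import torch
from safetensors.torch import save_file

# Hypothetical paths -- adjust to your own checkpoint layout.
src_path = "hqq_quantized_model.pt"
dst_path = "model.safetensors"

state = torch.load(src_path, map_location="cpu")

# HQQ-style checkpoints may nest dicts (e.g. per-layer {'W_q': ..., 'meta': {...}}),
# so flatten them into dot-separated keys before saving.
def flatten(obj, prefix=""):
    flat, skipped = {}, {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            sub_flat, sub_skipped = flatten(value, prefix=f"{name}.")
            flat.update(sub_flat)
            skipped.update(sub_skipped)
        elif isinstance(value, torch.Tensor):
            flat[name] = value.contiguous()
        else:
            # safetensors only stores tensors; scalars/strings (e.g. quantization
            # metadata) must be kept elsewhere, such as a JSON sidecar file.
            skipped[name] = value
    return flat, skipped

tensors, non_tensors = flatten(state)
save_file(tensors, dst_path)
print(f"Saved {len(tensors)} tensors; skipped {len(non_tensors)} non-tensor entries.")
```

Note that the exact key names the offloading code expects for each expert are defined by this repo's own conversion script, so the output keys would still need to be matched to that layout.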