Question about quantize time for custom flux transformer

mit-han-lab / deepcompressor

Model Compression Toolbox for Large Language Models and Diffusion Models

Apache License 2.0


Question about quantize time for custom flux transformer #24

Open chuck-ma opened 1 week ago

chuck-ma commented 1 week ago

I'm currently using an H800 to run Smooth Quantization on my custom flux transformer. How long should quantization take? It has been running for 20 minutes, but the progress bar is still empty.

python -m deepcompressor.app.diffusion.ptq configs/model/flux.1-custom.yaml configs/svdquant/int4.yaml --save-model /root/autodl-tmp/flux.1-custom-svdquant-int4

adhikjoshi commented 1 week ago

Please share an update if it works.

chuck-ma commented 1 week ago

Well, it will take 70 hours to quantize, and I can't afford that much compute. Any ideas on how to speed it up? @synxlin @bobboli

dome272 commented 1 week ago

I'm also encountering this; it is taking 12 hours on an H100.

synxlin commented 3 hours ago

Hi @chuck-ma @dome272,

We are working on improving our codebase to support fast calibration without online activation generation. We'll keep this issue updated.