Hi @chengzeyi, would you mind sharing some advice on inference optimization, such as quantization, distillation, or compilation (stable-fast, TRT engine, torch.compile)? Or on optimization directions for DiT models? Any reply would be highly appreciated. Thank you very much!