cli_demo_quantization.py does not work with latest torchao (git)

System Info

torch 2.4.1 / diffusers 0.30.2 / Ubuntu 22.04.4 LTS / CUDA driver 12.6

Information

[X] The official example scripts
[ ] My own modified scripts and tasks

Reproduction

pip install git+https://github.com/pytorch/ao.git

python cli_demo_quantization.py --prompt "A girl riding a bike." --model_path THUDM/CogVideoX-5b --quantization_scheme fp8 --dtype bfloat16

Expected behavior

The script should run, but it no longer works with torchao installed from the current git tree. It fails at import time:

Traceback (most recent call last):
  File "/home/x/CogVideo/inference/cli_demo_quantization.py", line 26, in <module>
    from torchao.float8.inference import ActivationCasting, QuantConfig, quantize_to_float8
ImportError: cannot import name 'ActivationCasting' from 'torchao.float8.inference'

It seems ActivationCasting was moved or removed in this commit: https://github.com/pytorch/ao/commit/848e123e37df7e7033f26619b02562525404c2b5
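
To check whether only ActivationCasting is affected or whether the other two imports are gone as well, a small diagnostic like the one below may help. It is a minimal sketch that uses only the standard library; the helper names missing_names and check_float8_inference_api are made up for illustration and are not part of torchao or CogVideo.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that are not attributes of `module_name`.

    Hypothetical helper for illustration only. If the module itself cannot
    be imported, every name counts as missing.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return list(names)
    return [name for name in names if not hasattr(module, name)]


# The three names imported on line 26 of cli_demo_quantization.py.
REQUIRED = ["ActivationCasting", "QuantConfig", "quantize_to_float8"]


def check_float8_inference_api():
    """Print which of the required names the installed torchao lacks."""
    missing = missing_names("torchao.float8.inference", REQUIRED)
    if missing:
        print("torchao.float8.inference is missing:", ", ".join(missing))
    else:
        print("All names needed by cli_demo_quantization.py are present.")
    return missing
```

Calling check_float8_inference_api() in the same environment as the demo script lists the names that the installed torchao build no longer exposes, without running the whole pipeline.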

Is there any up-to-date example of how to do inference with quantization?

THUDM/CogVideo, issue #245
