huggingface / optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
Apache License 2.0

Unable to compile and export Stable Diffusion 2.1 #723

Open pinak-p opened 5 hours ago

pinak-p commented 5 hours ago

System Info

aws-neuronx-runtime-discovery     2.9
libneuronxla                      2.0.4115.0
neuronx-cc                        2.14.227.0+2d4f85be
neuronx-distributed               0.8.0
optimum-neuron                    0.0.24
torch-neuronx                     2.1.2.2.2.0
transformers-neuronx              0.11.351

Who can help?

@JingyaHuang

Reproduction (minimal, reproducible, runnable)

```
optimum-cli export neuron --model stabilityai/stable-diffusion-2-1-base --batch_size 1 --height 512 --width 512 --auto_cast matmul --auto_cast_type bf16 --num_images_per_prompt 1 ./sd_neuron/
```

The command above fails with the following error:

```
Keyword arguments {'subfolder': '', 'use_auth_token': None, 'trust_remote_code': False} are not expected by StableDiffusionPipeline and will be ignored.
Loading pipeline components...: 100%|██████████| 6/6 [00:00<00:00, 18.05it/s]
Applying optimized attention score computation for stable diffusion.
Compiling text_encoder
Using Neuron: --auto-cast matmul
Using Neuron: --auto-cast-type bf16
2024-10-24 21:17:54.907338: F external/xla/xla/parse_flags_from_env.cc:224] Unknown flags in XLA_FLAGS: --xla_gpu_simplify_all_fp_conversions=false --xla_gpu_force_compilation_parallelism=8
Aborted (core dumped)
Traceback (most recent call last):
  File "/home/ubuntu/aws_neuron_venv_pytorch/bin/optimum-cli", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/aws_neuron_venv_pytorch/lib/python3.10/site-packages/optimum/commands/optimum_cli.py", line 163, in main
    service.run()
  File "/home/ubuntu/aws_neuron_venv_pytorch/lib/python3.10/site-packages/optimum/commands/export/neuronx.py", line 298, in run
    subprocess.run(full_command, shell=True, check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python3 -m optimum.exporters.neuron --model stabilityai/stable-diffusion-2-1-base --batch_size 1 --height 512 --width 512 --auto_cast matmul --auto_cast_type bf16 --num_images_per_prompt 1 ./sd_neuron/' returned non-zero exit status 134.
```
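The fatal line is XLA refusing to parse GPU-only flags found in the `XLA_FLAGS` environment variable. A minimal sketch for spotting and clearing them before re-running the export (assumption: the flags leaked in from a GPU-oriented environment and are not needed for the Neuron compiler; the `GPU_ONLY_FLAGS` tuple below simply lists the two flags from the error message):

```python
import os

# The two flags reported as unknown in the error above; both are
# GPU-specific and unknown to the Neuron build of XLA.
GPU_ONLY_FLAGS = (
    "--xla_gpu_simplify_all_fp_conversions",
    "--xla_gpu_force_compilation_parallelism",
)

def stray_gpu_flags(env=None):
    """Return the GPU-only flags present in XLA_FLAGS, if any."""
    env = os.environ if env is None else env
    flags = env.get("XLA_FLAGS", "")
    return [f for f in GPU_ONLY_FLAGS if f in flags]

if stray_gpu_flags():
    # Dropping the variable before invoking optimum-cli avoids the abort.
    os.environ.pop("XLA_FLAGS", None)
```

The shell equivalent is simply `unset XLA_FLAGS` before the `optimum-cli export neuron ...` invocation.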

Expected behavior

The model should be exported successfully.

JingyaHuang commented 4 hours ago

Hi @pinak-p,

Which neuron-sdk and optimum-neuron versions are you using? I just tested with the following setup and compiled the models without any issue:

```
aws-neuronx-collectives/unknown,now 2.22.26.0-17a033bc8 amd64 [installed]
aws-neuronx-dkms/unknown,now 2.18.12.0 amd64 [installed]
aws-neuronx-runtime-lib/unknown,now 2.22.14.0-6e27b8d5b amd64 [installed]
aws-neuronx-tools/unknown,now 2.19.0.0 amd64 [installed]
aws-neuronx-runtime-discovery 2.9
diffusers                     0.30.3
libneuronxla                  2.0.4115.0
neuronx-cc                    2.15.128.0+56dc5a86
neuronx-distributed           0.9.0
optimum                       1.22.0
optimum-neuron                0.0.25.dev0
sentence-transformers         3.1.0
torch                         2.1.2
torch-neuronx                 2.1.2.2.3.0
torch-xla                     2.1.4
torchvision                   0.16.2
transformers                  4.43.2
transformers-neuronx          0.12.313
```
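For anyone hitting the same error, a small sketch for comparing a local environment against the working set listed above (assumption: the pin strings are copied verbatim from that list, and only this exact combination is confirmed by the thread; newer releases may also work):

```python
from importlib import metadata

# Versions from the working setup reported above (exact pins; an
# assumption is that these, not any later releases, are the baseline).
WORKING_PINS = {
    "optimum-neuron": "0.0.25.dev0",
    "neuronx-cc": "2.15.128.0+56dc5a86",
    "torch-neuronx": "2.1.2.2.3.0",
    "neuronx-distributed": "0.9.0",
}

def outdated_packages(pins=WORKING_PINS):
    """Return {package: (installed, wanted)} for missing or mismatched pins."""
    mismatches = {}
    for pkg, wanted in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            installed = None  # package not installed at all
        if installed != wanted:
            mismatches[pkg] = (installed, wanted)
    return mismatches
```

Packages reported by `outdated_packages()` can then be upgraded with pip against the AWS Neuron package repository (`pip install --upgrade --extra-index-url https://pip.repos.neuron.amazonaws.com ...`).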

pinak-p commented 3 hours ago

Thank you, it succeeds after upgrading the dependencies.