microsoft / Olive

Olive is an easy-to-use hardware-aware model optimization tool that composes industry-leading techniques across model compression, optimization, and compilation.
https://microsoft.github.io/Olive/
MIT License

vae_encoder_gpu-dml_footprints.json file not found when converting stable diffusion xl base model #1202

Open AshD opened 2 weeks ago

AshD commented 2 weeks ago

Describe the bug

```
python stable_diffusion_xl.py --model_id=stabilityai/stable-diffusion-xl-base-1.0 --optimize

/home/ash/ai/lib/python3.12/site-packages/diffusers/models/transformers/transformer_2d.py:34: FutureWarning: Transformer2DModelOutput is deprecated and will be removed in version 1.0.0. Importing Transformer2DModelOutput from diffusers.models.transformer_2d is deprecated and this will be removed in a future version. Please use from diffusers.models.modeling_outputs import Transformer2DModelOutput, instead.
  deprecate("Transformer2DModelOutput", "1.0.0", deprecation_message)
Download stable diffusion PyTorch pipeline...
Loading pipeline components...: 100%|█████████████████████████████████████████████████████| 7/7 [00:00<00:00, 9.47it/s]
```

```
Optimizing vae_encoder
[2024-06-18 20:35:41,419] [INFO] [run.py:138:run_engine] Running workflow default_workflow
[2024-06-18 20:35:41,422] [INFO] [engine.py:986:save_olive_config] Saved Olive config to cache/default_workflow/olive_config.json
[2024-06-18 20:35:41,425] [WARNING] [accelerator_creator.py:182:_check_execution_providers] The following execution providers are not supported: 'DmlExecutionProvider' by the device: 'gpu' and will be ignored. Please consider installing an onnxruntime build that contains the relevant execution providers.
[2024-06-18 20:35:41,425] [INFO] [accelerator_creator.py:224:create_accelerators] Running workflow on accelerator specs: gpu-cpu
[2024-06-18 20:35:41,425] [INFO] [engine.py:109:initialize] Using cache directory: cache/default_workflow
[2024-06-18 20:35:41,425] [INFO] [engine.py:265:run] Running Olive on accelerator: gpu-cpu
[2024-06-18 20:35:41,425] [INFO] [engine.py:1085:_create_system] Creating target system ...
[2024-06-18 20:35:41,425] [INFO] [engine.py:1088:_create_system] Target system created in 0.000057 seconds
[2024-06-18 20:35:41,425] [INFO] [engine.py:1097:_create_system] Creating host system ...
[2024-06-18 20:35:41,425] [INFO] [engine.py:1100:_create_system] Host system created in 0.000053 seconds
[2024-06-18 20:35:41,453] [INFO] [engine.py:867:_run_pass] Running pass convert:OnnxConversion
[2024-06-18 20:35:41,453] [INFO] [engine.py:901:_run_pass] Loaded model from cache: 3_OnnxConversion-45ce4523-e3495161 from cache/default_workflow/runs
[2024-06-18 20:35:41,453] [INFO] [engine.py:867:_run_pass] Running pass optimize:OrtTransformersOptimization
[2024-06-18 20:35:41,454] [INFO] [transformer_optimization.py:169:validate_search_point] CPUExecutionProvider does not support float16 very well, please avoid to use float16.
[2024-06-18 20:35:41,454] [WARNING] [engine.py:873:_run_pass] Invalid search point, prune
[2024-06-18 20:35:41,454] [WARNING] [engine.py:850:_run_passes] Skipping evaluation as model was pruned
[2024-06-18 20:35:41,454] [WARNING] [engine.py:437:run_no_search] Flow ['convert', 'optimize'] is pruned due to failed or invalid config for pass 'optimize'
[2024-06-18 20:35:41,454] [INFO] [engine.py:364:run_accelerator] Save footprint to footprints/vae_encoder_gpu-cpu_footprints.json.
[2024-06-18 20:35:41,454] [INFO] [engine.py:282:run] Run history for gpu-cpu:
[2024-06-18 20:35:41,457] [INFO] [engine.py:570:dump_run_history] run history:
+------------------------------------+-------------------+----------------+----------------+-----------+
| model_id                           | parent_model_id   | from_pass      | duration_sec   | metrics   |
+====================================+===================+================+================+===========+
| 45ce4523                           |                   |                |                |           |
+------------------------------------+-------------------+----------------+----------------+-----------+
| 3_OnnxConversion-45ce4523-e3495161 | 45ce4523          | OnnxConversion | 6.64365        |           |
+------------------------------------+-------------------+----------------+----------------+-----------+
[2024-06-18 20:35:41,457] [INFO] [engine.py:297:run] No packaging config provided, skip packaging artifacts
Traceback (most recent call last):
  File "/home/ash/ai/Olive/examples/directml/stable_diffusion_xl/stable_diffusion_xl.py", line 635, in <module>
    main()
  File "/home/ash/ai/Olive/examples/directml/stable_diffusion_xl/stable_diffusion_xl.py", line 601, in main
    optimize(
  File "/home/ash/ai/Olive/examples/directml/stable_diffusion_xl/stable_diffusion_xl.py", line 374, in optimize
    with footprints_file_path.open("r") as footprint_file:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/pathlib.py", line 1015, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/ash/ai/Olive/examples/directml/stable_diffusion_xl/footprints/vae_encoder_gpu-dml_footprints.json'
```
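The log tail above shows the mismatch behind the FileNotFoundError: the run fell back to the gpu-cpu accelerator and saved footprints/vae_encoder_gpu-cpu_footprints.json, but the script then tries to open the gpu-dml variant. A purely illustrative sketch of the failing lookup (paths copied from the log, not part of the example code):

```python
# Illustration only: the workflow wrote the gpu-cpu footprint, while the script
# opens the gpu-dml one, hence the FileNotFoundError in the traceback above.
from pathlib import Path

footprints_dir = Path("footprints")
expected = footprints_dir / "vae_encoder_gpu-dml_footprints.json"  # what the script opens
produced = footprints_dir / "vae_encoder_gpu-cpu_footprints.json"  # what this run saved

print("expected exists:", expected.exists())
print("produced exists:", produced.exists())
```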

To Reproduce

Run:

```
python stable_diffusion_xl.py --model_id=stabilityai/stable-diffusion-xl-base-1.0 --optimize
```

Other information

jambayk commented 1 week ago

Hi,

From the logs, it appears the DML workflow is being skipped because you are running in a Linux environment without the DML execution provider. Since the workflow contains an evaluator, it checks for the presence of the DML EP and does not find it.
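If it helps, you can confirm which execution providers your installed onnxruntime build actually exposes with the small check below (not Olive-specific); on Linux without the onnxruntime-directml package, DmlExecutionProvider will not appear in the list:

```python
# Print the execution providers available in the installed onnxruntime build.
# DmlExecutionProvider is only present in DirectML-enabled builds (Windows).
import onnxruntime as ort

print(ort.get_available_providers())
```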

Can you try again after removing the "evaluator": "common_evaluator" part from the config JSON?

AshD commented 1 week ago

Tried it.

```
Optimizing vae_encoder
Traceback (most recent call last):
  File "/home/ash/ai/Olive/examples/directml/stable_diffusion_xl/stable_diffusion_xl.py", line 635, in <module>
    main()
  File "/home/ash/ai/Olive/examples/directml/stable_diffusion_xl/stable_diffusion_xl.py", line 601, in main
    optimize(
  File "/home/ash/ai/Olive/examples/directml/stable_diffusion_xl/stable_diffusion_xl.py", line 369, in optimize
    olive_run(olive_config)
  File "/home/ash/ai/lib/python3.12/site-packages/olive/workflows/run/run.py", line 284, in run
    run_config = RunConfig.parse_file_or_obj(run_config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ash/ai/lib/python3.12/site-packages/olive/common/config_utils.py", line 120, in parse_file_or_obj
    return cls.parse_obj(file_or_obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ash/ai/lib/python3.12/site-packages/pydantic/v1/main.py", line 526, in parse_obj
    return cls(**obj)
           ^^^^^^^^^^
  File "/home/ash/ai/lib/python3.12/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 4 validation errors for RunConfig
engine
  Evaluator common_evaluator not found in evaluators (type=value_error)
passes -> convert
  Invalid engine (type=value_error)
passes -> optimize
  Invalid engine (type=value_error)
passes -> optimize_cuda
  Invalid engine (type=value_error)
```

jambayk commented 1 week ago

Looks like you only removed it from the "evaluators" section. Sorry, I was unclear. Please remove the "evaluator" field under "engine".
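For example, the reference can be dropped from the config before the run. A minimal sketch, assuming the submodel config file is named config_vae_encoder.json (adjust to whichever config the script actually loads):

```python
# Sketch: remove the "evaluator": "common_evaluator" entry from the "engine"
# section of an example config so the workflow runs without an evaluator.
# The file name below is an assumption; point it at the config you actually use.
import json

config_path = "config_vae_encoder.json"

with open(config_path) as f:
    config = json.load(f)

# Drop the evaluator reference under "engine" (no-op if it is already absent).
config.get("engine", {}).pop("evaluator", None)

with open(config_path, "w") as f:
    json.dump(config, f, indent=4)
```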

WickedHorse commented 2 days ago

I had this same problem. I ran pip install -r requirements.txt at the project's root, but there was another requirements.txt file in C:\Users\Cole\olive\Olive\examples\stable_diffusion. I re-ran the command against that file, then reissued python stable_diffusion.py --optimize, and that seemed to run through.