microsoft / Olive

Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
https://microsoft.github.io/Olive/
MIT License

Mistral optimization (GPU) for a locally saved model: Failed to run Olive on gpu-cuda #1341

Open tjinjin95 opened 2 months ago

tjinjin95 commented 2 months ago

Describe the bug
Failed to run Olive on gpu-cuda.

To Reproduce
1. Download https://huggingface.co/mistralai/Mistral-7B-v0.1/tree/main to the folder D:\windowsAI\HFModel\Mistral-7B-v01
2. Follow the readme: https://github.com/microsoft/Olive/tree/main/examples/mistral
3. Run: python mistral.py --optimize --config mistral_fp16_optimize.json --model_id D:\windowsAI\HFModel\Mistral-7B-v01

If this method is not right, could you list the correct steps?

My virtual environment pip list (Package, Version, Editable project location):

```
accelerate 0.33.0
aiohappyeyeballs 2.4.0
aiohttp 3.10.5
aiosignal 1.3.1
alembic 1.13.2
annotated-types 0.7.0
attrs 24.2.0
certifi 2024.7.4
charset-normalizer 3.3.2
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.8.2
contourpy 1.2.1
cycler 0.12.1
datasets 2.21.0
Deprecated 1.2.14
dill 0.3.8
evaluate 0.4.2
filelock 3.15.4
flatbuffers 24.3.25
fonttools 4.53.1
frozenlist 1.4.1
fsspec 2024.6.1
greenlet 3.0.3
huggingface-hub 0.24.6
humanfriendly 10.0
idna 3.8
inquirerpy 0.3.4
Jinja2 3.1.4
joblib 1.4.2
kiwisolver 1.4.5
lightning-utilities 0.11.6
Mako 1.3.5
MarkupSafe 2.1.5
matplotlib 3.9.2
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.3
neural_compressor 3.0
numpy 1.26.4
olive-ai 0.7.0 D:\windowsAI\Olive
onnx 1.16.2
onnxconverter-common 1.14.0
onnxruntime-directml 1.19.0
onnxruntime_extensions 0.12.0
onnxruntime-gpu 1.19.0
opencv-python-headless 4.10.0.84
optimum 1.21.4
optuna 3.6.1
packaging 24.1
pandas 2.2.2
pfzy 0.3.4
pillow 10.4.0
pip 24.2
prettytable 3.11.0
prompt_toolkit 3.0.47
protobuf 3.20.2
psutil 6.0.0
py-cpuinfo 9.0.0
pyarrow 17.0.0
pycocotools 2.0.8
pydantic 2.8.2
pydantic_core 2.20.1
pyparsing 3.1.4
pyreadline3 3.4.1
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.2
regex 2024.7.24
requests 2.32.3
safetensors 0.4.4
schema 0.7.7
scikit-learn 1.5.1
scipy 1.14.1
sentencepiece 0.2.0
setuptools 73.0.1
six 1.16.0
skl2onnx 1.17.0
SQLAlchemy 2.0.32
sympy 1.13.2
tabulate 0.9.0
tf2onnx 1.16.1
threadpoolctl 3.5.0
tokenizers 0.19.1
torch 2.4.0
torchaudio 2.4.0
torchmetrics 1.4.1
torchvision 0.19.0
tqdm 4.66.5
transformers 4.43.4
typing_extensions 4.12.2
tzdata 2024.1
urllib3 2.2.2
wcwidth 0.2.13
wrapt 1.16.0
xxhash 3.5.0
yarl 1.9.4
```

Expected behavior
Generate an optimized model.

Olive config
--config mistral_fp16_optimize.json

Olive logs

```
(mistral_env) D:\windowsAI\Olive\examples\mistral>python mistral.py --optimize --config mistral_fp16_optimize.json --model_id D:\windowsAI\HFModel\Mistral-7B-v01

optimized_model_dir is:D:\windowsAI\Olive\examples\mistral\models\convert-optimize-perf_tuning\mistral_fp16_gpu-cuda_model
Optimizing D:\windowsAI\HFModel\Mistral-7B-v01
[2024-08-31 17:50:42,659] [INFO] [run.py:138:run_engine] Running workflow default_workflow
[2024-08-31 17:50:42,704] [INFO] [cache.py:51:__init__] Using cache directory: D:\windowsAI\Olive\examples\mistral\cache\default_workflow
[2024-08-31 17:50:42,757] [INFO] [engine.py:1013:save_olive_config] Saved Olive config to D:\windowsAI\Olive\examples\mistral\cache\default_workflow\olive_config.json
[2024-08-31 17:50:42,846] [INFO] [accelerator_creator.py:224:create_accelerators] Running workflow on accelerator specs: gpu-cuda
[2024-08-31 17:50:42,888] [INFO] [engine.py:275:run] Running Olive on accelerator: gpu-cuda
[2024-08-31 17:50:42,888] [INFO] [engine.py:1110:_create_system] Creating target system ...
[2024-08-31 17:50:42,889] [INFO] [engine.py:1113:_create_system] Target system created in 0.000000 seconds
[2024-08-31 17:50:42,889] [INFO] [engine.py:1122:_create_system] Creating host system ...
[2024-08-31 17:50:42,891] [INFO] [engine.py:1125:_create_system] Host system created in 0.000000 seconds
passes is [('convert', {}), ('optimize', {}), ('perf_tuning', {})]
[2024-08-31 17:50:43,102] [INFO] [engine.py:877:_run_pass] Running pass convert:OptimumConversion
Framework not specified. Using pt to export the model.
[2024-08-31 17:50:54,785] [ERROR] [engine.py:976:_run_pass] Pass run failed.
Traceback (most recent call last):
  File "D:\windowsAI\Olive\olive\engine\engine.py", line 964, in _run_pass
    output_model_config = host.run_pass(p, input_model_config, output_model_path, pass_search_point)
  File "D:\windowsAI\Olive\olive\systems\local.py", line 30, in run_pass
    output_model = the_pass.run(model, output_model_path, point)
  File "D:\windowsAI\Olive\olive\passes\olive_pass.py", line 206, in run
    output_model = self._run_for_config(model, config, output_model_path)
  File "D:\windowsAI\Olive\olive\passes\onnx\optimum_conversion.py", line 96, in _run_for_config
    export_optimum_model(model.model_name_or_path, output_model_path, extra_args)
  File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\onnx\__main__.py", line 248, in main_export
    task = TasksManager.infer_task_from_model(model_name_or_path)
  File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1680, in infer_task_from_model
    task = cls._infer_task_from_model_name_or_path(model, subfolder=subfolder, revision=revision)
  File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1593, in _infer_task_from_model_name_or_path
    raise RuntimeError(
RuntimeError: Cannot infer the task from a local directory yet, please specify the task manually (masked-im, automatic-speech-recognition, fill-mask, object-detection, text2text-generation, text-to-audio, image-to-image, audio-xvector, image-segmentation, mask-generation, zero-shot-object-detection, image-to-text, semantic-segmentation, question-answering, feature-extraction, conversational, token-classification, text-classification, audio-classification, depth-estimation, sentence-similarity, zero-shot-image-classification, audio-frame-classification, multiple-choice, text-generation, image-classification, stable-diffusion, stable-diffusion-xl).
[2024-08-31 17:50:55,193] [WARNING] [engine.py:370:run_accelerator] Failed to run Olive on gpu-cuda.
Traceback (most recent call last):
  File "D:\windowsAI\Olive\olive\engine\engine.py", line 349, in run_accelerator
    output_footprint = self.run_no_search(
  File "D:\windowsAI\Olive\olive\engine\engine.py", line 441, in run_no_search
    should_prune, signal, model_ids = self._run_passes(
  File "D:\windowsAI\Olive\olive\engine\engine.py", line 814, in _run_passes
    model_config, model_id, output_model_hash = self._run_pass(
  File "D:\windowsAI\Olive\olive\engine\engine.py", line 964, in _run_pass
    output_model_config = host.run_pass(p, input_model_config, output_model_path, pass_search_point)
  File "D:\windowsAI\Olive\olive\systems\local.py", line 30, in run_pass
    output_model = the_pass.run(model, output_model_path, point)
  File "D:\windowsAI\Olive\olive\passes\olive_pass.py", line 206, in run
    output_model = self._run_for_config(model, config, output_model_path)
  File "D:\windowsAI\Olive\olive\passes\onnx\optimum_conversion.py", line 96, in _run_for_config
    export_optimum_model(model.model_name_or_path, output_model_path, extra_args)
  File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\onnx\__main__.py", line 248, in main_export
    task = TasksManager.infer_task_from_model(model_name_or_path)
  File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1680, in infer_task_from_model
    task = cls._infer_task_from_model_name_or_path(model, subfolder=subfolder, revision=revision)
  File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1593, in _infer_task_from_model_name_or_path
    raise RuntimeError(
RuntimeError: Cannot infer the task from a local directory yet, please specify the task manually (masked-im, automatic-speech-recognition, fill-mask, object-detection, text2text-generation, text-to-audio, image-to-image, audio-xvector, image-segmentation, mask-generation, zero-shot-object-detection, image-to-text, semantic-segmentation, question-answering, feature-extraction, conversational, token-classification, text-classification, audio-classification, depth-estimation, sentence-similarity, zero-shot-image-classification, audio-frame-classification, multiple-choice, text-generation, image-classification, stable-diffusion, stable-diffusion-xl).
[2024-08-31 17:50:55,199] [INFO] [engine.py:292:run] Run history for gpu-cuda:
[2024-08-31 17:50:55,347] [INFO] [engine.py:587:dump_run_history] run history:
+------------+-------------------+-------------+----------------+-----------+
| model_id   | parent_model_id   | from_pass   | duration_sec   | metrics   |
+============+===================+=============+================+===========+
| d03e43d3   |                   |             |                |           |
+------------+-------------------+-------------+----------------+-----------+
[2024-08-31 17:50:55,378] [INFO] [engine.py:307:run] No packaging config provided, skip packaging artifacts
```

Other information

Additional context
None

jambayk commented 2 months ago

Looks like the optimum export is failing on the local model.
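
The traceback points at optimum's `TasksManager`, which cannot infer the task from a local directory (it can only do so for a Hugging Face Hub model id). As a sketch of one possible workaround, assuming the `OptimumConversion` pass exposes an `extra_args` dict that it forwards to optimum's `main_export` (the call in the traceback suggests it does), the task could be set explicitly:

```
{
    "type": "OptimumConversion",
    "extra_args": { "task": "text-generation" }
}
```

Here "text-generation" is the entry from the error message's task list that matches a causal LM like Mistral-7B.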

Could you try replacing the "convert" pass config with this?

```
{
    "type": "OnnxConversion",
    "target_opset": 17,
    "torch_dtype": "float32"
}
```
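
For context, a minimal sketch of where this snippet would sit in mistral_fp16_optimize.json, assuming the config uses the convert/optimize/perf_tuning pass names that appear in the log (only the "convert" entry is shown; the existing "optimize" and "perf_tuning" entries stay as they are):

```
{
    "passes": {
        "convert": {
            "type": "OnnxConversion",
            "target_opset": 17,
            "torch_dtype": "float32"
        }
    }
}
```

OnnxConversion exports the PyTorch model through torch.onnx rather than through optimum, which should sidestep the task inference that fails for local directories; the same command from the report can then be rerun unchanged.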