microsoft / Phi-3CookBook

This is a Phi-3 cookbook for getting started with Phi-3, a family of open-source AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks.
MIT License

Fail to run Phi-3 Model with DirectML + ONNX Runtime in ARM64 #211

Open yuting1008 opened 1 month ago

yuting1008 commented 1 month ago

I am currently using a Surface Pro 11 to reproduce AIPC_Inference.md#2-use-directml--onnx-runtime-to-run-phi-3-model. I ran the commands below and the build fails at `python build.py --use_dml`. My end goal is to run my own fine-tuned model on the Snapdragon NPU.

```bash
winget install --id=Kitware.CMake -e

git clone https://github.com/microsoft/onnxruntime.git
cd .\onnxruntime\
./build.bat --build_shared_lib --skip_tests --parallel --use_dml --config Release
cd ../

git clone https://github.com/microsoft/onnxruntime-genai.git
cd .\onnxruntime-genai\
mkdir ort
cd ort
mkdir include
mkdir lib
cd ..

copy ..\onnxruntime\include\onnxruntime\core\providers\dml\dml_provider_factory.h ort\include
copy ..\onnxruntime\include\onnxruntime\core\session\onnxruntime_c_api.h ort\include
copy ..\onnxruntime\build\Windows\Release\Release\*.dll ort\lib
copy ..\onnxruntime\build\Windows\Release\Release\onnxruntime.lib ort\lib

python build.py --use_dml
```

The error message is too long to paste here in full, but I think the root cause is the following:

```
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.2032.0_x64__qbz5n2kfra8p0\libs\python312.lib : warning LNK4272: library machine type 'x64' conflicts with target machine type 'ARM64' [C:\Users\mtcsr\onnxruntime-genai\build\Windows\RelWithDebInfo\src\python\python.vcxproj]
C:\Users\mtcsr\onnxruntime-genai\build\Windows\RelWithDebInfo\src\python\RelWithDebInfo\onnxruntime_genai.cp312-win_amd64.pyd : fatal error LNK1120: 144 unresolved externals [C:\Users\mtcsr\onnxruntime-genai\build\Windows\RelWithDebInfo\src\python\python.vcxproj]
Traceback (most recent call last):
  File "C:\Users\mtcsr\onnxruntime-genai\build.py", line 650, in <module>
    build(arguments, environment)
  File "C:\Users\mtcsr\onnxruntime-genai\build.py", line 582, in build
    util.run(make_command, env=env)
  File "C:\Users\mtcsr\onnxruntime-genai\tools\python\util\run.py", line 56, in run
    completed_process = subprocess.run(
                        ^^^^^^^^^^^^^^^
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.2032.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['C:\\Program Files\\CMake\\bin\\cmake.EXE', '--build', 'C:\\Users\\mtcsr\\onnxruntime-genai\\build\\Windows\\RelWithDebInfo', '--config', 'RelWithDebInfo']' returned non-zero exit status 1.
```

Please let me know if there is any solution. Thank you!

leestott commented 1 month ago

Hi @yuting1008

It looks like you're encountering a conflict between the x64 and ARM64 architectures, or you're simply not running the command from a Developer Command Prompt for Visual Studio, where CMake is available.

This is a common issue when building across architectures.

Here are a few steps to help resolve this:

Ensure Consistent Architecture: Make sure that all your tools and dependencies are targeting the same architecture (ARM64 in this case). You might need to reinstall Python and other dependencies for ARM64.

Update Build Commands: Adjust your build commands to explicitly target ARM64; onnxruntime's build script takes an `--arm64` flag for this.

Recheck Build Commands: Ensure you are building ONNX Runtime with the correct flags for ARM64.

```bash
./build.bat --build_shared_lib --skip_tests --parallel --use_dml --config Release --arm64
```

Python Environment: Confirm that your Python installation is for ARM64. Reinstall if necessary.

```bash
winget install --id=Python.Python.3.12 --architecture arm64
```
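To verify what the active interpreter reports, a quick standard-library check:

```python
import platform
import struct

# An ARM64 build of CPython on Windows reports 'ARM64' here; an x64 build
# reports 'AMD64'.
print("Machine:", platform.machine())

# 64-bit vs. 32-bit, independent of ARM/x64.
print("Pointer size:", struct.calcsize("P") * 8, "bit")
```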

Check Dependencies: Verify that all dependencies, especially those related to DirectML and ONNX Runtime, are compatible with ARM64.

Below are full steps for setting up your environment to run the Phi-3 model on a Surface Pro 11 with a Snapdragon NPU:

Step 1: Set Up Your Development Environment

Install Required Tools:

CMake: Install using winget.

```bash
winget install --id=Kitware.CMake
```

Git: Ensure Git is installed for cloning repositories.

Step 2: Clone ONNX Runtime Repository

Clone the ONNX Runtime Repository:

```bash
git clone https://github.com/microsoft/onnxruntime.git
cd onnxruntime
```

Step 3: Build ONNX Runtime with DirectML

Build the ONNX Runtime: Open a Developer Command Prompt for Visual Studio that targets ARM64 (make sure you use the Developer Prompt, not a plain Command Prompt).

Run the build script:

```bash
./build.bat --build_shared_lib --skip_tests --parallel --use_dml --config Release --arm64
```

Step 4: Set Up ONNX Runtime GenAI

Clone the ONNX Runtime GenAI Repository:

```bash
cd ..
git clone https://github.com/microsoft/onnxruntime-genai.git
cd onnxruntime-genai
mkdir ort
cd ort
mkdir include
mkdir lib
```

Copy Necessary Files:
Copy headers and libraries from the ONNX Runtime build into the ort directory, running from the onnxruntime-genai root:

```bash
cd ..
copy ..\onnxruntime\include\onnxruntime\core\providers\dml\dml_provider_factory.h ort\include
copy ..\onnxruntime\include\onnxruntime\core\session\onnxruntime_c_api.h ort\include
copy ..\onnxruntime\build\Windows\Release\Release\*.dll ort\lib
copy ..\onnxruntime\build\Windows\Release\Release\onnxruntime.lib ort\lib
```

Step 5: Build ONNX Runtime GenAI

Build the Project:

Ensure you are still in the Developer Command Prompt for Visual Studio and in the onnxruntime-genai directory, then run:

```bash
python build.py --use_dml
```

Troubleshooting Tips:

Check for Architecture Conflicts: Ensure all tools and dependencies are targeting ARM64. You might need to re-install Python for ARM64 if it’s x64.

Environment Variables: Ensure environment variables like CUDA_HOME are set correctly (if applicable).
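If it helps, those variables can be sanity-checked from Python before kicking off the build; this is just a convenience snippet (CUDA_HOME is only relevant for CUDA builds, not DirectML):

```python
import os

# Show whether each variable is set; "<not set>" is expected for CUDA_HOME
# on a DirectML/ARM64 build like this one.
for var in ("CUDA_HOME", "PATH"):
    print(f"{var} = {os.environ.get(var, '<not set>')}")
```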

Step 6: Run Your Fine-Tuned Model

Prepare Your Model:

Ensure your fine-tuned model is in ONNX format.
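If the model still needs converting, the onnxruntime-genai repo includes a model builder that exports a Hugging Face checkpoint to an ONNX model for a chosen execution provider. A minimal sketch; the paths are placeholders, and the module name and flags should be verified against your installed version (`python -m onnxruntime_genai.models.builder --help`):

```python
import subprocess
import sys

# Export a local fine-tuned checkpoint to an int4 ONNX model for DirectML.
# -i/-o paths are placeholders; -p is precision, -e the execution provider.
subprocess.run(
    [
        sys.executable, "-m", "onnxruntime_genai.models.builder",
        "-i", r"C:\models\phi3-finetuned",       # fine-tuned HF checkpoint (placeholder)
        "-o", r"C:\models\phi3-finetuned-onnx",  # output ONNX model folder (placeholder)
        "-p", "int4",
        "-e", "dml",
    ],
    check=True,
)
```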

Run Inference: Use the provided scripts to run inference with your model on the Snapdragon NPU.
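As a starting point, here is a minimal generation loop using the onnxruntime-genai Python API, modeled on the AIPC_Inference.md sample; the model path is a placeholder, and method names can differ slightly between onnxruntime-genai versions:

```python
import onnxruntime_genai as og

# Load the DirectML ONNX model produced in the previous step (placeholder path).
model = og.Model(r"C:\models\phi3-finetuned-onnx")
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

# Phi-3 chat template: user turn followed by the assistant tag.
prompt = "<|user|>\nWhat is the capital of France?<|end|>\n<|assistant|>"
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)
params.input_ids = input_tokens

# Token-by-token generation loop, streaming tokens as they are produced.
generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    print(tokenizer_stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```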

Testing and Validation

Run Tests: Ensure your setup works correctly by running test scripts and validating outputs.

Optimize: Tune your model and setup for performance.

wmjjmwwmj commented 1 week ago

Thanks, it helps a lot!