mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Bug] mlc_llm package runs into FileNotFoundError: [WinError 2] The system cannot find the file specified #2828

Closed DK013 closed 1 month ago

DK013 commented 2 months ago

🐛 Bug

While trying to package models for Android with this config:

{
    "device": "android",
    "model_list": [
        {
            "model": "HF://mlc-ai/Phi-3-mini-4k-instruct-q4f16_1-MLC",
            "estimated_vram_bytes": 4250586449,
            "model_id": "Phi-3-mini-4k-instruct-q4f16_1-MLC"
        },
        {
            "model": "HF://mlc-ai/Qwen2-1.5B-Instruct-q4f16_1-MLC",
            "estimated_vram_bytes": 3980990464,
            "model_id": "Qwen2-1.5B-Instruct-q4f16_1-MLC"
        }
    ]
}

The model sources are already in MLC format, so no conversion is needed. When I run mlc_llm package, the models are downloaded and compiled, lib.tar is generated, and the config dump is created. The error appears after that. Here are the logs:

[2024-08-21 09:49:39] INFO package.py:154: Dump the app config below to "dist\bundle\mlc-app-config.json":
{
  "model_list": [
    {
      "model_id": "Phi-3-mini-4k-instruct-q4f16_1-MLC",
      "model_lib": "phi3_q4f16_1_5a9dfbccbb0147e0e063927839645159",
      "model_url": "https://huggingface.co/mlc-ai/Phi-3-mini-4k-instruct-q4f16_1-MLC",
      "estimated_vram_bytes": 4250586449
    },
    {
      "model_id": "Qwen2-1.5B-Instruct-q4f16_1-MLC",
      "model_lib": "qwen2_q4f16_1_2e221f430380225c03990ad24c3d030e",
      "model_url": "https://huggingface.co/mlc-ai/Qwen2-1.5B-Instruct-q4f16_1-MLC",
      "estimated_vram_bytes": 3980990464
    }
  ]
}
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\ProgramData\miniconda3\Scripts\mlc_llm.exe\__main__.py", line 7, in <module>
  File "C:\ProgramData\miniconda3\Lib\site-packages\mlc_llm\__main__.py", line 53, in main
    cli.main(sys.argv[2:])
  File "C:\ProgramData\miniconda3\Lib\site-packages\mlc_llm\cli\package.py", line 64, in main
    package(
  File "C:\ProgramData\miniconda3\Lib\site-packages\mlc_llm\interface\package.py", line 355, in package
    validate_model_lib(
  File "C:\ProgramData\miniconda3\Lib\site-packages\mlc_llm\interface\package.py", line 209, in validate_model_lib
    cc.create_staticlib(lib_path, tar_list)
  File "C:\ProgramData\miniconda3\Lib\site-packages\tvm\contrib\ndk.py", line 108, in create_staticlib
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\ProgramData\miniconda3\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified

Environment

MasterJH5574 commented 2 months ago

From reading the source code (https://github.com/apache/tvm/blob/541f9c280c567b63630229bc03855d43fc6811af/python/tvm/contrib/ndk.py#L101-L106), it looks like the llvm-ar file in the same directory as $TVM_NDK_CC cannot be found on your side. Could you check whether $TVM_NDK_CC is set as described in the documentation and whether llvm-ar exists there?

As a reference example, on my side

>>> echo $TVM_NDK_CC
/Users/masterjh5574/Library/Android/sdk/ndk/27.0.11718014/toolchains/llvm/prebuilt/darwin-x86_64/bin/aarch64-linux-android24-clang

>>> /Users/masterjh5574/Library/Android/sdk/ndk/27.0.11718014/toolchains/llvm/prebuilt/darwin-x86_64/bin/llvm-ar --version
LLVM (http://llvm.org/):
  LLVM version 18.0.1
  Optimized build.
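
The same check can be scripted. Below is a minimal Python sketch, assuming only what is described above: llvm-ar is expected in the same directory as the compiler that TVM_NDK_CC points to. The helper name find_llvm_ar is hypothetical and is not part of TVM or MLC LLM.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def find_llvm_ar(ndk_cc: Optional[str]) -> Optional[str]:
    """Return the path of llvm-ar located next to the NDK compiler, or None.

    Hypothetical helper for illustration; it only mirrors the manual check
    described in this thread.
    """
    if not ndk_cc:
        return None
    bin_dir = Path(ndk_cc).parent
    # shutil.which also tries the PATHEXT suffixes (such as .exe) on Windows.
    return shutil.which("llvm-ar", path=str(bin_dir))


if __name__ == "__main__":
    ndk_cc = os.environ.get("TVM_NDK_CC")
    print("TVM_NDK_CC =", ndk_cc)
    print("llvm-ar    =", find_llvm_ar(ndk_cc) or "not found")
```

If this reports "not found", either TVM_NDK_CC is unset or malformed, or the NDK toolchain directory does not contain llvm-ar.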

DK013 commented 2 months ago

Thanks for pointing this out. My TVM_NDK_CC had a wrong value, something like TVM_NDK_CC: <path>, which I hadn't noticed. Fixing that got me past the earlier failure, but there's a new error.

[2024-08-25 08:32:41] INFO package.py:211: Creating lib from ['C:\\Users\\DK\\AppData\\Local\\mlc_llm\\model_lib\\f10bb3a7c5ffc894d40d946fe0528867.tar', 'C:\\Users\\DK\\AppData\\Local\\mlc_llm\\model_lib\\32b77923e5ce45a6001ac73424864101.tar']
[2024-08-25 08:32:41] INFO package.py:212: Validating the library dist\lib\libmodel_android.a
[2024-08-25 08:32:41] INFO package.py:213: List of available model libs packaged: ['phi3_q4f16_1_5a9dfbccbb0147e0e063927839645159', 'qwen2_q4f16_1_2e221f430380225c03990ad24c3d030e'], if we have '-' in the model_lib string, it will be turned into '_'
[2024-08-25 08:32:41] INFO package.py:256: Validation pass
[2024-08-25 08:32:41] INFO package.py:270: Moving "dist\lib\libmodel_android.a" to "build\lib\libmodel_android.a"
[2024-08-25 08:32:41] INFO package.py:274: Building mlc4j
info: downloading component 'rust-std' for 'aarch64-linux-android'
info: installing component 'rust-std' for 'aarch64-linux-android'
 21.4 MiB /  21.4 MiB (100 %)  18.4 MiB/s in  1s ETA:  0s
[2024-08-25 08:32:54] INFO prepare_libs.py:91: Entering "D:\Docs\Personal\Projects\Python\Projects\mlc-llm\android\MLCChat\build" for MLC LLM and tvm4j build.
[2024-08-25 08:32:54] INFO prepare_libs.py:95: Set TVM_SOURCE_DIR to "C:\ProgramData\miniconda3\Lib\site-packages\tvm\"
[2024-08-25 08:32:54] INFO prepare_libs.py:23: Running cmake
[2024-08-25 08:32:54] INFO prepare_libs.py:49: Using ninja in windows, make sure you installed ninja in conda
-- The C compiler identification is Clang 18.0.1
-- The CXX compiler identification is Clang 18.0.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Users/DK/AppData/Local/Android/Sdk/ndk/27.0.12077973/toolchains/llvm/prebuilt/windows-x86_64/bin/clang.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Users/DK/AppData/Local/Android/Sdk/ndk/27.0.12077973/toolchains/llvm/prebuilt/windows-x86_64/bin/clang++.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Hide private symbols
-- TVM_SOURCE_DIR: C:\ProgramData\miniconda3\Lib\site-packages\tvm\
CMake Error at D:/Docs/Personal/Projects/Python/Projects/mlc-llm/CMakeLists.txt:65 (add_subdirectory):
  The source directory

    C:/ProgramData/miniconda3/Lib/site-packages/tvm

  does not contain a CMakeLists.txt file.

-- system-nameAndroid
CMake Deprecation Warning at D:/Docs/Personal/Projects/Python/Projects/mlc-llm/3rdparty/tokenizers-cpp/msgpack/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.

-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
CMake Deprecation Warning at D:/Docs/Personal/Projects/Python/Projects/mlc-llm/3rdparty/tokenizers-cpp/sentencepiece/CMakeLists.txt:15 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.

-- VERSION: 0.2.00
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
CMake Error at D:/Docs/Personal/Projects/Python/Projects/mlc-llm/CMakeLists.txt:72 (tvm_file_glob):
  Unknown CMake command "tvm_file_glob".

-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "D:\Docs\Personal\Projects\Python\Projects\mlc-llm\android\mlc4j\prepare_libs.py", line 120, in <module>
    main(parsed.mlc_llm_source_dir)
  File "D:\Docs\Personal\Projects\Python\Projects\mlc-llm\android\mlc4j\prepare_libs.py", line 102, in main
    run_cmake(mlc_llm_source_dir / "android" / "mlc4j")
  File "D:\Docs\Personal\Projects\Python\Projects\mlc-llm\android\mlc4j\prepare_libs.py", line 51, in run_cmake
    subprocess.run(cmd, check=True, env=os.environ)
  File "C:\ProgramData\miniconda3\Lib\subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['cmake', 'D:\\Docs\\Personal\\Projects\\Python\\Projects\\mlc-llm\\android\\mlc4j', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_TOOLCHAIN_FILE=C:\\Users\\DK\\AppData\\Local\\Android\\Sdk\\ndk\\27.0.12077973\\build\\cmake\\android.toolchain.cmake', '-DCMAKE_INSTALL_PREFIX=.', '-DCMAKE_CXX_FLAGS="-O3"', '-DANDROID_ABI=arm64-v8a', '-DANDROID_NATIVE_API_LEVEL=android-24', '-DANDROID_PLATFORM=android-24', '-DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=ON', '-DANDROID_STL=c++_static', '-DUSE_HEXAGON_SDK=OFF', '-DMLC_LLM_INSTALL_STATIC_LIB=ON', '-DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON', '-DUSE_OPENCL=ON', '-DUSE_OPENCL_ENABLE_HOST_PTR=ON', '-DUSE_CUSTOM_LOGGING=ON', '-G', 'Ninja']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\ProgramData\miniconda3\Scripts\mlc_llm.exe\__main__.py", line 7, in <module>
  File "C:\ProgramData\miniconda3\Lib\site-packages\mlc_llm\__main__.py", line 53, in main
    cli.main(sys.argv[2:])
  File "C:\ProgramData\miniconda3\Lib\site-packages\mlc_llm\cli\package.py", line 64, in main
    package(
  File "C:\ProgramData\miniconda3\Lib\site-packages\mlc_llm\interface\package.py", line 361, in package
    build_android_binding(mlc_llm_source_dir, output)
  File "C:\ProgramData\miniconda3\Lib\site-packages\mlc_llm\interface\package.py", line 275, in build_android_binding
    subprocess.run([sys.executable, mlc4j_path / "prepare_libs.py"], check=True, env=os.environ)
  File "C:\ProgramData\miniconda3\Lib\subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['C:\\ProgramData\\miniconda3\\python.exe', WindowsPath('D:/Docs/Personal/Projects/Python/Projects/mlc-llm/android/mlc4j/prepare_libs.py')]' returned non-zero exit status 1.

It seems there's a CMake configuration issue. It also keeps giving me deprecation warnings about compatibility with CMake versions below 3.5, even though I have 3.20.2. I can't wrap my head around this. Can you help, please?

Thanks

MasterJH5574 commented 2 months ago

@DK013 Based on the log you shared,

-- TVM_SOURCE_DIR: C:\ProgramData\miniconda3\Lib\site-packages\tvm\
CMake Error at D:/Docs/Personal/Projects/Python/Projects/mlc-llm/CMakeLists.txt:65 (add_subdirectory):
  The source directory

    C:/ProgramData/miniconda3/Lib/site-packages/tvm

  does not contain a CMakeLists.txt file.

my guess is that the TVM_SOURCE_DIR environment variable is set incorrectly. It needs to point to the TVM source tree (e.g., D:/Docs/Personal/Projects/Python/Projects/mlc-llm/3rdparty/tvm) rather than the location of the installed tvm Python package. Could you correct this environment variable and try again?
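
A quick way to check this before rerunning the build is sketched below in Python. It assumes only what the CMake error above states: the directory passed to add_subdirectory must contain a CMakeLists.txt file. The helper name has_cmake_lists is hypothetical.

```python
import os
from pathlib import Path


def has_cmake_lists(source_dir: str) -> bool:
    """Return True if source_dir contains a CMakeLists.txt file.

    Hypothetical helper for illustration; an installed Python package
    directory such as site-packages/tvm would normally fail this check.
    """
    if not source_dir:
        return False
    return (Path(source_dir) / "CMakeLists.txt").is_file()


if __name__ == "__main__":
    tvm_source_dir = os.environ.get("TVM_SOURCE_DIR", "")
    print("TVM_SOURCE_DIR =", tvm_source_dir or "(unset)")
    print("contains CMakeLists.txt:", has_cmake_lists(tvm_source_dir))
```

If the second line prints False, the variable is pointing somewhere other than a CMake source tree.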

MasterJH5574 commented 1 month ago

Closing due to inactivity for now. You are more than welcome to open a new issue if the problem persists.