mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Bug] Check failed: (it != type_key2index_.end()) is false: Cannot find type ObjectPath. Did you forget to register the node by TVM_REGISTER_NODE_TYPE #2602

Closed · raj-khare closed this issue 3 months ago

raj-khare commented 4 months ago

🐛 Bug

I'm trying to build MLC from source and then package it with PyInstaller. However, I'm getting the following error when I run the packaged binary:

./test
[104914] PyInstaller Bootloader 6.x
[104914] LOADER: executable file: /root/test/dist/test/test
[104914] LOADER: trying to load executable-embedded archive...
[104914] LOADER: attempting to open archive /root/test/dist/test/test
[104914] LOADER: cookie found at offset 0x1D2B4F5
[104914] LOADER: archive file: /root/test/dist/test/test
[104914] LOADER: application has onedir semantics...
[104914] LOADER: POSIX onedir process needs to set library search path and restart itself.
[104914] LOADER: setting LD_LIBRARY_PATH=/root/test/dist/test/_internal
[104914] PyInstaller Bootloader 6.x
[104914] LOADER: executable file: /root/test/dist/test/test
[104914] LOADER: trying to load executable-embedded archive...
[104914] LOADER: attempting to open archive /root/test/dist/test/test
[104914] LOADER: cookie found at offset 0x1D2B4F5
[104914] LOADER: archive file: /root/test/dist/test/test
[104914] LOADER: application has onedir semantics...
[104914] LOADER: POSIX onedir process has already restarted itself.
[104914] LOADER: application's top-level directory: /root/test/dist/test/_internal
[104914] LOADER: looking for splash screen resources...
[104914] LOADER: splash screen resources not found.
[104914] LOADER: loading Python shared library: /root/test/dist/test/_internal/libpython3.10.so.1.0
[104914] LOADER: loaded functions from Python shared library.
[104914] LOADER: pre-initializing embedded python interpreter...
[104914] LOADER: creating PyConfig structure...
[104914] LOADER: initializing interpreter configuration...
[104914] LOADER: setting program name...
[104914] LOADER: setting python home path...
[104914] LOADER: setting module search paths...
[104914] LOADER: setting sys.argv...
[104914] LOADER: applying run-time options...
[104914] LOADER: starting embedded python interpreter...
[104914] LOADER: setting sys._MEIPASS
[104914] LOADER: importing modules from PKG/CArchive
[104914] LOADER: extracted struct
[104914] LOADER: running unmarshalled code object for module struct...
[104914] LOADER: extracted pyimod01_archive
[104914] LOADER: running unmarshalled code object for module pyimod01_archive...
[104914] LOADER: extracted pyimod02_importers
[104914] LOADER: running unmarshalled code object for module pyimod02_importers...
[104914] LOADER: extracted pyimod03_ctypes
[104914] LOADER: running unmarshalled code object for module pyimod03_ctypes...
[104914] LOADER: installing PYZ archive with Python modules.
[104914] LOADER: PYZ archive: PYZ-00.pyz
[104914] LOADER: running pyiboot01_bootstrap.py
[104914] LOADER: running pyi_rth_inspect.py
[104914] LOADER: running pyi_rth_pkgutil.py
[104914] LOADER: running pyi_rth_multiprocessing.py
[104914] LOADER: running pyi_rth_pkgres.py
[104914] LOADER: running pyi_rth_setuptools.py
[104914] LOADER: running test.py
********* RUNNING ************
Traceback (most recent call last):
  File "test.py", line 2, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller/loader/pyimod02_importers.py", line 419, in exec_module
  File "tvm/__init__.py", line 33, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller/loader/pyimod02_importers.py", line 419, in exec_module
  File "tvm/runtime/__init__.py", line 22, in <module>
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "PyInstaller/loader/pyimod02_importers.py", line 419, in exec_module
  File "tvm/runtime/object_path.py", line 44, in <module>
  File "tvm/_ffi/registry.py", line 69, in register
  File "tvm/_ffi/base.py", line 496, in check_call
  File "tvm/_ffi/base.py", line 481, in raise_last_ffi_error
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (3) /root/test/dist/test/_internal/libtvm.so(TVMObjectTypeKey2Index+0x5c) [0x7ff2cfcdd8ac]
  [bt] (2) /root/test/dist/test/_internal/libtvm.so(tvm::runtime::Object::TypeKey2Index(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x220) [0x7ff2cfcdd820]
  [bt] (1) /root/test/dist/test/_internal/libtvm.so(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x3d) [0x7ff2cfc4a64d]
  [bt] (0) /root/test/dist/test/_internal/libtvm.so(tvm::runtime::Backtrace[abi:cxx11]()+0x2c) [0x7ff2cfcba2ac]
  File "/root/mlc-llm/3rdparty/tvm/src/runtime/object.cc", line 165
InternalError: Check failed: (it != type_key2index_.end()) is false: Cannot find type ObjectPath. Did you forget to register the node by TVM_REGISTER_NODE_TYPE ?
[104914] Failed to execute script 'test' due to unhandled exception!
[104914] LOADER: ERROR.
[104914] LOADER: manually flushing stdout and stderr...
[104914] LOADER: cleaning up Python interpreter...
[104914] LOADER: unloading Python shared library...

To Reproduce

Steps to reproduce the behavior:

  1. Create a fresh conda env
  2. pip install --pre -U -f https://mlc.ai/wheels mlc-ai-nightly-cu121 (to install TVM)
  3. git clone --recursive https://github.com/mlc-ai/mlc-llm.git && cd mlc-llm/
  4. mkdir -p build && cd build
  5. python ../cmake/gen_cmake_config.py
  6. cmake .. && cmake --build . --parallel $(nproc) && cd ..
  7. cd python && python setup.py bdist_wheel

config.cmake:

set(TVM_SOURCE_DIR 3rdparty/tvm)
set(CMAKE_BUILD_TYPE RelWithDebInfo)
set(USE_CUDA ON)
set(USE_CUTLASS ON)
set(USE_CUBLAS ON)
set(USE_ROCM OFF)
set(USE_VULKAN OFF)
set(USE_METAL OFF)
set(USE_OPENCL OFF)
set(USE_OPENCL_ENABLE_HOST_PTR OFF)
set(USE_THRUST ON)
set(USE_FLASHINFER ON)
set(FLASHINFER_ENABLE_FP8 OFF)
set(FLASHINFER_ENABLE_BF16 OFF)
set(FLASHINFER_GEN_GROUP_SIZES 1 4 6 8)
set(FLASHINFER_GEN_PAGE_SIZES 16)
set(FLASHINFER_GEN_HEAD_DIMS 128)
set(FLASHINFER_GEN_KV_LAYOUTS 0 1)
set(FLASHINFER_GEN_POS_ENCODING_MODES 0 1)
set(FLASHINFER_GEN_ALLOW_FP16_QK_REDUCTIONS "false")
set(FLASHINFER_GEN_CASUALS "false" "true")
set(FLASHINFER_CUDA_ARCHITECTURES 89)
set(CMAKE_CUDA_ARCHITECTURES 89)

Now, I take the wheel and install it in a fresh conda env (on the same machine).

Then I run pyinstaller test.spec --clean --noconfirm

test.py:

print("********* RUNNING ************")
import tvm
import os
print(os.path.dirname(tvm.__file__))

test.spec:
# -*- mode: python ; coding: utf-8 -*-

from PyInstaller.utils.hooks import collect_all

datas = []
binaries = []
hiddenimports = []

for pkg in ['mlc_llm', 'tvm', 'aiosqlite']:
    d,b,h = collect_all(pkg)
    datas += d
    binaries += b
    hiddenimports += h

a = Analysis(
    ['test.py'],
    pathex=[],
    binaries=binaries,
    datas=datas,
    hiddenimports=hiddenimports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
    optimize=0,
)
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    [],
    exclude_binaries=True,
    name='test',
    debug=True,
    bootloader_ignore_signals=False,
    strip=False,
    upx=False,
    console=True,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
coll = COLLECT(
    exe,
    a.binaries,
    a.datas,
    strip=False,
    upx=False,
    upx_exclude=[],
    name='test',
)

My PyInstaller output directory looks like this:

tree -L 2
.
├── _internal
│   ├── attrs-23.2.0.dist-info
│   ├── base_library.zip
│   ├── certifi
│   ├── charset_normalizer
│   ├── email_validator-2.2.0.dist-info
│   ├── httptools
│   ├── lib-dynload
│   ├── libSPIRV-Tools-shared-1d0d2694.so -> mlc_ai_nightly_cu121.libs/libSPIRV-Tools-shared-1d0d2694.so
│   ├── libbz2.so.1.0
│   ├── libc10.so -> torch/lib/libc10.so
│   ├── libc10_cuda.so -> torch/lib/libc10_cuda.so
│   ├── libcrypto.so.3
│   ├── libcublas.so.12 -> nvidia/cublas/lib/libcublas.so.12
│   ├── libcublasLt.so.12 -> nvidia/cublas/lib/libcublasLt.so.12
│   ├── libcudart.so.12 -> nvidia/cuda_runtime/lib/libcudart.so.12
│   ├── libcudnn.so.8 -> nvidia/cudnn/lib/libcudnn.so.8
│   ├── libcudnn_adv_infer.so.8 -> nvidia/cudnn/lib/libcudnn_adv_infer.so.8
│   ├── libcudnn_cnn_infer.so.8 -> nvidia/cudnn/lib/libcudnn_cnn_infer.so.8
│   ├── libcudnn_ops_infer.so.8 -> nvidia/cudnn/lib/libcudnn_ops_infer.so.8
│   ├── libcudnn_ops_train.so.8 -> nvidia/cudnn/lib/libcudnn_ops_train.so.8
│   ├── libcufft.so.11 -> nvidia/cufft/lib/libcufft.so.11
│   ├── libcupti.so.12 -> nvidia/cuda_cupti/lib/libcupti.so.12
│   ├── libcurand.so.10 -> nvidia/curand/lib/libcurand.so.10
│   ├── libcusolver.so.11 -> nvidia/cusolver/lib/libcusolver.so.11
│   ├── libcusparse.so.12 -> nvidia/cusparse/lib/libcusparse.so.12
│   ├── libcusparseLt-f80c68d1.so.0 -> torch/lib/libcusparseLt-f80c68d1.so.0
│   ├── libffi.so.8
│   ├── libflash_attn-fea7792c.so -> mlc_ai_nightly_cu121.libs/libflash_attn-fea7792c.so
│   ├── libflash_attn.so
│   ├── libfpA_intB_gemm-da16d324.so -> mlc_ai_nightly_cu121.libs/libfpA_intB_gemm-da16d324.so
│   ├── libfpA_intB_gemm.so
│   ├── libgcc_s.so.1
│   ├── libgfortran-040039e1-0352e75f.so.5.0.0 -> numpy.libs/libgfortran-040039e1-0352e75f.so.5.0.0
│   ├── libgfortran-040039e1.so.5.0.0 -> scipy.libs/libgfortran-040039e1.so.5.0.0
│   ├── libgomp-a34b3233.so.1 -> torch/lib/libgomp-a34b3233.so.1
│   ├── liblzma.so.5
│   ├── libnccl.so.2 -> nvidia/nccl/lib/libnccl.so.2
│   ├── libncursesw.so.6
│   ├── libnvJitLink.so.12 -> nvidia/nvjitlink/lib/libnvJitLink.so.12
│   ├── libnvToolsExt.so.1 -> nvidia/nvtx/lib/libnvToolsExt.so.1
│   ├── libnvrtc.so.12 -> nvidia/cuda_nvrtc/lib/libnvrtc.so.12
│   ├── libpython3.10.so.1.0
│   ├── libquadmath-96973f99-934c22de.so.0.0.0 -> numpy.libs/libquadmath-96973f99-934c22de.so.0.0.0
│   ├── libquadmath-96973f99.so.0.0.0 -> scipy.libs/libquadmath-96973f99.so.0.0.0
│   ├── libreadline.so.8
│   ├── libscipy_openblas-c128ec02.so -> scipy.libs/libscipy_openblas-c128ec02.so
│   ├── libscipy_openblas64_-99b71e71.so -> numpy.libs/libscipy_openblas64_-99b71e71.so
│   ├── libsf_error_state.so -> scipy/special/libsf_error_state.so
│   ├── libshm.so -> torch/lib/libshm.so
│   ├── libsqlite3.so.0
│   ├── libssl.so.3
│   ├── libstdc++.so.6
│   ├── libtinfo.so.6
│   ├── libtinfow.so.6
│   ├── libtorch.so -> torch/lib/libtorch.so
│   ├── libtorch_cpu.so -> torch/lib/libtorch_cpu.so
│   ├── libtorch_cuda.so -> torch/lib/libtorch_cuda.so
│   ├── libtorch_python.so -> torch/lib/libtorch_python.so
│   ├── libtvm.so
│   ├── libtvm_runtime.so
│   ├── libuuid.so.1
│   ├── libvulkan-947940a9.so.1.3.236 -> mlc_ai_nightly_cu121.libs/libvulkan-947940a9.so.1.3.236
│   ├── libz.so.1
│   ├── markupsafe
│   ├── ml_dtypes
│   ├── mlc_ai_nightly_cu121-0.15.dev404.dist-info
│   ├── mlc_ai_nightly_cu121.libs
│   ├── mlc_llm
│   ├── mlc_llm-0.1.dev1404+g437166a4.dist-info
│   ├── numpy
│   ├── numpy.libs
│   ├── nvidia
│   ├── orjson
│   ├── psutil
│   ├── pydantic_core
│   ├── regex
│   ├── safetensors
│   ├── scipy
│   ├── scipy.libs
│   ├── tiktoken
│   ├── torch
│   ├── tornado
│   ├── triton
│   ├── tvm
│   ├── ujson.cpython-310-x86_64-linux-gnu.so
│   ├── uvloop
│   ├── watchfiles
│   ├── websockets
│   ├── websockets-12.0.dist-info
│   ├── wheel-0.43.0.dist-info
│   └── yaml
└── test

34 directories, 59 files
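For what it's worth, the tree above shows libtvm.so at the top level of _internal next to the collected mlc_ai_nightly_cu121.libs and tvm directories. A small diagnostic (a sketch of my own, not part of the original report; the _internal directory name is taken from the tree above) can enumerate every bundled copy of libtvm, since two different builds of the library in one bundle could each carry their own node-type registry:

```python
import os

def find_copies(root, name_fragment):
    """Walk `root` and return the sorted paths of all files whose name
    contains `name_fragment` (e.g. every bundled copy of libtvm)."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fn in filenames:
            if name_fragment in fn:
                hits.append(os.path.join(dirpath, fn))
    return sorted(hits)

# Run from dist/test; "_internal" is the directory shown in the tree above.
for path in find_copies("_internal", "libtvm"):
    print(path)
```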

Expected behavior

TVM should import without any problem.

Environment

Any advice to resolve this would be really helpful :)

tqchen commented 3 months ago

Please upgrade to the latest mlc-llm and mlc-ai packages per the installation instructions.