[Bug] AttributeError: function 'TVMGetLastPythonError' not found. Did you mean: 'TVMAPISetLastPythonError'?

David-Sharma commented 1 year ago

🐛 Bug

(base) C:\Users\dmsha\dev\mlc>python -m mlc_llm.build --model Llama-2-7b-chat-hf --target vulkan --quantization q4f16_1 --llvm-mingw path/to/llvm-mingw

Compiling models under Windows 11 had not been an issue for me until now. This started about 1 week ago and I have not been able to resolve it. A reload has not cleared the problem.

To Reproduce

Steps to reproduce the behavior:

1. (base) C:\Users\dmsha\dev\mlc>python -m mlc_llm.build --model Llama-2-7b-chat-hf --target vulkan --quantization q4f16_1 --llvm-mingw path/to/llvm-mingw

Using path "dist\models\Llama-2-7b-chat-hf" for model "Llama-2-7b-chat-hf"
Target configured: vulkan -keys=vulkan,gpu -max_num_threads=256 -max_shared_memory_per_block=32768 -max_threads_per_block=256 -supports_16bit_buffer=1 -supports_8bit_buffer=1 -supports_float16=1 -supports_float32=1 -supports_int16=1 -supports_int32=1 -supports_int8=1 -supports_storage_buffer_storage_class=1 -thread_warp_size=1

The following 2 lines repeat many times:
[14:37:25] D:\a\package\package\tvm\src\node\reflection.cc:109: AttributeError: relax.expr.Var object has no attributed shard_dim
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

[14:37:25] D:\a\package\package\tvm\src\node\reflection.cc:109: AttributeError: relax.expr.Var object has no attributed shard_strategy
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

[14:37:27] D:\a\package\package\tvm\src\relax\ir\expr.cc:174: Check failed: index < tuple_info->fields.size() (197 vs. 197) : Index out of bounds: Tuple params is of size 197, and cannot be accessed with index 197
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\mlc_llm\build.py", line 46, in <module>
    main()
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\mlc_llm\build.py", line 42, in main
    core.build_model_from_args(parsed_args)
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\mlc_llm\core.py", line 648, in build_model_from_args
    new_params = utils.convert_weights(param_manager, params, args)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\mlc_llm\utils.py", line 229, in convert_weights
    mod_transform = relax.transform.LazyTransformParams()(mod_transform)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\ir\transform.py", line 238, in __call__
    return _ffi_transform_api.RunPass(self, mod)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\_ffi\_ctypes\packed_func.py", line 239, in __call__
    raise_last_ffi_error()
  File "C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\_ffi\base.py", line 415, in raise_last_ffi_error
    _LIB.TVMGetLastPythonError.restype = ctypes.c_void_p
    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\ctypes\__init__.py", line 389, in __getattr__
    func = self.__getitem__(name)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dmsha\miniconda3\Lib\ctypes\__init__.py", line 394, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: function 'TVMGetLastPythonError' not found. Did you mean: 'TVMAPISetLastPythonError'?

As before, I expect a folder containing the weights and library files

Environment

Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): Vulkan
Operating system (e.g. Ubuntu/Windows/MacOS/...): Windows 11
Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...) PC RTX3060
How you installed MLC-LLM (conda, source): Conda
How you installed TVM-Unity (pip, source): pip
Python version (e.g. 3.10): V3.11.4
GPU driver version (if applicable):
CUDA/cuDNN version (if applicable):
TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):

USE_NVTX: OFF
USE_GTEST: AUTO
SUMMARIZE: OFF
USE_IOS_RPC: OFF
USE_MSC: OFF
USE_ETHOSU:
CUDA_VERSION: NOT-FOUND
USE_LIBBACKTRACE: AUTO
DLPACK_PATH: 3rdparty/dlpack/include
USE_TENSORRT_CODEGEN: OFF
USE_THRUST: OFF
USE_TARGET_ONNX: OFF
USE_AOT_EXECUTOR: ON
BUILD_DUMMY_LIBTVM: OFF
USE_CUDNN: OFF
USE_TENSORRT_RUNTIME: OFF
USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR: OFF
USE_CCACHE: AUTO
USE_ARM_COMPUTE_LIB: OFF
USE_CPP_RTVM:
USE_OPENCL_GTEST: /path/to/opencl/gtest
USE_MKL: OFF
USE_PT_TVMDSOOP: OFF
MLIR_VERSION: NOT-FOUND
USE_CLML: OFF
USE_STACKVM_RUNTIME: OFF
USE_GRAPH_EXECUTOR_CUDA_GRAPH: OFF
ROCM_PATH: /opt/rocm
USE_DNNL: OFF
USE_VITIS_AI: OFF
USE_MLIR: OFF
USE_RCCL: OFF
USE_LLVM: llvm-config --link-static
USE_VERILATOR: OFF
USE_TF_TVMDSOOP: OFF
USE_THREADS: ON
USE_MSVC_MT: OFF
BACKTRACE_ON_SEGFAULT: OFF
USE_GRAPH_EXECUTOR: ON
USE_NCCL: OFF
USE_ROCBLAS: OFF
GIT_COMMIT_HASH: 30b4fa3c13fc80d5c9151a9dc445d22c57ced3e0
USE_VULKAN: ON
USE_RUST_EXT: OFF
USE_CUTLASS: OFF
USE_CPP_RPC: OFF
USE_HEXAGON: OFF
USE_CUSTOM_LOGGING: OFF
USE_UMA: OFF
USE_FALLBACK_STL_MAP: OFF
USE_SORT: ON
USE_RTTI: ON
GIT_COMMIT_TIME: 2023-10-17 21:33:54 -0700
USE_HEXAGON_SDK: /path/to/sdk
USE_BLAS: none
USE_ETHOSN: OFF
USE_LIBTORCH: OFF
USE_RANDOM: ON
USE_CUDA: OFF
USE_COREML: OFF
USE_AMX: OFF
BUILD_STATIC_RUNTIME: OFF
USE_CMSISNN: OFF
USE_KHRONOS_SPIRV: OFF
USE_CLML_GRAPH_EXECUTOR: OFF
USE_TFLITE: OFF
USE_HEXAGON_GTEST: /path/to/hexagon/gtest
PICOJSON_PATH: 3rdparty/picojson
USE_OPENCL_ENABLE_HOST_PTR: OFF
INSTALL_DEV: OFF
USE_PROFILER: ON
USE_NNPACK: OFF
LLVM_VERSION: 17.0.2
USE_OPENCL: OFF
COMPILER_RT_PATH: 3rdparty/compiler-rt
RANG_PATH: 3rdparty/rang/include
USE_SPIRV_KHR_INTEGER_DOT_PRODUCT: OFF
USE_OPENMP: OFF
USE_BNNS: OFF
USE_CUBLAS: OFF
USE_METAL: OFF
USE_MICRO_STANDALONE_RUNTIME: OFF
USE_HEXAGON_EXTERNAL_LIBS: OFF
USE_ALTERNATIVE_LINKER: AUTO
USE_BYODT_POSIT: OFF
USE_HEXAGON_RPC: OFF
USE_MICRO: OFF
DMLC_PATH: 3rdparty/dmlc-core/include
INDEX_DEFAULT_I64: ON
USE_RELAY_DEBUG: OFF
USE_RPC: ON
USE_TENSORFLOW_PATH: none
TVM_CLML_VERSION:
USE_MIOPEN: OFF
USE_ROCM: OFF
USE_PAPI: OFF
USE_CURAND: OFF
TVM_CXX_COMPILER_PATH: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.35.32215/bin/HostX64/x64/cl.exe
HIDE_PRIVATE_SYMBOLS: OFF
Any other relevant information:

python -c "import tvm; print(tvm.__file__)"
C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\__init__.py

python -c "import tvm; print(tvm._ffi.base._LIB)"
<CDLL 'C:\Users\dmsha\miniconda3\Lib\site-packages\tvm\tvm.dll', handle 7ffa41c20000 at 0x230eabbea10>
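For anyone comparing builds across reports, a short summary of the build info is easier to read than the full dump. This is only a sketch: `pick_build_info` is a hypothetical helper, not part of TVM, and the choice of which keys matter is my own assumption. It works on any dict shaped like the `tvm.support.libinfo()` output above.

```python
from typing import Dict, Iterable


def pick_build_info(info: Dict[str, str], keys: Iterable[str]) -> Dict[str, str]:
    """Return only the requested build-info entries.

    Keys that are absent from `info` are reported as "<missing>" so that a
    gap in the dump is visible rather than silently dropped.
    """
    return {key: info.get(key, "<missing>") for key in keys}


# Hypothetical usage with a real install (requires tvm to be importable):
#   import tvm
#   summary = pick_build_info(
#       tvm.support.libinfo(),
#       ["GIT_COMMIT_HASH", "GIT_COMMIT_TIME", "USE_VULKAN", "LLVM_VERSION"],
#   )
#   for key, value in summary.items():
#       print(f"{key}: {value}")
```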

Additional context

An empty folder, C:\Users\dmsha\dev\mlc\dist\Llama-2-7b-chat-hf-q4f16_1, is created.

junrushao commented 1 year ago

This is an issue from https://github.com/apache/tvm/pull/15596. It seems there have been multiple reports in MLC LLM. CC @Lunderberg, the original author of this PR, if you could take a look.

junrushao commented 1 year ago

Would you mind sharing a Python stack trace for this error message?

The following 2 lines repeat many times:
[14:37:25] D:\a\package\package\tvm\src\node\reflection.cc:109: AttributeError: relax.expr.Var object has no attributed shard_dim
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.
[14:37:25] D:\a\package\package\tvm\src\node\reflection.cc:109: AttributeError: relax.expr.Var object has no attributed shard_strategy
Stack trace not available when DMLC_LOG_STACK_TRACE is disabled at compile time.

This will help us find out exactly where .shard_dim is used in the Python codebase.

Lunderberg commented 1 year ago

Hmm. The AttributeError: function 'TVMGetLastPythonError' not found. seems rather odd. Did you recompile TVM after pulling?

tqchen commented 1 year ago

I added a note in another thread. By default, MSVC does not export functions unless they are explicitly marked as exported. So in this case it is because TVMGetLastPythonError is not marked with TVM_DLL in the declaration header.
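A quick way to check from Python whether a given build actually exports the symbol, sketched under assumptions: `has_exported_symbol` is a hypothetical helper name, and it relies only on the standard ctypes behaviour seen in the traceback above, where looking up a missing function on a loaded library raises AttributeError.

```python
import ctypes


def has_exported_symbol(lib: ctypes.CDLL, name: str) -> bool:
    """Return True if `name` can be resolved in the loaded shared library.

    ctypes resolves functions lazily on attribute access and raises
    AttributeError when the library does not export the requested name.
    """
    try:
        getattr(lib, name)
    except AttributeError:
        return False
    return True


# Hypothetical usage against the library handle printed in the report
# (requires tvm to be importable):
#   import tvm
#   print(has_exported_symbol(tvm._ffi.base._LIB, "TVMGetLastPythonError"))
```

If this prints False on Windows for a build that contains the function, a missing export annotation is the likely cause rather than a stale install.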

Lunderberg commented 1 year ago

Ah, Windows seems to be the key point, as Windows doesn't expose symbols by default, and so the extern "C" is insufficient. I had been thinking about exposing to non-TVM libraries, and missed the exposure to other portions of TVM on Windows.

Can you try with this hotfix applied?

Edit: Hehe, good timing @tqchen, and I like that it looks like a simple fix. :grin:

tqchen commented 1 year ago

Another fix that will likely resolve this build issue: https://github.com/apache/tvm/pull/15973

junrushao commented 1 year ago

A few related fixes:

Sing-Li commented 1 year ago

Please pardon my (possibly) related comment about the Note section above. The step that says to download the binary, "rename it to zstd.dll and copy to the same folder as tvm.dll", is almost impossible for new users to accomplish, because they have to understand how conda relates to Python and where miniconda keeps its Python site libs before they can find "the same folder as tvm.dll". I hope one of the above fixes will make sure that zstd.dll is always included as part of the tvm bundle/nightly 🙏

junrushao commented 1 year ago

That's a nightmare I was trying hard to solve :((

The zstd.dll dependency is introduced by LLVM, which we use to generate efficient code, but somehow it is not shipped by default in some Windows distributions... I was trying to statically link it into libtvm.dll, but it failed miserably in many different ways... CC @tqchen if you have a better idea :((

tqchen commented 1 year ago

I wonder if the wheel bundler can ship that like the other ones.

David-Sharma commented 1 year ago

Can you solve this with a modification of the docs? (This is how I understand it.)

Original docs: It is likely zstd, a dependency to LLVM, was missing. Please download the precompiled binary, rename it to zstd.dll and copy to the same folder as tvm.dll

Modified: It is likely zstd, a dependency to LLVM, was missing. Please download the precompiled binary, rename it to zstd.dll and copy to the same folder as tvm.dll. Hint: search for "tvm.dll" and pick the folder whose path includes the name of the current environment, e.g. mlc-chat-venv. Copy zstd.dll to that folder.
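As an alternative to searching the disk, the folder can be found from the active environment's own Python. A minimal sketch, assuming tvm.dll sits in the tvm package directory as in the output in my report (site-packages\tvm\tvm.dll); `find_package_dir` is a hypothetical helper name.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_dir(package: str) -> Optional[Path]:
    """Return the install directory of a top-level package, or None.

    Uses importlib's finder machinery, so the package itself is not
    imported. Returns None when the package is not installed or has no
    file location (for example built-in modules).
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return Path(spec.origin).parent


# Hypothetical usage, run inside the environment used for MLC:
#   print(find_package_dir("tvm"))
# Copy zstd.dll into the printed folder if tvm.dll is there.
```

Running it with the environment's interpreter avoids picking up a tvm.dll that belongs to a different conda environment.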

tqchen commented 1 year ago

@David-Sharma great suggestions, do you mind opening a PR?

David-Sharma commented 1 year ago

@tqchen Submitted https://github.com/mlc-ai/mlc-llm/issues/1135

tqchen commented 1 year ago

@David-Sharma do you mind forking directly and updating the respective files under https://github.com/mlc-ai/mlc-llm/tree/main/docs

junrushao commented 11 months ago

I think it's fixed now :))

junrushao commented 11 months ago

Let's consolidate this to #1135. Let me know if it works now btw!
