mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Bug] subprocess.CalledProcessError: Command '['D:\\Program\\Anaconda3\\envs\\mlc-chat\\python.exe', '-m', 'mlc_chat.cli.check_device', 'vulkan:0']' returned non-zero exit status 3221225477. #1323

Closed tao-began closed 11 months ago

tao-began commented 11 months ago

🐛 Bug

I used MLC LLM to compile Llama-2-7b-chat-hf, but when running mlc-chat I got an error: subprocess.CalledProcessError: Command '['D:\\Program\\Anaconda3\\envs\\mlc-chat\\python.exe', '-m', 'mlc_chat.cli.check_device', 'vulkan:0']' returned non-zero exit status 3221225477.

To Reproduce

Steps to reproduce the behavior: run python sample_mlc_chat.py

Complete error report:

Traceback (most recent call last):
  File "E:\code\mlc-llm-win\mlc-llm\sample_mlc_chat.py", line 8, in <module>
    cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1")
  File "D:\Program\Anaconda3\envs\mlc-chat\lib\site-packages\mlc_chat\chat_module.py", line 660, in __init__
    self.device = detect_device(device)
  File "D:\Program\Anaconda3\envs\mlc-chat\lib\site-packages\mlc_chat\support\auto_device.py", line 27, in detect_device
    if _device_exists(cur_device):
  File "D:\Program\Anaconda3\envs\mlc-chat\lib\site-packages\mlc_chat\support\auto_device.py", line 58, in _device_exists
    result = subprocess.check_output(cmd, stderr=subprocess.STDOUT).decode("utf-8")
  File "D:\Program\Anaconda3\envs\mlc-chat\lib\subprocess.py", line 420, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "D:\Program\Anaconda3\envs\mlc-chat\lib\subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['D:\\Program\\Anaconda3\\envs\\mlc-chat\\python.exe', '-m', 'mlc_chat.cli.check_device', 'vulkan:0']' returned non-zero exit status 3221225477.
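For context, exit status 3221225477 is 0xC0000005, the Windows access-violation code: the child process crashed natively rather than raising a Python exception. The sketch below is reconstructed from the traceback, not taken from the mlc_chat source; in particular, how the child's output is parsed is an assumption.

```python
# Rough sketch of the probe implied by the traceback (not the real
# auto_device._device_exists): run mlc_chat.cli.check_device in a child
# Python process and inspect its output.
import subprocess
import sys

def device_exists(device: str) -> bool:
    cmd = [sys.executable, "-m", "mlc_chat.cli.check_device", device]
    # check_output raises CalledProcessError when the child exits non-zero;
    # exit status 3221225477 (0xC0000005) means the child crashed with an
    # access violation while probing the Vulkan device.
    result = subprocess.check_output(cmd, stderr=subprocess.STDOUT).decode("utf-8")
    return "1" in result.strip()  # output parsing is a guess, for illustration only

print(device_exists("vulkan:0"))
```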

Environment

Additional context

Directory of E:\code\mlc-llm-win\mlc-llm\dist\Llama-2-7b-chat-hf-q4f16_1

2023/11/23 16:16                .
2023/11/23 16:16                ..
2023/11/23 16:16     18,516,480 Llama-2-7b-chat-hf-q4f16_1-vulkan.dll
2023/11/23 16:16          7,387 Llama-2-7b-chat-hf-q4f16_1-vulkan.exp
2023/11/23 16:16         14,310 Llama-2-7b-chat-hf-q4f16_1-vulkan.lib
2023/11/23 16:16     30,528,875 mod_cache_before_build.pkl
2023/11/23 16:15                params

Directory of E:\code\mlc-llm-win\mlc-llm\dist\Llama-2-7b-chat-hf-q4f16_1\params

2023/11/23 16:15                .
2023/11/23 16:15                ..
2023/11/23 16:15             21 added_tokens.json
2023/11/23 16:15            629 mlc-chat-config.json
2023/11/23 16:15        142,309 ndarray-cache.json
2023/11/23 16:14     65,536,000 params_shard_0.bin

2023/11/23 16:15      1,842,767 tokenizer.json
2023/11/23 16:15        499,723 tokenizer.model
2023/11/23 16:15            770 tokenizer_config.json

junrushao commented 11 months ago

Could you run the following command and share the error message:

python -m mlc_chat.cli.check_device vulkan:0

tao-began commented 11 months ago

> Could you run the following command and share the error message:
>
> python -m mlc_chat.cli.check_device vulkan:0

I ran python -m mlc_chat.cli.check_device vulkan:0, and the result is 1.

tao-began commented 11 months ago

Is it possible that this is because my graphics card is a GTX 1650 4G? Since MLC currently does not support the CPU, maybe the PC's graphics card does not have enough memory?

MaTwickenham commented 11 months ago

> Is it possible that this is because my graphics card is a GTX 1650 4G? Since MLC currently does not support the CPU, maybe the PC's graphics card does not have enough memory?

My device is a 3060 Ti 8G, and I also encountered this problem.

junrushao commented 11 months ago

It’s not about GPUs, but more like an OS issue. @Tao-begd @MaTwickenham Are you both on Windows?

> Could you run the following command and share the error message: python -m mlc_chat.cli.check_device vulkan:0
>
> I ran python -m mlc_chat.cli.check_device vulkan:0, and the result is 1.

It seems to actually work… Did it crash, or did any error message pop up?

MaTwickenham commented 11 months ago

> It’s not about GPUs, but more like an OS issue. @Tao-begd @MaTwickenham Are you both on Windows?
>
> Could you run the following command and share the error message: python -m mlc_chat.cli.check_device vulkan:0
>
> I ran python -m mlc_chat.cli.check_device vulkan:0, and the result is 1.
>
> It seems to actually work… Did it crash, or did any error message pop up?

Yes, I encountered that error on Windows, and my error message is just like the one @Tao-begd posted.

tqchen commented 11 months ago

It seems that the subprocess check is a bit unreliable on Windows. Maybe we can go back to a vulkan().exist check in the same process?
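For illustration, a minimal sketch of what such an in-process check could look like, assuming TVM's tvm.vulkan(0) device handle and its exist property (this is not the actual patch that landed):

```python
# Minimal sketch of an in-process Vulkan probe, assuming TVM's device API
# (tvm.vulkan(0).exist); not the actual fix from the linked PRs.
import tvm

def vulkan_exists() -> bool:
    try:
        # Device.exist asks the TVM runtime whether the device is usable,
        # without spawning a separate Python process.
        return bool(tvm.vulkan(0).exist)
    except Exception:
        # Caveat: a native crash inside the Vulkan loader would still take
        # down the whole process; that risk is why the subprocess probe
        # existed in the first place.
        return False

print("vulkan:0 exists:", vulkan_exists())
```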

tqchen commented 11 months ago

https://github.com/mlc-ai/mlc-llm/pull/1333 may address this issue

tao-began commented 11 months ago

Thank you, Professor Tianqi; the problem is solved.

tqchen commented 11 months ago

Thank you for reporting!

junrushao commented 11 months ago

I'm able to reproduce this issue:

[screenshot of the crash]

using:

from mlc_chat.support.auto_device import detect_device
print(detect_device("auto"))

Removing "vulkan" in AUTO_DETECT_DEVICES avoids the sub-process from crashing, which means it's a TVM-specific bug with its Vulkan runtime @tqchen

junrushao commented 11 months ago

In fact, running the command below directly does fail, but cmd.exe just hides the problem silently:

[screenshot of the command and its output in cmd.exe]
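For completeness: 3221225477 is 0xC0000005, the Windows access-violation status, and cmd.exe does not print a failing child's exit code unless you check %ERRORLEVEL% yourself. A small illustration (not from the repo) of surfacing it from Python:

```python
# Illustration only: re-run the probe via subprocess and print the exit code
# that cmd.exe swallows. 3221225477 == 0xC0000005, i.e. the child process
# crashed with an access violation.
import subprocess
import sys

proc = subprocess.run(
    [sys.executable, "-m", "mlc_chat.cli.check_device", "vulkan:0"],
    capture_output=True,
    text=True,
)
print("exit code:", proc.returncode, hex(proc.returncode & 0xFFFFFFFF))
print("stdout:", proc.stdout)
print("stderr:", proc.stderr)
```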

junrushao commented 11 months ago

https://github.com/mlc-ai/mlc-llm/pull/1350