kalradivyanshu closed this issue 5 months ago
Could you please try upgrading or reinstalling your wheel to dev1287?
I'm not sure why the cu121 wheel was not upgraded to the latest version the way the cu122 wheel was, but it should work now.
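For anyone following along, a forced reinstall of that wheel could look roughly like the sketch below. The URL is the dev1287 cu121 wheel quoted further down in this thread; its file name marks it as a CPython 3.11, Linux x86_64 build, so it will not install elsewhere. The RUN switch is only an illustrative guard added here, not part of pip or mlc_llm.

```shell
#!/bin/sh
# dev1287 cu121 wheel URL as quoted in this thread (cp311, manylinux x86_64).
wheel_url="https://github.com/mlc-ai/package/releases/download/v0.9.dev0/mlc_llm_nightly_cu121-0.1.dev1287-cp311-cp311-manylinux_2_28_x86_64.whl"

# RUN is a hypothetical guard for this sketch: set RUN=1 to really reinstall,
# otherwise the command is only printed.
if [ "${RUN:-0}" = "1" ]; then
  # --force-reinstall replaces the package even if pip thinks it is current.
  python -m pip install --force-reinstall "$wheel_url"
else
  echo "python -m pip install --force-reinstall $wheel_url"
fi
```

Run it inside the same virtual environment that mlc_llm is launched from, so the reinstalled wheel is the one actually used.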
i had the same thing with dev1287.
i had to recompile my model libs with mlc_llm compile, and then it worked fine.
@0xDEADFED5 can you tell me what exact step you took to resolve it?
@Hzfengsy same issue as @0xDEADFED5 reported. I installed https://github.com/mlc-ai/package/releases/download/v0.9.dev0/mlc_llm_nightly_cu121-0.1.dev1287-cp311-cp311-manylinux_2_28_x86_64.whl directly.
mlc_llm compile -h will show you everything you need to know.
this assumes you already have a quantized model with a config file.
here's the syntax from my batch file (make sure to activate your venv first):
(for linux/mac change backslash to forward slash and .dll to .so)
set src=C:\LLM\SFR-Iterative-DPO-LLaMA-3-8B-R
set quant=q4f16_1
set dst=%src%-MLC-%quant%
set model=auto
set device=auto
mlc_llm compile --quantization %quant% --model-type %model% --device %device% -o %dst%\None-vulkan.dll %dst%\mlc-chat-config.json
then when you run mlc_llm chat or mlc_llm serve you have to specify the lib you just compiled.
here's my command line as an example:
mlc_llm serve --model-lib C:\LLM\SFR-Iterative-DPO-LLaMA-3-8B-R-MLC-q4f16_1\None-vulkan.dll --mode server --speculative-mode disable C:\LLM\SFR-Iterative-DPO-LLaMA-3-8B-R-MLC-q4f16_1
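As a rough Linux/macOS counterpart to the batch file and serve command above, the same steps might be written as the sketch below. It only applies the substitutions mentioned earlier (forward slashes, .so instead of .dll) and reuses the flags shown in the Windows commands. The model directory is a placeholder, and the SRC and RUN variables are illustrative additions for this sketch, not mlc_llm options.

```shell
#!/bin/sh
# ASSUMPTION: placeholder model directory; set SRC to your own quantized model path.
src="${SRC:-$HOME/LLM/SFR-Iterative-DPO-LLaMA-3-8B-R}"
quant=q4f16_1
dst="${src}-MLC-${quant}"
model=auto
device=auto
# Lib file name copied from the batch file, with .dll swapped for .so.
lib="${dst}/None-vulkan.so"

# RUN is a hypothetical guard for this sketch: RUN=1 executes, otherwise dry run.
if [ "${RUN:-0}" = "1" ]; then
  # Recompile the model lib from the existing mlc-chat-config.json.
  mlc_llm compile --quantization "$quant" --model-type "$model" \
    --device "$device" -o "$lib" "${dst}/mlc-chat-config.json"
  # Serve the model, pointing explicitly at the lib that was just compiled.
  mlc_llm serve --model-lib "$lib" --mode server \
    --speculative-mode disable "$dst"
else
  echo "mlc_llm compile --quantization $quant --model-type $model --device $device -o $lib ${dst}/mlc-chat-config.json"
  echo "mlc_llm serve --model-lib $lib --mode server --speculative-mode disable $dst"
fi
```

Passing --model-lib explicitly is the point of the exercise: it makes chat or serve load the freshly compiled lib rather than whichever one it would otherwise pick up.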
I can confirm that, as @0xDEADFED5 said, compiling the model from scratch works and the error no longer appears.
@kalradivyanshu thanks for confirming! Glad that works :-)
🐛 Bug
I ran
mlc_llm chat HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC
and it failed with ValueError: Cannot find global var "multinomial_from_uniform1" in the Module
To Reproduce
Steps to reproduce the behavior:
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0