Closed: amevec closed this issue 10 months ago.
Hi @amevec, I have updated the dustynv/mlc:dev container to track the main branch of the mlc_llm repo (see commit https://github.com/dusty-nv/jetson-containers/commit/d89fee0dbc21b9496d6067de776c0fc1c3224147). So try using dustynv/mlc:dev instead if you want the latest updates in MLC, whereas dustynv/mlc is reserved for a stable/tested version (the MLC project itself is unversioned, which is why I pin it here by commit SHA). Note that I am seeing a 10-15% perf regression between 10/20/2023 (mlc_llm SHA 9bf5723) and now.
Hardware: Orin AGX/NX
Software: dustynv/mlc:r35.4.1
Issue Summary: When using the mlc image to compile LLMs, model compilation fails with vicuna-7b-v1.5.
Resolution: Use the updated mlc-llm main branch, or apply the fix commit manually to /usr/local/lib/python3.8/dist-packages/mlc_llm/relax_model/param_manager.py
MLC-LLM Pull Request: https://github.com/mlc-ai/mlc-llm/pull/917
MLC-LLM Commit for the specific fix (to hotfix an existing container): https://github.com/mlc-ai/mlc-llm/pull/917/commits/4a0e7a912a7085eb7cd166d5d8b584e1b5ed3947
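For anyone who wants to hotfix an existing container rather than rebuild, a minimal sketch of applying that commit in place follows. It assumes you are already inside a running dustynv/mlc:r35.4.1 container, that the container has network access and the wget/patch tools, and that GitHub's `.patch` suffix on the commit URL serves a git-format patch (the commit SHA and the dist-packages path are taken from the issue above; this is untested, not a verified procedure):

```shell
# Fix commit from mlc-ai/mlc-llm PR #917 (from the issue text above)
COMMIT=4a0e7a912a7085eb7cd166d5d8b584e1b5ed3947
PATCH_URL="https://github.com/mlc-ai/mlc-llm/commit/${COMMIT}.patch"

# Download the commit as a git-format patch
wget -O /tmp/fix.patch "$PATCH_URL"

# git patches use a/mlc_llm/... and b/mlc_llm/... prefixes; -p1 strips the
# a/ and b/ components so the path resolves as mlc_llm/relax_model/...
# relative to the dist-packages directory given by -d
patch -p1 -d /usr/local/lib/python3.8/dist-packages < /tmp/fix.patch
```

If you later pull the updated dustynv/mlc:dev image instead, this manual patch is unnecessary, since that image already tracks the fixed main branch.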
Reproduce:
The compute/quantize step fails with the following error: