Closed: raj-khare closed this issue 5 months ago
@raj-khare I've not tested Mixtral via MLC yet - you might wanna file this issue against the upstream mlc_llm github as that is probably where it would end up going anyways 👍
@raj-khare looks like the issue getting Mixtral to load was fixed: https://github.com/mlc-ai/mlc-llm/issues/1752#issuecomment-1950809882
It should be in dustynv/mlc:c30348a-r36.2.0, which is built from a commit newer than https://github.com/mlc-ai/mlc-llm/commit/bf05dfc4b428c0d8c86726b5136498ebea2882e9
Mind you, I am currently rebuilding/retesting again to pick up https://github.com/mlc-ai/mlc-llm/commit/a2d9eea1b7025b8174ebb7913dcf878bd8d13f13
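For anyone else who wants a quick sanity check that Mixtral loads in that container, here is a minimal sketch using the mlc_chat ChatModule Python API that ships in containers of that vintage; the weight and model-lib paths below are placeholders, not actual paths from this thread:

```python
# Minimal Mixtral load/generate check, intended to run inside the
# dustynv/mlc:c30348a-r36.2.0 container (assumes the mlc_chat API of that era).
from mlc_chat import ChatModule
from mlc_chat.callback import StreamToStdout

cm = ChatModule(
    model="/data/models/mlc/Mixtral-8x7B-Instruct-v0.1-q4f16_1",                    # placeholder path to quantized weights
    model_lib_path="/data/models/mlc/Mixtral-8x7B-Instruct-v0.1-q4f16_1-cuda.so",   # placeholder path to compiled model lib
    device="cuda",
)

# Stream one short generation to confirm the model actually loads and runs.
cm.generate(
    prompt="Say hello from a Jetson AGX Orin.",
    progress_callback=StreamToStdout(callback_interval=2),
)
print(cm.stats())
```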
yep! it works, thanks :)
I'm trying to run the Mixtral 8x7B model on Jetson AGX (aarch64, sm_87), but I'm getting the following error:
My chat config
To Reproduce
Steps to reproduce the behavior:
I have compiled MLC LLM with the following FLAGS:
Expected behavior
The model should run without any issues.
Environment
- Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): CUDA (aarch64, sm_87)
- Operating system (e.g. Ubuntu/Windows/MacOS/...): Ubuntu
- Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...): Nvidia Jetson AGX Orin 64GB
- How you installed MLC-LLM (conda, source): Docker
- How you installed TVM-Unity (pip, source): Docker
- Python version (e.g. 3.10): 3.10.12
- GPU driver version (if applicable): none
- CUDA/cuDNN version (if applicable): cuda_12.2.r12.2/compiler.33191640_0
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
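The TVM Unity Hash Tag field above can be filled in by running the libinfo one-liner from the template, for example as a small script inside the same container:

```python
# Dump TVM Unity build info (includes the hash tag the issue template asks for).
import tvm

for k, v in tvm.support.libinfo().items():
    print(f"{k}: {v}")
```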
Any help is highly appreciated!