mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Bug] M2 Build for any model - ValueError: Multiple weight shard files without json map is not supported #1067

Closed giovannizinzi closed 1 year ago

giovannizinzi commented 1 year ago

🐛 Bug

Trying to use rwkv-raven-1b5. The build does not seem to load the model from HF; instead it throws the error "Multiple weight shard files without json map is not supported". Any advice on how to troubleshoot?

macOS Sonoma 14.0, M2 chip. Installed TVM and mlc in a conda environment, following the instructions at https://llm.mlc.ai/docs/compilation/compile_models.html

To Reproduce

python3 -m mlc_llm.build --hf-path=RWKV/rwkv-raven-1b5 --target metal --quantization q4f16_2

Weights exist at dist/models/rwkv-raven-1b5, skipping download.
Using path "dist/models/rwkv-raven-1b5" for model "rwkv-raven-1b5"
Host CPU dection:
  Target triple: arm64-apple-darwin23.0.0
  Process triple: arm64-apple-darwin23.0.0
  Host CPU: apple-m1
Target configured: metal -keys=metal,gpu -max_function_args=31 -max_num_threads=256 -max_shared_memory_per_block=32768 -max_threads_per_block=1024 -thread_warp_size=32
Traceback (most recent call last):
  File "", line 198, in _run_module_as_main
  File "", line 88, in _run_code
  File "/Users/giovannizinzi/Documents/localModel/mlc-llm/mlc_llm/build.py", line 46, in main()
  File "/Users/giovannizinzi/Documents/localModel/mlc-llm/mlc_llm/build.py", line 42, in main
    core.build_model_from_args(parsed_args)
  File "/Users/giovannizinzi/Documents/localModel/mlc-llm/mlc_llm/core.py", line 643, in build_model_from_args
    param_manager.init_torch_pname_to_bin_name(args.use_safetensors)
  File "/Users/giovannizinzi/Documents/localModel/mlc-llm/mlc_llm/relax_model/param_manager.py", line 292, in init_torch_pname_to_bin_name
    mapping = load_torch_pname2binname_map(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/giovannizinzi/Documents/localModel/mlc-llm/mlc_llm/relax_model/param_manager.py", line 920, in load_torch_pname2binname_map
    raise ValueError("Multiple weight shard files without json map is not supported")
ValueError: Multiple weight shard files without json map is not supported

Expected behavior

Would like the build to work.

Environment

junrushao commented 1 year ago

CC: @Hzfengsy

sunggg commented 1 year ago

Hi, @giovannizinzi. Could you check whether your model directory contains only safetensors files? Without --use-safetensors, the build script looks for .bin files, and this might be the issue.
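The check suggested above can be sketched in a few lines of Python. This is only a diagnostic sketch, not the logic in mlc_llm's param_manager.py: the helper names `summarize_weight_files` and `shards_lack_index` are hypothetical, and the assumption that a sharded checkpoint ships an index file ending in `.bin.index.json` or `.safetensors.index.json` comes from the usual Hugging Face layout, not from this thread.

```python
from pathlib import Path


def summarize_weight_files(model_dir):
    """Group the weight-related files in a local checkpoint directory by kind."""
    root = Path(model_dir)
    names = sorted(p.name for p in root.iterdir() if p.is_file())
    return {
        "bin": [n for n in names if n.endswith(".bin")],
        "safetensors": [n for n in names if n.endswith(".safetensors")],
        # Assumption: sharded Hugging Face checkpoints usually ship an index
        # such as pytorch_model.bin.index.json or model.safetensors.index.json.
        "index_json": [n for n in names if n.endswith(".index.json")],
    }


def shards_lack_index(summary, use_safetensors=False):
    """Return True when several shard files exist but no matching JSON index does."""
    shards = summary["safetensors" if use_safetensors else "bin"]
    suffix = ".safetensors.index.json" if use_safetensors else ".bin.index.json"
    has_index = any(n.endswith(suffix) for n in summary["index_json"])
    return len(shards) > 1 and not has_index
```

Calling `shards_lack_index(summarize_weight_files("dist/models/rwkv-raven-1b5"))` on the directory from the log would show whether it holds several .bin files with no index next to them, which is the situation the error message describes.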

giovannizinzi commented 1 year ago

@sunggg interesting, the model directory I was trying to pull from on Hugging Face does not contain safetensors files (it has .bin shards instead): RWKVLink

One point of feedback: based on your comment, I believe the following documentation is not up to date:

image

I think I understand the problem. Next I'm going to git lfs clone a model's weights locally into dist/models and then run the build commands. Will report back if it works.

giovannizinzi commented 1 year ago

Closing this issue. Thanks for the pointer, @sunggg, I got it working.

I git lfs cloned the rwkv-raven-1b5 HF repository (https://huggingface.co/RWKV/rwkv-raven-1b5/tree/main) into dist/models.

From there, I ran

python3 -m mlc_llm.build --model rwkv-raven-1b5 --target iphone --quantization q4f16_2

and everything worked as intended. I'll just git clone models locally before compiling instead of using the --hf-path flag. Thx!
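For anyone repeating the workaround, here is a rough Python sketch of the same steps. The helper name `local_build_commands` is hypothetical, and using `git lfs install` followed by a plain `git clone` is an assumption about how the clone was done. The `--model`, `--target` and `--quantization` flags are the ones used in the command above.

```python
import subprocess


def local_build_commands(repo_id, target, quantization, models_dir="dist/models"):
    """Return the commands that clone a Hugging Face repo locally and then build it."""
    model_name = repo_id.split("/")[-1]
    return [
        # Assumption: git-lfs is installed, so the clone fetches the real weight files.
        ["git", "lfs", "install"],
        ["git", "clone", f"https://huggingface.co/{repo_id}", f"{models_dir}/{model_name}"],
        # Build from the local copy via --model rather than --hf-path.
        ["python3", "-m", "mlc_llm.build", "--model", model_name,
         "--target", target, "--quantization", quantization],
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

`run_all(local_build_commands("RWKV/rwkv-raven-1b5", "iphone", "q4f16_2"))` would mirror the sequence described in this comment. Nothing runs until `run_all` is called, so the command list can be printed and inspected first.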