ggerganov / llama.cpp

LLM inference in C/C++

MLX convert to gguf #10551

Open AustinScola opened 1 day ago

AustinScola commented 1 day ago

Name and Version

version: 4179 (25669aa9) built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.4.0

Operating systems

Mac

Which llama.cpp modules do you know to be affected?

No response

Problem description & steps to reproduce

I'm trying to convert a LoRA adapter created with MLX to GGUF using convert_lora_to_gguf.py, but I'm running into a problem.

First Bad Commit

No response

Relevant log output

```
INFO:lora-to-gguf:Loading base model: llama-3.2-transformers-3b-instruct-v1
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:lora-to-gguf:Exporting model...
ERROR:lora-to-gguf:Unexpected name 'model.layers.0.self_attn.q_proj.lora_a': Not a lora_A or lora_B tensor
```
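The error itself points at a naming mismatch: the script looks for PEFT-style lora_A / lora_B suffixes, while the MLX adapter uses lowercase lora_a / lora_b. A minimal diagnostic sketch for listing the adapter's tensor names, assuming the adapter was saved as adapters.safetensors (the filename is a guess; adjust it to wherever your adapter was written):

```python
# Diagnostic sketch: list the tensor names stored in an MLX adapter file.
# "adapters.safetensors" is an assumed filename, not from the issue.
from safetensors import safe_open

with safe_open("adapters.safetensors", framework="numpy") as f:
    for name in f.keys():
        # Expect names like model.layers.0.self_attn.q_proj.lora_a,
        # i.e. lowercase suffixes that convert_lora_to_gguf.py rejects.
        print(name)
```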
AustinScola commented 1 day ago

I think I may have found a solution: https://github.com/ml-explore/mlx/discussions/1507#discussioncomment-11039570 describes converting the adapter from MLX format to PEFT format first.
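For reference, the renaming idea can be sketched in a few lines. This is an unverified sketch, not the script from that discussion: it assumes MLX stores lora_a with shape (in_features, r) and lora_b with shape (r, out_features), while PEFT expects the transposed lora_A.weight / lora_B.weight under a base_model.model. prefix. Verify the shapes against your own adapter; the file names are hypothetical.

```python
# Unverified sketch: rename MLX LoRA tensors to PEFT-style names.
# Assumptions (check against your adapter): MLX keys end in .lora_a /
# .lora_b and PEFT wants the transposed lora_A.weight / lora_B.weight.
import numpy as np
from safetensors.numpy import load_file, save_file

tensors = load_file("adapters.safetensors")  # hypothetical input path
converted = {}
for name, value in tensors.items():
    if name.endswith(".lora_a"):
        base = name[: -len(".lora_a")]
        # PEFT's lora_A.weight is (r, in_features), i.e. the transpose.
        converted[f"base_model.model.{base}.lora_A.weight"] = np.ascontiguousarray(value.T)
    elif name.endswith(".lora_b"):
        base = name[: -len(".lora_b")]
        # PEFT's lora_B.weight is (out_features, r), i.e. the transpose.
        converted[f"base_model.model.{base}.lora_B.weight"] = np.ascontiguousarray(value.T)
    else:
        converted[name] = value  # pass through anything unrecognized

save_file(converted, "adapter_model.safetensors")  # hypothetical output path
```

With the resulting adapter_model.safetensors plus a matching adapter_config.json (rank, alpha, target modules), convert_lora_to_gguf.py should then find the lora_A / lora_B names it expects.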

ngxson commented 1 day ago

I've never tried an MLX adapter. Could you please provide us with one for testing?