ml-explore / mlx-examples

Examples in the MLX framework

Convert adds an additional token (= token mismatch with the base model) #542

Open ai-made-approachable opened 5 months ago

ai-made-approachable commented 5 months ago

When I run mlx_lm.convert on berkeley-nest/Starling-LM-7B-alpha, my MLX model suddenly has 32003 tokens instead of 32002. This creates issues if you want to train and later export a .gguf file via llama.cpp.

python -m mlx_lm.convert \
--hf-path berkeley-nest/Starling-LM-7B-alpha \
--mlx-path /Volumes/T9/mlx_models/starling-lm7b-alpha-8bit \
-q \
--q-group-size 64 \
--q-bits 8 \
--dtype float16
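
A quick way to confirm the mismatch (my sketch, not from the thread; the local path is the --mlx-path used above) is to load both tokenizers with transformers and compare them. Note that if <sep> is already injected when the HF tokenizer is instantiated (see the comment further down), both counts may print 32003:

from transformers import AutoTokenizer

# Original HF checkpoint vs. the converted MLX folder (--mlx-path above)
hf_tok = AutoTokenizer.from_pretrained("berkeley-nest/Starling-LM-7B-alpha")
mlx_tok = AutoTokenizer.from_pretrained("/Volumes/T9/mlx_models/starling-lm7b-alpha-8bit")

# Expected 32002 vs. 32003; the stray token sits in the added vocab
print(len(hf_tok), len(mlx_tok))
print(mlx_tok.get_added_vocab())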

Original model's added_tokens.json (https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha/blob/main/added_tokens.json)

{
  "<|end_of_turn|>": 32000,
  "<|pad_0|>": 32001
}

added_tokens.json after converting to MLX

{
  "<sep>": 32002,
  "<|end_of_turn|>": 32000,
  "<|pad_0|>": 32001
}
mzbac commented 5 months ago

I think it has something to do with the HF tokenizer behavior. I can see that <sep> is in https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha/blob/main/tokenizer_config.json#L55, but it doesn't exist in https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha/blob/main/tokenizer.json. Somehow it has been added as a new special token.
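
A plausible mechanism (my sketch, not confirmed in the thread): tokenizer_config.json declares <sep> as the sep_token, but the token has no id in tokenizer.json, so transformers allocates the next free id (32002) for it when the tokenizer is instantiated, and mlx_lm.convert then saves the tokenizer with that extra entry. If so, one possible workaround is to drop the stray declaration from a local copy of the model before converting:

import json
from pathlib import Path

# Hypothetical local snapshot of berkeley-nest/Starling-LM-7B-alpha
cfg_path = Path("Starling-LM-7B-alpha/tokenizer_config.json")
cfg = json.loads(cfg_path.read_text())

# Remove the sep_token declaration that has no matching entry in
# tokenizer.json, so transformers has nothing new to allocate an id for
cfg.pop("sep_token", None)
cfg_path.write_text(json.dumps(cfg, indent=2))

mlx_lm.convert accepts a local directory for --hf-path, so pointing it at the patched snapshot should then produce a tokenizer with the original 32002 entries.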