ggerganov / ggml

Tensor library for machine learning
MIT License

gguf.md: naming convention synced to llama.cpp #896

Closed mofosyne closed 4 months ago

mofosyne commented 4 months ago

https://github.com/ggerganov/llama.cpp/pull/7499 was merged, so this PR syncs gguf.md to the new naming convention:

<BaseName><SizeLabel><FineTune><Version><Encoding><Type><Shard>.gguf

I also updated the validation regexp, which you can use to check filenames:

^(?<BaseName>[A-Za-z0-9\s]*(?:(?:-(?:(?:[A-Za-z\s][A-Za-z0-9\s]*)|(?:[0-9\s]*)))*))-(?:(?P<SizeLabel>(?:\d+x)?(?:\d+\.)?\d+[A-Za-z](?:-[A-Za-z]+(\d+\.)?\d+[A-Za-z]+)?)(?:-(?P<FineTune>[A-Za-z0-9\s-]+))?)?-(?:(?P<Version>v\d+(?:\.\d+)*))(?:-(?<Encoding>(?!LoRA|vocab)[\w_]+))?(?:-(?<Type>LoRA|vocab))?(?:-(?P<Shard>\d{5}-of-\d{5}))?\.gguf$

You can check how it works at https://regex101.com/r/7DgTVN/1
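
For anyone who wants to try the regexp outside regex101, here is a minimal Python sketch. It is not part of the PR. The pattern is the one above with every named group written as `(?P<Name>...)`, because Python's `re` module does not accept the `(?<Name>...)` spelling. The helper name `parse_gguf_filename` and the sample filenames are hypothetical, chosen only for illustration.

```python
import re

# The validation regexp from this PR, split into pieces for readability.
# Only change: all named groups use Python's (?P<Name>...) syntax.
GGUF_NAME_RE = re.compile(
    r"^(?P<BaseName>[A-Za-z0-9\s]*(?:(?:-(?:(?:[A-Za-z\s][A-Za-z0-9\s]*)|(?:[0-9\s]*)))*))"
    r"-(?:(?P<SizeLabel>(?:\d+x)?(?:\d+\.)?\d+[A-Za-z](?:-[A-Za-z]+(\d+\.)?\d+[A-Za-z]+)?)"
    r"(?:-(?P<FineTune>[A-Za-z0-9\s-]+))?)?"
    r"-(?:(?P<Version>v\d+(?:\.\d+)*))"
    r"(?:-(?P<Encoding>(?!LoRA|vocab)[\w_]+))?"
    r"(?:-(?P<Type>LoRA|vocab))?"
    r"(?:-(?P<Shard>\d{5}-of-\d{5}))?"
    r"\.gguf$"
)


def parse_gguf_filename(filename):
    """Return a dict of the named components, or None if the name does not match.

    Hypothetical helper; components that are absent from the name map to None.
    """
    match = GGUF_NAME_RE.match(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Illustrative filenames only, not taken from a real model listing.
    for name in (
        "Mixtral-8x7B-v0.1-KQ2.gguf",
        "Grok-100B-v1.0-Q4_0-00003-of-00009.gguf",
        "model.gguf",
    ):
        print(name, "->", parse_gguf_filename(name))
```

Note that the pattern requires a version component, so a name such as `model.gguf` is rejected.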

mofosyne commented 4 months ago

For vocab files, does a size label even make sense?