ngxson / wllama

WebAssembly binding for llama.cpp - Enabling in-browser LLM inference
https://ngxson.github.io/wllama/examples/basic/
MIT License

Made a function to build the model URL array when the URL matches the gguf-split pattern `-<number>-of-<number>.gguf`. Would it fit in the lib? #58

felladrin closed this issue 1 month ago

felladrin commented 1 month ago

Here's the function:

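// Returns every shard URL when `url` follows the gguf-split naming
// ("<base>-<5 digits>-of-<5 digits>.gguf"); otherwise returns `url` unchanged.
function parseModelUrl(url: string): string | string[] {
  const urlPartsRegex = /(.*)-(\d{5})-of-(\d{5})\.gguf$/;
  const matches = url.match(urlPartsRegex);

  // Not a split GGUF: hand the URL back as-is.
  if (!matches) return url;

  const baseURL = matches[1];
  // Zero-padded total shard count, e.g. "00051".
  const paddedShardsAmount = matches[3];

  // Shard numbers are 1-based and zero-padded to 5 digits: "00001", "00002", ...
  const paddedShardNumbers = Array.from(
    { length: Number(paddedShardsAmount) },
    (_, i) => (i + 1).toString().padStart(5, "0"),
  );

  return paddedShardNumbers.map(
    (paddedShardNumber) =>
      `${baseURL}-${paddedShardNumber}-of-${paddedShardsAmount}.gguf`,
  );
}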

Output examples

When a split GGUF URL is detected, it returns the array:

parseModelUrl("https://huggingface.co/Felladrin/gguf-sharded-Phi-3-mini-4k-instruct-iMat/resolve/main/phi-3-mini-4k-instruct-imat-Q5_K_M.shard-00001-of-00051.gguf");
// Outputs:
// [
//     "https://huggingface.co/Felladrin/gguf-sharded-Phi-3-mini-4k-instruct-iMat/resolve/main/phi-3-mini-4k-instruct-imat-Q5_K_M.shard-00001-of-00051.gguf",
//     "https://huggingface.co/Felladrin/gguf-sharded-Phi-3-mini-4k-instruct-iMat/resolve/main/phi-3-mini-4k-instruct-imat-Q5_K_M.shard-00002-of-00051.gguf",
//     (...)
//     "https://huggingface.co/Felladrin/gguf-sharded-Phi-3-mini-4k-instruct-iMat/resolve/main/phi-3-mini-4k-instruct-imat-Q5_K_M.shard-00050-of-00051.gguf",
//     "https://huggingface.co/Felladrin/gguf-sharded-Phi-3-mini-4k-instruct-iMat/resolve/main/phi-3-mini-4k-instruct-imat-Q5_K_M.shard-00051-of-00051.gguf"
// ]

Otherwise, it returns the URL unchanged:

parseModelUrl("https://huggingface.co/Felladrin/gguf-Pythia-31M-Chat-v1/resolve/main/Pythia-31M-Chat-v1.Q8_0.gguf")
// Outputs:
// 'https://huggingface.co/Felladrin/gguf-Pythia-31M-Chat-v1/resolve/main/Pythia-31M-Chat-v1.Q8_0.gguf'

Do you think it would be useful for the lib to also accept a single URL of a split GGUF in loadModelFromUrl?

[If the user passes an array, it wouldn't need to be parsed, since that signals they want to use a custom list of GGUF files. But as most people will use gguf-split's default naming, it could save some time.]
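
The behaviour proposed here could be sketched roughly as follows. This is only an illustration: `expandShardedUrl` and `resolveModelUrls` are hypothetical names that do not exist in wllama, and the sketch assumes the resolved list would then be passed on to the loader. Unlike the function above, it always returns an array, so the caller deals with a single type.

```typescript
// Hypothetical helper, not part of wllama: expands a gguf-split URL
// ("...-00001-of-00003.gguf") into the URLs of all of its shards.
function expandShardedUrl(url: string): string[] {
  const match = url.match(/^(.*)-(\d{5})-of-(\d{5})\.gguf$/);
  // Not a split model: return a single-element list.
  if (!match) return [url];

  const [, baseUrl, , paddedTotal] = match;
  return Array.from({ length: Number(paddedTotal) }, (_, i) => {
    // Shard numbers are 1-based and zero-padded to 5 digits.
    const paddedIndex = String(i + 1).padStart(5, "0");
    return `${baseUrl}-${paddedIndex}-of-${paddedTotal}.gguf`;
  });
}

// Hypothetical wrapper: an array is treated as the caller's explicit shard
// list and returned untouched; a single string is expanded only when it
// matches the split pattern.
function resolveModelUrls(input: string | string[]): string[] {
  return Array.isArray(input) ? input : expandShardedUrl(input);
}

// Example with a made-up URL: logs the three shard URLs of the split model.
console.log(resolveModelUrls("https://example.com/model-00001-of-00003.gguf"));
```

Keeping the array case as a pass-through preserves the option of a custom shard list, while the single-URL case covers the default gguf-split naming.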

ngxson commented 1 month ago

Yes, I think we can add this in the near future.

Btw, a GGUF naming convention has recently been added: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#gguf-naming-convention
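
As far as I understand the linked convention, the shard field is written as `<ShardNum>-of-<ShardTotal>`, with both numbers zero-padded to five digits and numbering starting at 00001. Under that assumption, a small check like the one below could read and sanity-check the shard field before any URLs are generated. `ShardInfo` and `parseShardInfo` are hypothetical names for illustration, not part of wllama or llama.cpp.

```typescript
// Hypothetical types and helper for illustration only.
interface ShardInfo {
  index: number; // 1-based position of this shard
  total: number; // total number of shards
}

// Reads the shard position and count from a name ending in
// "-NNNNN-of-NNNNN.gguf"; returns null when the suffix is missing or invalid.
function parseShardInfo(name: string): ShardInfo | null {
  const match = name.match(/-(\d{5})-of-(\d{5})\.gguf$/);
  if (!match) return null;

  const index = Number(match[1]);
  const total = Number(match[2]);

  // Reject impossible values such as 00000-of-00003 or 00004-of-00003.
  if (index < 1 || index > total) return null;

  return { index, total };
}

console.log(parseShardInfo("model-00002-of-00003.gguf"));
console.log(parseShardInfo("model.Q8_0.gguf"));
```

A check of this kind would let a caller fall back to treating the URL as a single file when the suffix is malformed, instead of generating URLs for shards that cannot exist.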

felladrin commented 1 month ago

Alright! Then I plan to do a PR soon. [If anyone else reading this wants to do it earlier, feel free to use/adapt the function I shared above!]