a-ghorbani / pocketpal-ai

An app that brings language models directly to your phone.
MIT License
1.06k stars 75 forks

[Feat]: Add HuggingFace Integration #83

Open a-ghorbani opened 1 week ago

a-ghorbani commented 1 week ago

Description

Currently, if a link to a gguf file is not already in the app's default list of models, the user must manually navigate to Hugging Face, search for the model, download it on their phone, and add it as a local model. This process is cumbersome and impacts the user experience. By integrating model search within the app, users would be able to search for models directly, choose the one they need, and add it without the need for separate downloads.

Things to Consider

Use Case

A user wants to add a gguf model from Hugging Face. By implementing integrated search and add functionality, the user can browse Hugging Face models, select their desired model, and add it directly within the app, streamlining the process significantly.

Tasks

julien-c commented 1 week ago
  • Chat Templates and Default Settings: Need a method to detect/assign appropriate chat templates and default settings for models from Hugging Face.

I think most GGUF files on the HF Hub now include their chat templates, but cc @Vaibhavs10 @ngxson for confirmation

We also have APIs for file sizes and number of params – and if you need other APIs let us know!

a-ghorbani commented 1 week ago

We also have APIs for file sizes and number of params – and if you need other APIs let us know!

This sounds great!

At the moment, I’ve built a quick MVP using /api/models. I'm impressed—the search is very fast! :)

Here's the setup I’m using for search: GET /api/models with the following params:

Fields I'm inferring (assuming the response is one of the items returned from /api/models):

What I still need help with:

Initial prototype:

Simulator Screenshot - iPhone 15 Pro Max - 2024-11-04 at 19 09 18

Vaibhavs10 commented 1 week ago

Hi @a-ghorbani - Nice to meet ya! Big fan of your work! What you detail is correct. If you want a more fine-grained look at the API, head over here: https://huggingface.co/spaces/enzostvs/hub-api-playground to play with all the params and responses for the Search and Repo APIs.

This would give you the exact response that the hub returns along with the schema. 🤗

a-ghorbani commented 1 week ago

Nice to meet you too, @Vaibhavs10 👋

For the list of files, I'm currently seeing outputs like this:

"siblings": [
      {
        "rfilename": "smollm2-1.7b-instruct-q4_k_m.gguf"
      }
    ]

I was hoping to get additional information, like file size, so it would look more like this:

"siblings": [
      {
        "rfilename": "smollm2-1.7b-instruct-q4_k_m.gguf", "size": "2gb"
      }
    ]

Is it possible to get something like this at the moment? Otherwise, I'm happy to open a feature request.

ngxson commented 1 week ago

Hi @a-ghorbani , nice to meet you :D

You can get more information using this endpoint:

const modelInfoUrl = `https://huggingface.co/api/models/${modelId}/tree/main?recursive=true`

For example: https://huggingface.co/api/models/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main?recursive=true
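
As a rough sketch of how the tree endpoint above could supply the file sizes asked about earlier: the snippet below builds the URL and keeps only the .gguf entries. The helper names (buildTreeUrl, listGgufFiles, fetchGgufFiles) are hypothetical, and the entry shape ({ type, path, size }) is an assumption about the response that should be checked against the live API.

```javascript
// Build the file-listing URL for a repo id (same pattern as modelInfoUrl above).
function buildTreeUrl(modelId, revision = "main") {
  return `https://huggingface.co/api/models/${modelId}/tree/${revision}?recursive=true`;
}

// Keep only .gguf files and return their path and size in bytes.
// Assumption: each entry looks like { type: "file", path: "...", size: 123 }.
function listGgufFiles(treeEntries) {
  return treeEntries
    .filter((e) => e.type === "file" && e.path.toLowerCase().endsWith(".gguf"))
    .map((e) => ({ path: e.path, sizeBytes: e.size }));
}

// Hypothetical convenience wrapper: fetch and filter in one step (needs network).
async function fetchGgufFiles(modelId) {
  const res = await fetch(buildTreeUrl(modelId));
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return listGgufFiles(await res.json());
}
```

Keeping the filtering separate from the fetch makes it easy to unit-test against a saved response.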

For the download URL, you don't need ?download=true if you're using something like fetch. This query parameter is only needed if you want to open the URL in a browser; it tells the browser to download the file.

const blobUrl = `https://huggingface.co/${modelId}/resolve/main/${rfilename}`

Note that rfilename may include a subdirectory, for example rfilename = 'folder/file.gguf'. You may need to clean up the file name in this case.
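
One way to handle both points, sketched under assumptions: build the resolve URL from the full rfilename, and derive the on-device file name from its last path segment. The function names are hypothetical, and encoding each path segment is a defensive choice rather than something stated in this thread.

```javascript
// Build the direct download URL; no ?download=true is needed for fetch-style clients.
function buildDownloadUrl(modelId, rfilename, revision = "main") {
  // Encode each segment separately so the "/" separators are preserved.
  const encodedPath = rfilename.split("/").map(encodeURIComponent).join("/");
  return `https://huggingface.co/${modelId}/resolve/${revision}/${encodedPath}`;
}

// Strip any leading folders so 'folder/file.gguf' is saved locally as 'file.gguf'.
function localFileName(rfilename) {
  const parts = rfilename.split("/");
  return parts[parts.length - 1];
}
```

If two files in different folders share a base name, the local name would collide, so an app may prefer to keep part of the folder path.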

And lastly, some models have multiple parts, for example this repo. You can use this regex and the parser code above it to detect such cases (or maybe you can just leave out support for multiple parts for now, and add it in the future whenever you like ;-) )
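
A minimal sketch of multi-part detection, assuming the common split naming of the form name-00001-of-00003.gguf. The pattern and helper names below are illustrative and are not necessarily the regex or parser linked in the comment above.

```javascript
// Assumed naming convention: <prefix>-<5-digit part>-of-<5-digit total>.gguf
const SPLIT_RE = /^(.*)-(\d{5})-of-(\d{5})\.gguf$/;

// Return { prefix, part, total } for a split file, or null for a single-file model.
function parseSplitName(fileName) {
  const m = SPLIT_RE.exec(fileName);
  if (!m) return null;
  return { prefix: m[1], part: parseInt(m[2], 10), total: parseInt(m[3], 10) };
}

// True when the file name looks like one shard of a multi-part model.
function isMultiPart(fileName) {
  return parseSplitName(fileName) !== null;
}
```

An app that does not support multi-part models yet could use isMultiPart to hide or disable those files in the list.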

ngxson commented 1 week ago

(To correct what I said earlier), you need to add ?recursive=true in order to show files nested inside subdirectories. For example: https://huggingface.co/api/models/bartowski/Llama-3.1-Nemotron-70B-Instruct-HF-GGUF/tree/main?recursive=true

ngxson commented 1 week ago

And btw this endpoint with ?expand[]=gguf can give you some basic info about the model, like the total number of parameters and chat_template (models without a chat_template are not usable for chat)

https://huggingface.co/api/models/bartowski/Llama-3.1-Nemotron-70B-Instruct-HF-GGUF?expand[]=gguf
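
A small sketch of how an app might read that response. The field names gguf.total and gguf.chat_template are assumptions based on the description above, and the helper names are hypothetical.

```javascript
// Build the model-info URL with the gguf expansion shown above.
function buildModelInfoUrl(modelId) {
  return `https://huggingface.co/api/models/${modelId}?expand[]=gguf`;
}

// Pull out the parameter count and whether a chat template is present.
// Assumption: the response carries a `gguf` object with `total` and `chat_template`.
function summarizeGguf(modelInfo) {
  const gguf = modelInfo?.gguf ?? {};
  return {
    totalParams: typeof gguf.total === "number" ? gguf.total : null,
    hasChatTemplate:
      typeof gguf.chat_template === "string" && gguf.chat_template.length > 0,
  };
}
```

A result with hasChatTemplate set to false could be used to warn the user before download, or to fall back to a default template.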

a-ghorbani commented 1 week ago

Hey @ngxson, thanks! These are super helpful.

a-ghorbani commented 4 days ago

I've nearly finished the first version. I'll run a couple of tests and, if everything checks out, I plan to ship it in a few days.

https://github.com/user-attachments/assets/87bf4955-bb95-4922-b80e-f4307f1398e2

https://github.com/user-attachments/assets/3f6db7a1-f390-41bd-ae7c-0c80abf95b31

Thanks @julien-c @ngxson @Vaibhavs10 for the hints and tips—this would've taken much longer to integrate without your help!

ps: smollm2 135M - Only 88 MB and 127 t/s 🤯

ngxson commented 2 days ago

Wow that's amazing. Good job @a-ghorbani ! Thanks for taking time to implement this!