[Feat]: Add HuggingFace Integration

a-ghorbani commented 1 week ago

Description

Currently, if a GGUF file is not already in the app's default list of models, the user must manually navigate to Hugging Face, search for the model, download it to their phone, and add it as a local model. This process is cumbersome and hurts the user experience. With model search integrated into the app, users could search for models directly, choose the one they need, and add it without a separate download step.

Things to Consider

File Size Check: Determine if the app can verify whether the model file will fit on the device before downloading.
Model Information: The current UI displays model size and parameters. We need to identify a way to retrieve this data or adjust the UI to handle cases where this information isn't available (perhaps the same approach used for local models?).
Chat Templates and Default Settings: Need a method to detect/assign appropriate chat templates and default settings for models from Hugging Face.

Use Case

A user wants to add a GGUF model from Hugging Face. With integrated search and add functionality, the user can browse Hugging Face models, select the one they want, and add it directly within the app, which streamlines the process significantly.

Tasks

[x] Add Search Bottom Sheet
- [x] Search Bar
- [x] Display List of Repositories with GGUFs
- [x] Each repository entry should be touchable and open a model details bottom sheet.
- [ ] add pagination
[x] Add Model Details Sheet
- [x] GGUFs List
- [x] Handle cases where siblings are in sub-directories.
- [x] Handle sharded files. Sharded files can be detected using a method similar to this. They are currently filtered out; if needed, we can revisit this to add support for sharded files.
- [x] Download Action for GGUF files
- Adds the file to the models list and downloads it.
- [x] Bookmark Functionality for GGUF files
- Adds the file to the models list without downloading it.
- [x] Display "Updated ... Ago" Timestamp
- [x] Display Number of Likes
- [x] Display Number of Downloads
- [x] Display File Size
- Retrieve file sizes via: https://huggingface.co/api/models/${modelId}/tree/main?recursive=true (include ?recursive=true to show files nested within sub-directories).
- [x] If the file size exceeds phone capacity, disable download but allow bookmarking.
[ ] Remove ?download=true Parameter
- see #83
[x] Clean up FAB buttons
[ ] Test on Android, iOS, and various phone sizes.
[ ] Explore removing chat-formatter and moving to chat-template altogether?
- [ ] Needs handling for models that do not support a system prompt.
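
A minimal sketch of the capacity rule from the task list above ("if the file size exceeds phone capacity, disable download but allow bookmarking"). The helper name `getFileActions` and both of its parameters are hypothetical; how the app actually reads free storage is not covered in this thread. Leaving download enabled when the size is unknown is an assumed policy, not something decided here.

```javascript
// Hypothetical helper: decide which actions a GGUF entry should offer.
// fileSizeBytes: size reported by the Hub (may be missing).
// freeBytes: free storage on the device, obtained elsewhere (not shown).
function getFileActions(fileSizeBytes, freeBytes) {
  const sizeKnown = Number.isFinite(fileSizeBytes) && fileSizeBytes >= 0;
  const fits = sizeKnown && fileSizeBytes <= freeBytes;
  return {
    // Bookmarking only adds the entry to the models list, so it is always allowed.
    canBookmark: true,
    // Assumed policy: if the size is unknown, do not block the download.
    canDownload: !sizeKnown || fits,
  };
}
```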

julien-c commented 1 week ago

Chat Templates and Default Settings: Need a method to detect/assign appropriate chat templates and default settings for models from Hugging Face.

I think most GGUF files on the HF Hub now include their chat templates, but cc @Vaibhavs10 @ngxson for confirmation

We also have APIs for file sizes and number of params – and if you need other APIs let us know!

a-ghorbani commented 1 week ago

We also have APIs for file sizes and number of params – and if you need other APIs let us know!

This sounds great!

At the moment, I’ve built a quick MVP using /api/models. I'm impressed—the search is very fast! :)

Here's the setup I’m using for search: GET /api/models with the following params:

search: This is for the user search query, typically something like smollm2.
filter: The app sets this to gguf, assuming any repository containing GGUF files will have this tag.
full: Set to true to retrieve file names, like {"rfilename": "smollm2-1.7b-instruct-q4_k_m.gguf"}, which allows us to use these names for downloads.
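
As a rough sketch, the three params above could be assembled like this. The function name `buildSearchUrl` is made up for illustration; only `search`, `filter`, and `full` come from the list above.

```javascript
const HF_BASE = "https://huggingface.co";

// Hypothetical helper: build the search URL from a user query.
function buildSearchUrl(query) {
  const params = new URLSearchParams({
    search: query,  // user search text, e.g. "smollm2"
    filter: "gguf", // only repositories tagged as containing GGUF files
    full: "true",   // include siblings (file names) in each result
  });
  return `${HF_BASE}/api/models?${params.toString()}`;
}
```

The result can then be passed to `fetch` and the JSON body read as an array of repositories.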

Fields I'm inferring (assuming response is one of the items returned from /api/models):

Repository URL: https://huggingface.co/ + response.id.
Download file URL: https://huggingface.co/ + response.id + /resolve/main/ + response.siblings[i].rfilename + '?download=true'
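
A small sketch of how those two fields might be derived from one search result. The function names are hypothetical, and the `.gguf` extension filter is an assumption about how non-model files (README, config, etc.) would be skipped. The `?download=true` suffix is left out on the assumption that the file is fetched programmatically rather than opened in a browser.

```javascript
// Hypothetical helper: repository page URL for one /api/models result.
function repoUrl(model) {
  return `https://huggingface.co/${model.id}`;
}

// Hypothetical helper: one download URL per GGUF file listed in siblings.
function listGgufDownloads(model) {
  const siblings = model.siblings ?? [];
  return siblings
    .map((s) => s.rfilename)
    .filter((name) => typeof name === "string" && name.toLowerCase().endsWith(".gguf"))
    .map((rfilename) => ({
      rfilename,
      url: `https://huggingface.co/${model.id}/resolve/main/${rfilename}`,
    }));
}
```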

@Vaibhavs10 @ngxson, does this look correct to you??

What I still need help with:

GGUF file size and parameter counts. These aren't critical for the app to work, and I think I can work around their absence, but including them would be great for the user experience.

Initial prototype:

[Screenshot: simulator, iPhone 15 Pro Max, 2024-11-04]

Vaibhavs10 commented 1 week ago

Hi @a-ghorbani - Nice to meet ya! Big fan of your work! What you detail is correct. If you want a more fine-grained look at the API, head over here: https://huggingface.co/spaces/enzostvs/hub-api-playground to play with all the params and responses for the Search and Repo APIs.

This would give you the exact response that the hub returns along with the schema. 🤗

a-ghorbani commented 1 week ago

Nice to meet you too, @Vaibhavs10 👋

For the list of files, I'm currently seeing outputs like this:

"siblings": [
      {
        "rfilename": "smollm2-1.7b-instruct-q4_k_m.gguf"
      }
    ]

I was hoping to get additional information, like file size, so it would look more like this:

"siblings": [
      {
        "rfilename": "smollm2-1.7b-instruct-q4_k_m.gguf", "size": "2gb"
      }
    ]

Is it possible to get something like this at the moment? Otherwise, I'm happy to open a feature request.

ngxson commented 1 week ago

Hi @a-ghorbani , nice to meet you :D

You can get more information using this endpoint:

const modelInfoUrl = `https://huggingface.co/api/models/${modelId}/tree/main?recursive=true`

For example: https://huggingface.co/api/models/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main?recursive=true
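
For illustration, a sketch of turning that tree response into a size lookup. The entry shape used here (`type`, `path`, `size`, and an optional `lfs.size`) is an assumption about what the endpoint returns, so it is worth checking against a real response; the function names are hypothetical.

```javascript
// Assumed entry shape:
//   { type: "file" | "directory", path: string, size: number, lfs?: { size: number } }
function ggufSizesFromTree(entries) {
  const sizes = new Map();
  for (const entry of entries) {
    if (entry.type !== "file") continue;
    if (!entry.path.toLowerCase().endsWith(".gguf")) continue;
    // Prefer the LFS size when present, otherwise fall back to the plain size.
    sizes.set(entry.path, entry.lfs?.size ?? entry.size);
  }
  return sizes;
}

// Format a byte count for display, using decimal gigabytes.
function formatGB(bytes) {
  return `${(bytes / 1e9).toFixed(2)} GB`;
}

// Fetch the tree for a repository and return a Map of path -> size in bytes.
async function fetchGgufSizes(modelId) {
  const res = await fetch(`https://huggingface.co/api/models/${modelId}/tree/main?recursive=true`);
  if (!res.ok) throw new Error(`tree request failed: ${res.status}`);
  return ggufSizesFromTree(await res.json());
}
```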

For the download URL, you don't need ?download=true if you're using something like fetch. This query parameter is only used if you want to open the URL in a browser; it tells the browser to download the file.

const blobUrl = `https://huggingface.co/${modelId}/resolve/main/${rfilename}`

Pay attention that rfilename may include a sub-directory, for example rfilename = 'folder/file.gguf'. You may need to clean up the file name in this case.
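
One way the clean-up could look, as a sketch: keep the full rfilename for the URL, and use only the last path segment as the local file name. `splitRfilename` is a hypothetical helper, not something from the app.

```javascript
// Hypothetical helper: separate the sub-directory from the file name.
function splitRfilename(rfilename) {
  const parts = rfilename.split("/");
  const fileName = parts.pop(); // last segment, e.g. "file.gguf"
  return { dir: parts.join("/"), fileName };
}

const modelId = "user/repo";            // placeholder repository id
const rfilename = "folder/file.gguf";
// The URL keeps the full path; only the local name drops the folder.
const blobUrl = `https://huggingface.co/${modelId}/resolve/main/${rfilename}`;
const { fileName } = splitRfilename(rfilename);
```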

And lastly, some models have multiple parts, for example this repo. You can use this regex and the parser code above it to detect such cases (or maybe you can just leave out support for multiple parts for now, and add it in the future whenever you like ;-) )
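
A stand-in sketch for detecting multi-part files, since the linked regex isn't reproduced in this thread. It assumes the `-00001-of-00003.gguf` suffix (five-digit part and total) that llama.cpp's gguf-split tool produces; the pattern and function names below are illustrative, not the linked code.

```javascript
// Assumed naming convention: <prefix>-<5 digits>-of-<5 digits>.gguf
const SHARD_RE = /^(.*)-(\d{5})-of-(\d{5})\.gguf$/;

// Returns shard info for a multi-part file, or null for a single-file GGUF.
function parseShard(rfilename) {
  const m = SHARD_RE.exec(rfilename);
  if (!m) return null;
  return { prefix: m[1], shard: Number(m[2]), total: Number(m[3]) };
}

// Simplest option for now: leave multi-part files out of the list.
function dropSharded(rfilenames) {
  return rfilenames.filter((name) => parseShard(name) === null);
}
```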

ngxson commented 1 week ago

(To correct what I said earlier), you need to add ?recursive=true in order to show files nested inside sub directories. For example: https://huggingface.co/api/models/bartowski/Llama-3.1-Nemotron-70B-Instruct-HF-GGUF/tree/main?recursive=true

ngxson commented 1 week ago

And btw this endpoint with ?expand[]=gguf can give you some basic info about the model, like the total number of parameters and chat_template (models without a chat_template are not usable for chat)

https://huggingface.co/api/models/bartowski/Llama-3.1-Nemotron-70B-Instruct-HF-GGUF?expand[]=gguf
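
A sketch of reading those fields from the response. The field names `gguf.total` and `gguf.chat_template` are assumptions about the response shape and should be confirmed against a real response; `summarizeGguf` and `formatParams` are hypothetical helpers.

```javascript
// Assumed response shape: { id: string, gguf?: { total?: number, chat_template?: string } }
function summarizeGguf(modelInfo) {
  const gguf = modelInfo.gguf ?? {};
  const chatTemplate = typeof gguf.chat_template === "string" ? gguf.chat_template : null;
  return {
    totalParams: typeof gguf.total === "number" ? gguf.total : null,
    chatTemplate,
    // Without a chat template the model is treated as not usable for chat.
    usableForChat: chatTemplate !== null && chatTemplate.length > 0,
  };
}

// Compact display of a parameter count, e.g. for the model card in the UI.
function formatParams(n) {
  if (n >= 1e9) return `${(n / 1e9).toFixed(1)}B`;
  if (n >= 1e6) return `${(n / 1e6).toFixed(0)}M`;
  return String(n);
}
```

When `totalParams` comes back as null, the UI could fall back to whatever it already shows for local models.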

a-ghorbani commented 1 week ago

Hey @ngxson, thanks! These are super helpful.

a-ghorbani commented 4 days ago

I've nearly finished the first version. I'll run a couple of tests and, if everything checks out, I plan to ship it in a few days.

https://github.com/user-attachments/assets/87bf4955-bb95-4922-b80e-f4307f1398e2

https://github.com/user-attachments/assets/3f6db7a1-f390-41bd-ae7c-0c80abf95b31

Thanks @julien-c @ngxson @Vaibhavs10 for the hints and tips—this would've taken much longer to integrate without your help!

ps: smollm2 135M - Only 88 MB and 127 t/s 🤯

ngxson commented 2 days ago

Wow that's amazing. Good job @a-ghorbani ! Thanks for taking time to implement this!

a-ghorbani / pocketpal-ai

[Feat]: Add HuggingFace Integration #83
