huggingface / huggingface.js

Utilities to use the Hugging Face Hub API
https://hf.co/docs/huggingface.js
MIT License

Document how to run local Inference, for the subset of models that support it #82

Open julien-c opened 1 year ago

julien-c commented 1 year ago

i.e. a how-to guide (or set of guides) on how to use TFJS or onnxruntime.js (or other alternatives) on either client or server JS

See these Twitter threads, for instance:

flozi00 commented 1 year ago

https://github.com/xenova/transformers.js

Pipelines in js 🤗
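For context, a transformers.js pipeline follows the same preprocess → model → postprocess pattern as the Python library. The snippet below is a dependency-free sketch of that pattern; the real library is used as `import { pipeline } from '@xenova/transformers'`, while the tokenizer and model here are toy stand-ins so the sketch runs without downloading any weights:

```javascript
// Sketch of the pipeline pattern popularized by transformers.js.
// The real library is used roughly like:
//   import { pipeline } from '@xenova/transformers';
//   const classify = await pipeline('sentiment-analysis');
//   const result = await classify('I love transformers!');
// The tokenizer/model below are toy stand-ins, NOT the library's
// internals, so this runs without downloading any weights.

function makePipeline(tokenize, model, postprocess) {
  // A pipeline is just: preprocess -> run model -> postprocess.
  return async (text) => {
    const tokens = tokenize(text);
    const logits = await model(tokens);
    return postprocess(logits);
  };
}

// Toy stand-ins for a real tokenizer and model.
const toyTokenize = (text) => text.toLowerCase().split(/\s+/);
const toyModel = async (tokens) =>
  tokens.includes('love') ? [0.1, 0.9] : [0.9, 0.1];
const toyPostprocess = ([neg, pos]) =>
  pos > neg ? { label: 'POSITIVE', score: pos } : { label: 'NEGATIVE', score: neg };

const classify = makePipeline(toyTokenize, toyModel, toyPostprocess);
```

With the toy components above, `await classify('I love transformers!')` yields `{ label: 'POSITIVE', score: 0.9 }`; swapping in a real tokenizer and ONNX model gives you the library's behavior.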

carrycooldude commented 1 year ago

Can I take this @julien-c ?

visheratin commented 1 year ago

Web AI supports image and text models - https://github.com/visheratin/web-ai

Plus, it already integrates Hugging Face tokenizers for text models.

jasonmayes commented 1 year ago

Hello everyone. Jason here, representing Google's Web ML teams, including TensorFlow.js. First, it's great to see so many options here to look into further; just a few years ago that was not the case, so kudos to everyone involved in the Web ML space right now. Great work!

I just wanted to weigh in with the TensorFlow.js perspective, in case it is also up for consideration by Hugging Face, and to list some resources you can use to get started with conversion, along with the current state of our ecosystem and why we see folk caring about this. Hope it helps! Let's get to it...

Why though?

Here at Google we are seeing a number of industries really embrace client-side ML models (such as healthcare, fitness, HCI, design, and retail/CPG). The main reasons are as follows:

  1. Privacy of sensor data such as webcams and mics - no data is sent to the cloud for inference.
  2. Lower latency for real-time applications. Instead of tens of FPS, we can get hundreds of FPS for quality models like body segmentation and pose estimation.
  3. Lower costs - it's expensive to run everything server side. We are seeing more folk embrace hybrid or fully client-side setups to reduce their costs. Companies like Roboflow push many models to run client side to save on the server-side costs of hosting those models for users to try out. They are probably the closest example we have right now to someone like HF, who are also hosting models for folk to try; Roboflow have over 10K compatible Web ML models. See this interview for more detail.
  4. Zero install - many companies block the arbitrary execution of binaries.
  5. The reach and scale of the web - anyone, anywhere can click a link and it just works, with no server-side setup or maintenance (install TF/PyTorch, install CUDA, clone a GitHub repo, read the README, install the deps, and pray it works - we have all been there). Instead, share a link and it just works, which is great for getting more eyes on research models that need feedback to find bugs and biases faster.

So how do we port models to TensorFlow.js?

Here are a number of resources to get started:

  1. I made a YouTube course to upskill folk in Web ML, and in Chapter 6 I cover conversion at a high level. It covers how to export a Python model to a SavedModel and run it through our command-line converter, or how to take an existing Keras model or Python SavedModel and run that through the converter.

  2. If you prefer a written guide, check this link on the TensorFlow.js docs instead.

  3. Many models will convert using the methods listed above. However, when those do not work, the main issue you will run into for more complex models is missing ops in the client-side implementation of TensorFlow.js. In that case you have two options:
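For reference, the command-line conversion from steps 1 and 2 above typically looks like the following. The paths are placeholders for your own model directories, and this assumes a working Python environment:

```shell
# Install the TensorFlow.js converter (assumes a Python environment).
pip install tensorflowjs

# Convert a TensorFlow SavedModel to a TF.js graph model.
# Both paths below are placeholders.
tensorflowjs_converter \
  --input_format=tf_saved_model \
  --output_format=tfjs_graph_model \
  /path/to/saved_model \
  /path/to/web_model
```

The output directory can then be served statically and loaded in the browser with TensorFlow.js.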

On that note, we now have greater parity between our backends (a backend here being the technology we actually execute the model on, not a server-side backend), such as WebGL, WebGPU, WebAssembly, and of course plain old JS as the backup if all else fails. We are working toward higher op parity between these backends so that more models can run on any backend in the future, so there is good momentum there.
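The fallback ordering described above can be sketched as a simple preference list. This is a hypothetical illustration, not TensorFlow.js's actual selection logic (which lives inside the library, behind APIs like `tf.setBackend`); only the backend names come from the paragraph above:

```javascript
// Hypothetical sketch of backend fallback ordering. The backend names
// mirror the ones mentioned above; the real selection logic is
// internal to TensorFlow.js.
function pickBackend(available, preference = ['webgpu', 'webgl', 'wasm', 'cpu']) {
  for (const name of preference) {
    if (available.has(name)) return name; // first supported backend wins
  }
  return 'cpu'; // plain old JS as the backup if all else fails
}
```

For example, in a browser without WebGPU, `pickBackend(new Set(['webgl', 'cpu']))` returns `'webgl'`; higher op parity between backends means the model behaves the same whichever branch is taken.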

Do shout if you have any questions, and if you find a specific blocker please let me know so I can bring it up with our team and see if it is something we can unblock :-)

tqchen commented 1 year ago

Would love to come and share some of the recent projects we are doing https://github.com/mlc-ai/web-stable-diffusion

This project also benefits a lot from taking HF models and bringing them into the web (thanks to the broader open source community). We would love to support the community in doing more, and to see more interaction and support across the broader ecosystem.