This is for a customer of the paid version of the Inference API who is looking to send bulk volumes of sentences (e.g. 1,000) in short spikes.
The API works with batches out of the box, but obviously not batches of arbitrarily large size :)
We also don't want them to send all requests in parallel, so this is an example of how they can split their documents into batches and send those batches sequentially.
We don't necessarily have to merge this if it's too localized.
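For reference, a minimal sketch of the pattern in Python (the model name, token placeholder, and batch size of 64 are illustrative assumptions, not part of this PR):

```python
import requests

# Placeholder model and token -- substitute your own.
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
HEADERS = {"Authorization": "Bearer <YOUR_API_TOKEN>"}

def chunks(items, batch_size):
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

def query_in_batches(sentences, batch_size=64):
    """Send sentences to the Inference API one batch at a time, sequentially."""
    results = []
    for batch in chunks(sentences, batch_size):
        response = requests.post(API_URL, headers=HEADERS, json={"inputs": batch})
        response.raise_for_status()
        results.extend(response.json())
    return results

if __name__ == "__main__":
    sentences = [f"Sample sentence number {i}" for i in range(1000)]
    outputs = query_in_batches(sentences)
    print(len(outputs))
```

Each batch waits for the previous response before being sent, so 1,000 sentences become a handful of sequential requests instead of 1,000 parallel ones.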