sleepingcat4 opened 2 months ago
I am working on the documentation. Here is the Setup.md file.
I’ve set up the GitHub Pages site for the documentation. You can access it here. Please check if the documentation is useful and let me know if there are any issues or improvements needed.
@kevaldekivadiya2415 the documentation is too simple and does not cover workers and batches. Besides, batches are tricky when you are sending a huge payload to the model followed by 3-4 more operations (small, yet they use workers).
I would also love to see a benchmark against the TEI library from HF, and to know whether there is Intel Gaudi hardware support.
--workers: This argument specifies the number of worker processes to be used for batch processing. Increasing the number of workers allows the system to handle multiple batches in parallel, improving throughput, especially when processing a high volume of requests.
--batch_size: This parameter defines how many requests are grouped into each batch before they are passed to the model for processing.
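As a rough mental model of how the two flags interact (a minimal sketch, not the library's actual implementation; `embed_batch` here is a hypothetical stand-in for whatever the server runs per batch):

```python
# Sketch: --batch_size splits the payload into batches, --workers runs them in parallel.
from concurrent.futures import ProcessPoolExecutor
from typing import List

def embed_batch(batch: List[str]) -> List[List[float]]:
    # Placeholder embedding: each text maps to a small dummy vector.
    return [[float(len(text))] for text in batch]

def run(texts: List[str], workers: int = 4, batch_size: int = 32):
    # Split the incoming payload into fixed-size batches (--batch_size).
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    # Hand the batches to a pool of worker processes (--workers),
    # so several batches are embedded in parallel.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(embed_batch, batches))
    # Flatten per-batch results back into one ordered list.
    return [vec for batch in results for vec in batch]

if __name__ == "__main__":
    embeddings = run([f"text {i}" for i in range(1000)], workers=4, batch_size=32)
    print(len(embeddings))  # 1000
```

The point of the sketch is the trade-off: more workers raises throughput for many concurrent requests, while batch size controls how much work each model call gets at once.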
@kevaldekivadiya2415 can you provide a benchmark on 5000 rows using batch processing? I tried batch processing before with the HF TEI Gaudi repo and it was a disaster. It was easier to run sequentially, since I could reach 8000 rows in 27 seconds.
I have supercomputers as well as a limited number of small machines. If you could give me a ballpark figure for throughput on a roughly one-page payload, with the results kept in RAM, I would be eager to check out your library.
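For reference, this is the kind of timing harness I have in mind (a sketch only: the endpoint URL, payload shape, and response format below are assumptions, not the library's documented API, so they would need to be adjusted to whatever the server actually exposes):

```python
# Sketch: send N rows in batches to an embedding endpoint and report rows/second,
# keeping all results in RAM as described above.
import time
import requests

URL = "http://localhost:8000/embed"  # hypothetical endpoint
ROWS = [f"row {i}: roughly one page of text ..." for i in range(5000)]
BATCH_SIZE = 32

def benchmark(rows, batch_size):
    embeddings = []  # results stay in RAM
    start = time.perf_counter()
    for i in range(0, len(rows), batch_size):
        batch = rows[i:i + batch_size]
        resp = requests.post(URL, json={"inputs": batch}, timeout=60)
        resp.raise_for_status()
        embeddings.extend(resp.json())  # assumes the response is a list of vectors
    elapsed = time.perf_counter() - start
    print(f"{len(rows)} rows in {elapsed:.1f}s -> {len(rows) / elapsed:.0f} rows/s")
    return embeddings

if __name__ == "__main__":
    benchmark(ROWS, BATCH_SIZE)
```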
I can't find the documentation, and the Setup.md file returns a 404 error. Are there docs available?