michaelfeil closed 4 months ago
@alpayariyak Ready for review / merge.
Incredible work, thank you so much @michaelfeil! Will review shortly
@alpayariyak Sorry for pinging, but it would be great to merge this PR as is - and add any additional features, if needed, at a later point in time, to not overload this PR
Hey @michaelfeil, lmk if there's anything you'd like to see before I cut an official release, but should be all good!
Docker for testing:
michaelf34/runpod-infinity-worker:0.0.4
I recently added multi-model deployment:
Adds:
HF_TOKEN; for convenience
sm>=89.
Something that could be useful:
.embed adds it to this queue. To handle backpressure, it may be better to reject requests instead of adding them to the queue, and give the runpod-serverless runtime the opportunity to retry, potentially hitting a new worker or scaling to more workers.
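A rough sketch of the reject-instead-of-enqueue idea, in case it helps the discussion. Everything here is hypothetical: `BoundedEmbedQueue`, `QueueFullError`, `max_pending`, and the `embed_fn` argument are illustrative names, not part of Infinity or the runpod SDK. How the rejection gets surfaced to the runpod-serverless runtime so that it retries is not shown and would need to be checked against the runtime's actual behavior.

```python
import asyncio


class QueueFullError(Exception):
    """Raised when the worker refuses new work (hypothetical name)."""


class BoundedEmbedQueue:
    """Hypothetical wrapper that rejects work once max_pending items are in flight."""

    def __init__(self, max_pending: int) -> None:
        self.max_pending = max_pending
        self.pending = 0  # number of sentences currently queued or being embedded

    async def submit(self, embed_fn, sentences):
        # Reject up front instead of growing the backlog without bound.
        if self.pending + len(sentences) > self.max_pending:
            raise QueueFullError(
                f"{self.pending} items pending, limit is {self.max_pending}"
            )
        self.pending += len(sentences)
        try:
            # embed_fn stands in for whatever coroutine actually enqueues the work.
            return await embed_fn(sentences)
        finally:
            # Always release the slots, even if embedding fails.
            self.pending -= len(sentences)
```

The worker's handler could catch `QueueFullError` and return an error for that job, leaving it to the caller or runtime to retry elsewhere; the right value for `max_pending` would depend on model size and batch throughput.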