dsikka closed 11 months ago
Re: the assertion error I hit earlier. I ran 10 prompts 700 times with max_workers=1 and never hit it.
So the bug is almost certainly related to concurrency, but I can't offer any more insight into whether the problem lies in the engine or in this code when it runs concurrently.
Since the way we're scheduling operators has changed, we need to reassess the async functionality.
This PR has been updated to use the new operator scheduling with the run async function.
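For readers unfamiliar with the pattern, here is a minimal sketch of running operators on an asyncio loop so that several requests can be in flight at once. It is not the PR's code: the operators (tokenize, count_tokens), the OPERATORS list, and the signatures of run_one and run_async are all hypothetical stand-ins, and the real pipeline's scheduling may differ.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for pipeline operators; these are NOT deepsparse classes.
def tokenize(prompt: str) -> list[str]:
    return prompt.split()


def count_tokens(tokens: list[str]) -> int:
    return len(tokens)


OPERATORS = [tokenize, count_tokens]


async def run_one(prompt, executor):
    """Run every operator in order for a single prompt."""
    loop = asyncio.get_running_loop()
    value = prompt
    for op in OPERATORS:
        # Awaiting the executor future yields control to the event loop,
        # so other requests can make progress while this operator runs.
        value = await loop.run_in_executor(executor, op, value)
    return value


async def run_async(prompts, executor):
    """Assumed shape: split into one task per prompt, then join the results."""
    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(*(run_one(p, executor) for p in prompts))


async def main():
    with ThreadPoolExecutor(max_workers=2) as executor:
        # Two "requests" with different numbers of prompts, awaited together.
        return await asyncio.gather(
            run_async(["hello world"], executor),
            run_async(["a b c", "one", "two words"], executor),
        )


if __name__ == "__main__":
    print(asyncio.run(main()))  # [[2], [3, 1, 2]]
```

Setting max_workers=1 in this sketch serializes the operator calls, which mirrors the single-worker experiment described above. It only illustrates the knob and does not reproduce the assertion error.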
Summary
- Adds a run_async function to be used by the deepsparse server. This introduces an asyncio loop to execute each operator and allows us to await operation completion, so that multiple requests can be accepted without blocking.
- run_async can handle multiple prompts and works with split/join.
Testing
The following script makes multiple calls (with different numbers of prompts) using the run_async function.
Output: