deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

What are the options for concurrent AI model inference in DJL? #2836

Open SidneyLann opened 10 months ago

SidneyLann commented 10 months ago

Description

What are the options for concurrent AI model inference in DJL? Can multiple threads access a model at the same time? Is Nvidia Triton supported?

Will this change the current API? How?

Who will benefit from this enhancement?

References

frankfliu commented 10 months ago

DJL is a low-level library. For serving, we have DJLServing, a model server designed as a general-purpose inference platform. We also support running tritoncore inside DJLServing. Please take a look: https://docs.djl.ai/master/docs/serving/index.html
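
On the multithreading part of the question, a common pattern at the library level is to load the model once, share it across threads, and give each thread its own predictor. As far as I understand the DJL documentation, that corresponds to sharing one `ZooModel` and calling `model.newPredictor()` per thread, since a `Predictor` is not designed to be shared. The sketch below only illustrates the pattern with the JDK: `ToyModel`, `ToyPredictor`, and `runConcurrent` are hypothetical stand-ins, not DJL classes, and the "inference" is a plain multiplication.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedModelDemo {

    /** Hypothetical stand-in for a loaded model: immutable, so safe to share across threads. */
    static final class ToyModel {
        private final int weight;

        ToyModel(int weight) {
            this.weight = weight;
        }

        /** Each caller gets its own predictor instance. */
        ToyPredictor newPredictor() {
            return new ToyPredictor(this);
        }
    }

    /** Hypothetical stand-in for a predictor: keeps mutable scratch state, so not thread-safe. */
    static final class ToyPredictor implements AutoCloseable {
        private final ToyModel model;
        private int scratch; // per-call working state; must not be shared between threads

        ToyPredictor(ToyModel model) {
            this.model = model;
        }

        int predict(int input) {
            scratch = input;
            scratch = scratch * model.weight;
            return scratch;
        }

        @Override
        public void close() {
            // A real predictor would release native resources here.
        }
    }

    /** Runs one task per thread; every task creates and closes its own predictor. */
    static List<Integer> runConcurrent(int threads, int requestsPerThread) throws Exception {
        ToyModel model = new ToyModel(3); // loaded once, shared by all tasks
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int base = t * requestsPerThread;
                futures.add(pool.submit(() -> {
                    int sum = 0;
                    try (ToyPredictor predictor = model.newPredictor()) {
                        for (int i = 0; i < requestsPerThread; i++) {
                            sum += predictor.predict(base + i);
                        }
                    }
                    return sum;
                }));
            }
            List<Integer> sums = new ArrayList<>();
            for (Future<Integer> future : futures) {
                sums.add(future.get()); // results come back in submission order
            }
            return sums;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runConcurrent(4, 5));
    }
}
```

With real DJL code the predictor would typically be opened in the same try-with-resources style so its native resources are released. If you need batching, scaling, or Triton, a model server such as DJLServing is the better fit than hand-rolled thread pools.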