deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0
4.16k stars, 660 forks

What are the concurrency solutions for AI model inference in DJL? #2836

Open SidneyLann opened 1 year ago

SidneyLann commented 1 year ago

Description

What are the concurrency solutions for AI model inference in DJL? Can multiple threads access a model at the same time? Is NVIDIA Triton supported?

frankfliu commented 1 year ago

DJL is a low-level library. For serving, we have DJLServing, a model server designed as a general inference platform. We also support running tritoncore inside DJLServing. Please take a look: https://docs.djl.ai/master/docs/serving/index.html
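
On the multithreading part of the question: as far as I know, the DJL documentation treats a loaded model as something you can share across threads, but recommends a separate `Predictor` per thread, since `Predictor` is not guaranteed to be thread-safe. Below is a minimal sketch of that pattern, not DJL's own code. `ToyModel`, `ToyPredictor`, and `runConcurrently` are hypothetical stand-ins so the example runs with only the JDK. With DJL on the classpath you would call `model.newPredictor()` and `predictor.predict(...)` in the same places.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedModelDemo {

    /** Hypothetical stand-in for a loaded model: its weights are read-only, so it can be shared. */
    static final class ToyModel {
        private final int scale;
        private final AtomicInteger predictorsCreated = new AtomicInteger();

        ToyModel(int scale) {
            this.scale = scale;
        }

        /** Mirrors the shape of DJL's model.newPredictor(): each call returns a fresh predictor. */
        ToyPredictor newPredictor() {
            predictorsCreated.incrementAndGet();
            return new ToyPredictor(scale);
        }

        int predictorsCreated() {
            return predictorsCreated.get();
        }
    }

    /** Hypothetical stand-in for a predictor: it keeps mutable scratch state, so it is NOT thread-safe. */
    static final class ToyPredictor implements AutoCloseable {
        private final int scale;
        private int scratch; // per-predictor working buffer

        ToyPredictor(int scale) {
            this.scale = scale;
        }

        int predict(int input) {
            scratch = input;
            return scratch * scale;
        }

        @Override
        public void close() {
            // A real predictor would release native resources here.
        }
    }

    /** Runs one task per thread; every task creates and closes its own predictor. */
    static List<Integer> runConcurrently(ToyModel model, int threads, int requestsPerThread)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int offset = t * requestsPerThread;
                Callable<List<Integer>> task = () -> {
                    // The model is shared; the predictor is private to this task.
                    try (ToyPredictor predictor = model.newPredictor()) {
                        List<Integer> out = new ArrayList<>();
                        for (int i = 0; i < requestsPerThread; i++) {
                            out.add(predictor.predict(offset + i));
                        }
                        return out;
                    }
                };
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<Integer> all = new ArrayList<>();
            for (Future<List<Integer>> future : futures) {
                all.addAll(future.get());
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        ToyModel model = new ToyModel(2);
        List<Integer> results = runConcurrently(model, 4, 3);
        System.out.println(results.size() + " results from "
                + model.predictorsCreated() + " predictors");
    }
}
```

If creating a predictor per task turns out to be too costly, keeping one predictor per worker thread (for example in a `ThreadLocal`) is a common variation. Whether that pays off depends on the engine, so treat it as something to measure rather than a rule.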