alibaba / pemja

Apache License 2.0
Integrate pemja in java web to call python deep learning model #13

Open ygean opened 2 years ago

ygean commented 2 years ago

Hi, is it possible to use pemja in a Java web application to call a deep learning model written in Python and expose it as an AI service? Our scenario: we have two teams. One is good at writing Java web applications; the other uses Python to research machine translation models. We hope to use pemja to connect the two teams and deliver a robust, efficient model service, so that neither a C++ team nor a Python team has to develop the web service. By design, does pemja support Java calls to a persistent Python instance held in RAM or GPU memory?

ygean commented 2 years ago

@HuangXingBo

HuangXingBo commented 2 years ago

Yes. This is a very typical way of using Pemja.

HuangXingBo commented 2 years ago

@zhouyuangan Have you started integrating with pemja yet? Have you run into any problems?

285220927 commented 2 years ago

Yes, this is a good project, but we ran into a problem. Our Python class initializes the inference engine in its init method, which is very time-consuming, while the inference function itself is fast. So in Java we use a static code block to instantiate the Python class and obtain a Python object. This initializes the Python inference engine once, so individual requests in the Java web application do not need to initialize it again. However, in Python the initialization function and the inference function must be called in the same thread, so in Java we can only use "Executors.newSingleThreadExecutor()" to run both the initialization method and the inference method on the same thread when calling Python. Is there a better way, or some optimization?
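For readers following along, the workaround described above can be sketched roughly as follows. This is only an illustration under assumptions: `SingleThreadGateway` is a hypothetical helper, and the engine type `E` stands in for whatever object wraps the pemja `PythonInterpreter` and the Python inference object. No pemja calls are shown, so the sketch runs with the JDK alone.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical helper: builds the engine and runs every call on one dedicated thread. */
final class SingleThreadGateway<E> implements AutoCloseable {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final E engine;

    /** The factory runs on the worker thread, so initialization and inference share a thread. */
    SingleThreadGateway(Supplier<E> engineFactory) {
        Callable<E> init = engineFactory::get;
        this.engine = await(worker.submit(init));
    }

    /** Runs the given call against the engine on the worker thread and waits for the result. */
    <R> R call(Function<E, R> inference) {
        Callable<R> task = () -> inference.apply(engine);
        return await(worker.submit(task));
    }

    /** Blocks for the result and converts checked exceptions into unchecked ones. */
    private static <T> T await(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("engine call failed", e.getCause());
        }
    }

    /** Stops the worker thread once queued calls have finished. */
    @Override
    public void close() {
        worker.shutdown();
    }
}
```

Any number of request threads can share one gateway instance. Calls are serialized on the worker thread, so throughput is bounded by the speed of a single inference call.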

HuangXingBo commented 2 years ago

@285220927 Why not create multiple PythonInterpreters? You could use multiple threads (each with its own PythonInterpreter) or multiple PythonInterpreter instances in one thread.
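The first option, one interpreter per thread, might look something like the sketch below. `PerThreadEngine` is a hypothetical name, and the factory is a placeholder for code that would create a `PythonInterpreter` and initialize the Python object; the sketch itself uses only the JDK.

```java
import java.util.function.Supplier;

/** Hypothetical helper: lazily creates one engine per calling thread. */
final class PerThreadEngine<E> {
    private final ThreadLocal<E> engines;

    /** The factory is invoked at most once per thread, on that thread's first get(). */
    PerThreadEngine(Supplier<E> factory) {
        this.engines = ThreadLocal.withInitial(factory);
    }

    /** Returns the engine owned by the current thread, creating it if needed. */
    E get() {
        return engines.get();
    }
}
```

With this layout each thread initializes and uses its own engine, so initialization and inference always run on the same thread. The cost is one full initialization per thread.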

285220927 commented 2 years ago

Initializing an inference engine takes a lot of GPU resources, so we want to create only one PythonInterpreter. We are looking for a way to use pemja to call Python concurrently within the same thread.

HuangXingBo commented 2 years ago

@285220927 Do you mean [a Java thread -> a Python interpreter -> an inference engine -> multiple Python threads]?

285220927 commented 2 years ago

We want [multiple Java threads -> a Python interpreter -> an inference engine -> a single Python thread]. We want Python to always run in the same thread, no matter how many times it is called through pemja. By printing the Python thread id we found: Java calls from the same thread => Python always runs in one thread; Java calls from multiple threads => Python does not run in the same thread.
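The observation about thread ids can be reproduced without Python, assuming (as the printed ids suggest) that the embedded call executes on whichever Java thread invokes it. In the sketch below `fakePythonCall` is a hypothetical stand-in for a pemja call that simply reports the thread it ran on, and `ThreadAffinityCheck` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadAffinityCheck {
    /** Hypothetical stand-in for a pemja call: reports which thread executed it. */
    private static Thread fakePythonCall() {
        return Thread.currentThread();
    }

    /**
     * Issues `calls` requests from a pool of `callerThreads` Java threads and returns
     * the distinct threads that executed the call. With funnel=true every request is
     * handed to one worker thread; with funnel=false each caller runs the call itself.
     */
    static Set<Thread> observe(int callerThreads, int calls, boolean funnel) {
        ExecutorService callers = Executors.newFixedThreadPool(callerThreads);
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                pending.add(callers.submit(() -> {
                    Callable<Thread> call = ThreadAffinityCheck::fakePythonCall;
                    Thread ranOn = funnel ? worker.submit(call).get() : call.call();
                    seen.add(ranOn);
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every request to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            callers.shutdown();
            worker.shutdown();
        }
        return seen;
    }
}
```

Calling `ThreadAffinityCheck.observe(4, 8, false)` reports several distinct threads, matching the multi-threaded case described above, while `ThreadAffinityCheck.observe(4, 8, true)` reports exactly one, which is the [multiple Java threads -> a single Python thread] layout.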