ninia / jep

Embed Python in Java

jep with pytorch happens errors #399

Closed pyNpy closed 2 years ago

pyNpy commented 2 years ago

Describe the problem

Hello, in my project I use jep to call Python 3 deep learning code that classifies input text. I run the following code 1000 times.

In the normal case, the console prints a string like "Evaluating: 100%|████████████████████████████| 1/1 [00:00<00:00, 1.41it/s]".

But after several iterations (we cannot determine the exact number; it may be 4, 5, or some other number), the program runs into a problem and the console prints a string like this: "Evaluating: 0%| | 0/1 [00:00<?, ?it/s] "

I traced the Python script down to torch.nn.module.eval, and I confirmed that execution reaches torch.nn.module.eval because I print the flag string >>>>>>>>>torch.nn.module.eval there, as shown in the following picture:

image

About the code

The loop that runs 1000 times (Java calling Python):

image

Other Java code:

image

Questions

  1. Have you ever seen a problem like this?

  2. I think the problem may be in PyTorch, but I have no further idea how to trace the code, because torch.nn.module.eval is the last call made in Python. Maybe I need to look at the code near the progress bar.

  3. Is it possible that the problem is in jep? Somewhere I saw a description saying that Python global static variables can be affected by multiple threads. But the Java code that calls the Python code runs in a single process and a single thread, so I do not see how that would apply here.

Last words: the questions above are all I can think of. I have not seen this error before. Can you give me some ideas? Thanks a lot.

bsteffensmeier commented 2 years ago

Unfortunately your problem does not point to any specific issue we are aware of. I have no solution, but I have a few suggestions for things to try to narrow down the problem.

  1. Try running the same scenario in Python without jep, including looping 1000 times and doing the same operation. If the problem is specific to PyTorch and completely unrelated to jep, this would also fail and clearly indicate that jep is not the source of the problem.
  2. Try creating only one shared interpreter, looping 1000 times within that single interpreter, and doing the same task. If this is successful, it would indicate there may be a problem related to the way jep cleans up state when an interpreter closes or creates new state when a new interpreter is opened.
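
As a rough illustration of the first suggestion, a pure-Python harness might look like the sketch below. The names `run_repeatedly` and `classify` are hypothetical and not part of jep or PyTorch; `classify` is only a stand-in that should be replaced with the real classification call from the project.

```python
import traceback
from typing import Callable, List, Tuple


def run_repeatedly(
    task: Callable[[str], object], text: str, iterations: int = 1000
) -> List[Tuple[int, str]]:
    """Call task(text) repeatedly and collect (iteration, traceback) per failure."""
    failures = []
    for i in range(iterations):
        try:
            task(text)
        except Exception:
            # Keep the full traceback so the failing iteration can be inspected later.
            failures.append((i, traceback.format_exc()))
    return failures


def classify(text: str) -> str:
    # Hypothetical stand-in: replace with the real PyTorch classification call.
    return "positive" if "good" in text else "negative"


if __name__ == "__main__":
    failures = run_repeatedly(classify, "some input text")
    print(f"{len(failures)} failures out of 1000 iterations")
    for i, tb in failures[:5]:
        print(f"iteration {i} failed:\n{tb}")
```

This only catches Python exceptions. If the real failure is a stall, as the stuck progress bar suggests, the loop would simply stop advancing, so printing the iteration index inside the loop may help locate it.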

pyNpy commented 2 years ago

Yes, I will try the ideas you suggested and run more tests. Thanks.