aksnzhy / xlearn

High-performance, easy-to-use, and scalable machine learning (ML) package, including linear models (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and CLI interfaces.
https://xlearn-doc.readthedocs.io/en/latest/index.html
Apache License 2.0
3.09k stars 519 forks

xlearn Python API's .predict method doesn't kill the threads it creates after execution, which leads to resource exhaustion. #363

Open HovhannesManushyan opened 3 years ago

HovhannesManushyan commented 3 years ago

I was getting a strange resource-exhausted error after running the xlearn FM model's predict method for a while.

When I profiled the process with htop, I noticed that the number of threads increases by 8 on every call to model.predict("model/model.out", f"output/output.txt"), which leads to resource exhaustion once the thread count reaches a critical level.
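The growth seen in htop can also be checked from inside the Python process. The snippet below is a minimal sketch, assuming a Linux system where `/proc/self/status` is available; the helper name `native_thread_count` is hypothetical and not part of xlearn. Note that `threading.active_count()` would not help here, because it only counts threads started through Python's `threading` module, not native threads created by a C++ library.

```python
from pathlib import Path


def native_thread_count():
    """Return the number of OS-level threads in this process, or None.

    Hypothetical helper: reads the "Threads:" field of /proc/self/status,
    which only exists on Linux. On other platforms it returns None.
    """
    status = Path("/proc/self/status")
    if not status.exists():
        return None
    for line in status.read_text().splitlines():
        if line.startswith("Threads:"):
            # The line looks like "Threads:\t8"; the second token is the count.
            return int(line.split()[1])
    return None
```

Calling `native_thread_count()` before and after each `model.predict(...)` call and printing the difference should show whether the count keeps climbing.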

One workaround I found is to invoke model.predict in a separate process via the multiprocessing module. However, this is extremely slow when model.predict needs to be invoked many times.
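A minimal sketch of this workaround is shown below. It makes several assumptions: the helper names `predict_once` and `run_isolated` are hypothetical, the test-set path passed to `setTest` is a placeholder, and the `fork` start method is used, which is only available on POSIX systems. Any threads created during the call belong to the child process and disappear when it exits.

```python
import multiprocessing as mp


def predict_once(test_path, model_path, out_path):
    """Hypothetical worker: run a single xlearn prediction.

    xlearn is imported inside the function so that the library is
    loaded (and its threads are created) only in the child process.
    """
    import xlearn as xl

    model = xl.create_fm()
    model.setTest(test_path)
    model.predict(model_path, out_path)


def run_isolated(func, *args):
    """Run func(*args) in a child process and wait for it to finish.

    Assumption: the "fork" start method is available (Linux/macOS).
    Raises RuntimeError if the child exits with a non-zero code.
    """
    ctx = mp.get_context("fork")
    proc = ctx.Process(target=func, args=args)
    proc.start()
    proc.join()
    if proc.exitcode != 0:
        raise RuntimeError(f"child exited with code {proc.exitcode}")
```

A call would look like `run_isolated(predict_once, "test.txt", "model/model.out", "output/output.txt")`, where `"test.txt"` is a placeholder. Each call pays the cost of starting a process and loading the model, which is consistent with the slowness described above. Forking a parent that already has many leaked threads can also be fragile, so it is probably safer to never import xlearn in the parent at all.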

Is there a way to kill the created threads after the execution of the predict method has completed?

HovhannesManushyan commented 3 years ago

This problem can be solved by building the xlearn command-line binary and then executing it with Python's subprocess module.
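A rough sketch of that approach follows. The helper names `build_predict_cmd` and `run_cli` are hypothetical, and the argument order (test file, model file, then `-o` for the output path) is an assumption about the `xlearn_predict` binary that should be checked against the xlearn command-line documentation for the version you built.

```python
import subprocess


def build_predict_cmd(binary, test_path, model_path, out_path):
    """Build the argument list for one prediction run.

    Assumption: the CLI accepts `<binary> <test> <model> -o <output>`.
    Verify this against your xlearn build before relying on it.
    """
    return [binary, test_path, model_path, "-o", out_path]


def run_cli(cmd, timeout=None):
    """Run a command, raise CalledProcessError on failure, return stdout."""
    result = subprocess.run(
        cmd,
        check=True,           # a non-zero exit code raises an exception
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode output as str
        timeout=timeout,
    )
    return result.stdout
```

For example, `run_cli(build_predict_cmd("./xlearn_predict", "test.txt", "model/model.out", "output/output.txt"))`, where the binary location and `"test.txt"` are placeholders. Because every run is a separate OS process, any threads it creates are released when the binary exits.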

litchi6666 commented 3 years ago

We have the same problem!