TimefoldAI / timefold-solver-python

Timefold Solver is an AI constraint solver for Python to optimize the Vehicle Routing Problem, Employee Rostering, Maintenance Scheduling, Task Assignment, School Timetabling, Cloud Optimization, Conference Scheduling, Job Shop Scheduling, Bin Packing and many more planning problems.
https://timefold.ai
Apache License 2.0
36 stars 3 forks source link

fix: Use daemon threads for SolverManager #69

Closed Christopher-Chianelli closed 3 months ago

Christopher-Chianelli commented 3 months ago

Before, if a SolverManager creates a Solver, it creates a non-daemon thread, which can prevent the Python process from EVER exiting unless forced.

Now, the SolverManager spawns only daemon threads, which allows the Python process to exit. This also allows us to remove some test configuration code that was used to force the JVM to exit.

Additionally, SolverManager can take SolverConfig directly and may take a SolutionManagerConfig too.

Tests were flaky before, since the problem fact change may affect the input problem if the change was applied before solving was started.

zepfred commented 3 months ago

More importantly - what does the Java solver do? Does it also create deamon threads? If not, then for consistency, neither should the Python solver.

I understand the Python SolverManager::create now configures the thread factory class with DaemonThreadFactory. The new thread factory only creates daemon threads, and this would apply to the Java side.

        java_solver_manager_config = solver_manager_config._to_java_solver_manager_config()  # noqa
        java_solver_manager_config.setThreadFactoryClass(DaemonThreadFactory.class_)
triceo commented 3 months ago

Yes, but I'm asking something else. Forget Python Solver. What's the default behavior of the Java solver, without any fancy configuration? My argument is that the Python solver's default behavior should be exactly the same.

(In other words: if the Java solver by default doesn't do daemon threads, neither should the Python solver.)

Christopher-Chianelli commented 3 months ago

I'm not entirely sure about the daemon threads.

Why should the process terminate if the solver jobs haven't? The jobs are important payload. To me, this doesn't sound like the least surprising behavior.

More importantly - what does the Java solver do? Does it also create deamon threads? If not, then for consistency, neither should the Python solver.

If you control-C the Java solver and don't configure a signal handler, Java terminates without daemon threads. If you control-C the Python solver, even AFTER solving has finished, and deamon threads are not used, it hangs forever, since Python is waiting for Java to shutdown, but Python holds a reference to SolverManager (and thus its threadpool, which keep threads alive until it is garbage collected, never shutdown).

triceo commented 3 months ago

If you control-C the Python solver, even AFTER solving has finished, and deamon threads are not used, it hangs forever, since Python is waiting for Java to shutdown, but Python holds a reference to SolverManager (and thus its threadpool, which keep threads alive until it is garbage collected, never shutdown).

Which is a problem that should be solved by properly shutting down the executor when no longer needed, no?

Christopher-Chianelli commented 3 months ago

If you control-C the Python solver, even AFTER solving has finished, and deamon threads are not used, it hangs forever, since Python is waiting for Java to shutdown, but Python holds a reference to SolverManager (and thus its threadpool, which keep threads alive until it is garbage collected, never shutdown).

Which is a problem that should be solved by properly shutting down the executor when no longer needed, no?

Nope; that was done in tests, and tests also hang and needed specially handling code

triceo commented 3 months ago

In my opinion, the key here is to figure out why does Java terminate in one case, and not in the other. Solving that through daemon threads is IMOM a hack, and not a solution - there will be a root cause there somewhere.

Christopher-Chianelli commented 3 months ago

In my opinion, the key here is to figure out why does Java terminate in one case, and not in the other. Solving that through daemon threads is IMOM a hack, and not a solution - there will be a root cause there somewhere.

Because Python hold a reference and a ThreadPoolExecutor keeps it threads alive once spawn, creating a catch-22:

triceo commented 3 months ago

And Python cannot shut down the executor and drop the reference? In Java, you'd set the field to null and problem solved.

Christopher-Chianelli commented 3 months ago

And Python cannot shut down the executor and drop the reference? In Java, you'd set the field to null and problem solved.

Nope; consider user experience; "Why isn't my program terminating? Why is my test hanging?" That is terrible UX. This is the code users WANT to write

solver_manger = SolverManager.create(solver_config)

@app.get("/root/solve")
async def solve(timetable: Timetable) :
    solver_manager.solve('ID', timetable, ...)

And users do not want that code to hang on Control-C

triceo commented 3 months ago

I fully agree with that. It just seems logical that a framework that does this (see @app.get) also has some kind of a destructor method that we can hook up into to properly free resources.

Daemon threads are a solution to the problem which we may need to accept, but the real solution would be to somehow free the resources we've taken and allow a clean shutdown on its own.

Christopher-Chianelli commented 3 months ago

I fully agree with that. It just seems logical that a framework that does this (see @app.get) also has some kind of a destructor method that we can hook up into to properly free resources.

Daemon threads are a solution to the problem which we may need to accept, but the real solution would be to somehow free the resources we've taken and allow a clean shutdown on its own.

That the responsibility of JPype, which adds its own shutdown hook to Python, which hangs forever w/o daemon threads.

sonarcloud[bot] commented 3 months ago

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud