aiidateam / aiida-core

The official repository for the AiiDA code
https://aiida-core.readthedocs.io

Try to kill a `Process` when the local interpreter running it gets interrupted #2711

Closed sphuber closed 2 weeks ago

sphuber commented 5 years ago

Currently, if a user runs a `Process` in a local interpreter and then interrupts the interpreter, the process is lost, but its node still reflects the last active state. This can be confusing to new users, who will, for example, still see a "process" in the "Waiting" state. We should try to catch the interrupt signal and kill the process properly before shutting down. At the very least, we should try to set an Excepted state on the node.
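To make the proposal concrete, here is a minimal sketch of the idea, not the actual AiiDA implementation. `ToyProcess` and `run_with_cleanup` are hypothetical names invented for illustration, and the state strings are placeholders. It relies only on the fact that Python's default SIGINT handling raises `KeyboardInterrupt` in the main thread.

```python
class ToyProcess:
    """Hypothetical stand-in for a process whose state is stored on a node."""

    def __init__(self):
        self.state = "created"

    def run(self, work):
        self.state = "running"
        work()  # Ctrl-C during this call surfaces as KeyboardInterrupt
        self.state = "finished"

    def kill(self):
        self.state = "killed"


def run_with_cleanup(process, work):
    """Run `work` and leave the process in a terminal state if interrupted."""
    try:
        process.run(work)
    except KeyboardInterrupt:
        try:
            # Preferred outcome: kill the process properly.
            process.kill()
        except Exception:
            # Fallback: at the very least mark the process as excepted.
            process.state = "excepted"
    return process.state
```

With something like this, an interrupted run would end up in a terminal state rather than lingering as "Waiting" or "Running" in the process list. A real implementation would probably re-raise the interrupt after the cleanup so the interpreter still shuts down.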

marco-foscato commented 5 years ago

Reporting my experience at the suggestion of @giovannipizzi. While working through tutorial 6.5, I ran the Python script (i.e., `verdi run scriptname`) that includes the `run(EquationOfState,...)` call and then "killed" it (ctrl-C followed by ctrl-Z). I did this twice, on two independent runs. In one case, the process list showed the following:

  PK  Created    Process label    Process State    Process status
----  ---------  ---------------  ---------------  ------------------------------------
2564  16m ago    EquationOfState  ⏵ Waiting
2572  16m ago    PwCalculation    ⏵ Waiting        Waiting for transport task: update
2578  16m ago    PwCalculation    ⏵ Waiting        Waiting for transport task: update
2584  16m ago    PwCalculation    ⏵ Waiting        Waiting for transport task: update
2590  16m ago    PwCalculation    ⏵ Waiting        Waiting for transport task: retrieve
2596  16m ago    PwCalculation    ⏵ Waiting        Waiting for transport task: update

and in the other case, when I let the master script run for a shorter time than before:

2604  4m ago     EquationOfState  ⏵ Running
2612  4m ago     PwCalculation    ⏹ Created
2618  4m ago     PwCalculation    ⏹ Created

unkcpz commented 2 weeks ago

In my test, the subprocesses are killed and moved to the Killed state, and the parent process then ends up in the Excepted state. I think it is just a matter of waiting for the event loop to gracefully run the kill on all subprocesses. I am closing this issue; feel free to reopen it if the problem can be reproduced again.
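For readers who want to picture that behaviour, here is a rough `asyncio` sketch, not AiiDA's or plumpy's actual code. The `child`, `parent`, and `demo` functions and the state strings are hypothetical. The parent cancels every child task and waits for the event loop to deliver the cancellation before it records its own final state.

```python
import asyncio


async def child(name, states):
    """Hypothetical subprocess that waits until it is cancelled."""
    states[name] = "waiting"
    try:
        await asyncio.sleep(3600)  # stands in for waiting on a transport task
        states[name] = "finished"
    except asyncio.CancelledError:
        states[name] = "killed"
        raise


async def parent(states):
    """Hypothetical parent that kills its children when interrupted."""
    states["parent"] = "running"
    children = [asyncio.create_task(child(f"child-{i}", states)) for i in range(3)]
    await asyncio.sleep(0)  # give the children a chance to start

    # Simulated interrupt: request cancellation of every child...
    for task in children:
        task.cancel()
    # ...and wait until the event loop has actually delivered it to each one.
    await asyncio.gather(*children, return_exceptions=True)

    states["parent"] = "excepted"
    return states


def demo():
    return asyncio.run(parent({}))
```

The point of the `gather` call is the waiting step: the child states only become "killed" once the loop has had time to run the cancellation, which matches the observation that it is a matter of giving the event loop time.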