Currently the standard TorchTrainer class calls self.strategy.clean_up() at the end of execute(). For certain use cases, such as profiling, this is problematic because the strategy's methods can no longer be accessed afterwards. Additionally, even after clean_up() has been called, the worker processes are still running, so calling clean_up() really just relinquishes control of the strategy while it keeps running.
A solution would probably involve either moving some of the strategy logic out of the TorchTrainer class, or changing the behaviour so that clean_up() actually kills the processes. Killing the processes could also be undesirable, though, since you might want to keep using them after the train() function has finished (as in the profiling case).
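One way to sketch the first option is to let the caller, not execute(), decide when cleanup happens. The class and method names below (TorchTrainer, Strategy, clean_up, execute) mirror the ones discussed above, but the bodies are purely illustrative assumptions, not the real implementation:

```python
class Strategy:
    """Illustrative stand-in for a distributed strategy."""

    def __init__(self):
        self.active = True

    def clean_up(self):
        # In a real strategy this might also tear down worker
        # processes (e.g. destroy the process group); here we
        # just mark the strategy as released.
        self.active = False


class TorchTrainer:
    """Sketch of a trainer that does NOT clean up in execute()."""

    def __init__(self, strategy):
        self.strategy = strategy

    def execute(self):
        # ... run training here ...
        # Note: no self.strategy.clean_up() call; cleanup is
        # deferred to the caller.
        return "done"


trainer = TorchTrainer(Strategy())
trainer.execute()

# The strategy is still accessible after execute(), e.g. for
# profiling, and the caller decides when to release it:
assert trainer.strategy.active
trainer.strategy.clean_up()
assert not trainer.strategy.active
```

This keeps execute() free of lifecycle decisions, at the cost of requiring every caller to remember to call clean_up() themselves (a context manager on the trainer could make that harder to forget).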