Closed. mdraw closed this issue 6 years ago
Reference for getting relevant environment info: https://github.com/pytorch/pytorch/blob/d93d41b2ef0cc3a986aeb3db56935798beaa3f63/torch/utils/bottleneck/__main__.py#L88. The info it collects should be extended and added to our backup archives.
We should use this instead of the code referenced in my previous comment: https://github.com/pytorch/pytorch/pull/6635.
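A rough sketch of the kind of environment summary discussed in these comments, using only the standard library plus an optional PyTorch lookup for the GPU name. The function name `collect_system_summary` and the dict layout are hypothetical and are not taken from elektronn3 or from the PyTorch code linked above.

```python
import importlib.metadata
import platform
import socket
import sys


def collect_system_summary():
    """Return a dict describing the host, interpreter and installed packages.

    Hypothetical helper for illustration; not an existing elektronn3 function.
    """
    packages = {}
    for dist in importlib.metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that have no name in their metadata
            packages[name] = dist.version

    summary = {
        "hostname": socket.gethostname(),
        "platform": platform.platform(),
        "python": sys.version.split()[0],
        "packages": dict(sorted(packages.items())),
        "gpu": None,  # filled in below if PyTorch and a CUDA device are available
    }

    try:
        import torch  # optional: only needed for the GPU name
        if torch.cuda.is_available():
            summary["gpu"] = torch.cuda.get_device_name(0)
    except ImportError:
        pass

    return summary
```

The resulting dict could be dumped to a text or JSON file next to the archived code. Which fields are actually worth recording is still open.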
The code of the training script, the network model, and elektronn3 itself that was used to run a training experiment should be archived to the `save_path`. For reference, here is the equivalent feature implemented in ELEKTRONN2: https://github.com/ELEKTRONN/ELEKTRONN2/blob/27ed6c9a07cdd65c5789697013d568060a392514/elektronn2/training/trainutils.py#L651. Note that we can't implement this in the same way, because the training script has to be included here. The backup routine has to be called from within the training script so that it can include the script itself in addition to the code it uses.

An open question is what exactly the backup archive should contain. It's clear that the training script, the model code and the current elektronn3 source code (in case of user modification) should be in it, but what if the training script imports code from outside of elektronn3? It's probably also a good idea to include a summary of the system configuration (installed package versions, host name, name of the GPU that was used, etc.).
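A minimal sketch of what such a backup routine could look like, assuming it is called from inside the training script and that a zip file in `save_path` is an acceptable archive format. The name `archive_sources` and its parameters are hypothetical. It does not settle what to do about code imported from outside elektronn3; extra directories would have to be passed in explicitly.

```python
import os
import zipfile


def archive_sources(save_path, script_path, package_dirs=(), archive_name="backup.zip"):
    """Zip the training script and the .py files of the given package dirs.

    Hypothetical helper for illustration; not an existing elektronn3 function.
    Returns the path of the archive that was written.
    """
    os.makedirs(save_path, exist_ok=True)
    archive_path = os.path.join(save_path, archive_name)

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # The training script itself goes to the top level of the archive.
        zf.write(script_path, arcname=os.path.basename(script_path))

        for pkg_dir in package_dirs:
            pkg_dir = os.path.abspath(pkg_dir)
            parent = os.path.dirname(pkg_dir)
            for root, dirs, files in os.walk(pkg_dir):
                # Prune bytecode caches in place so os.walk does not descend into them.
                dirs[:] = [d for d in dirs if d != "__pycache__"]
                for fname in files:
                    if fname.endswith(".py"):
                        full = os.path.join(root, fname)
                        # Store paths relative to the package's parent directory,
                        # so the package name is kept as the top-level folder.
                        zf.write(full, arcname=os.path.relpath(full, parent))

    return archive_path


# Possible call from within a training script (not executed here):
#   import elektronn3
#   archive_sources(save_path, os.path.abspath(__file__),
#                   [os.path.dirname(elektronn3.__file__)])
```

Passing `__file__` from the training script is what lets the script include itself, which is the reason the routine has to be called from there.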