ELEKTRONN / elektronn3

A PyTorch-based library for working with 3D and 2D convolutional neural networks, with focus on semantic segmentation of volumetric biomedical image data
MIT License

Back up training code to save_path for more reproducible experiments #11

Closed by mdraw 6 years ago

mdraw commented 6 years ago

The code of the training script, the network model and elektronn3 itself, as used for a training experiment, should be archived to the save_path. For reference, here is the equivalent feature implemented in ELEKTRONN2: https://github.com/ELEKTRONN/ELEKTRONN2/blob/27ed6c9a07cdd65c5789697013d568060a392514/elektronn2/training/trainutils.py#L651. Note that we can't implement this in the same way, because here the training script itself has to be included. The backup routine therefore has to be called from within the training script, so that it can archive the script itself in addition to the code it uses.
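As a rough illustration of the idea, here is a minimal sketch of such a routine. The function name `backup_code`, its parameters and the `backup` subdirectory are hypothetical and not part of elektronn3. It assumes that copying the script file and zipping the `.py` files of the given package directories is enough.

```python
import shutil
import sys
import zipfile
from pathlib import Path


def backup_code(save_path, script_path=None, package_dirs=()):
    """Hypothetical helper: copy the training script and zip package sources.

    save_path:    directory of the training run
    script_path:  path of the training script (defaults to sys.argv[0])
    package_dirs: package directories whose *.py files should be archived
    """
    backup_dir = Path(save_path) / "backup"  # assumed layout, not a convention
    backup_dir.mkdir(parents=True, exist_ok=True)

    # The script passes its own path, so it ends up in the backup as well.
    script_path = Path(script_path or sys.argv[0]).resolve()
    if script_path.is_file():
        shutil.copy2(script_path, backup_dir / script_path.name)

    # One zip archive per package, with paths relative to the package parent.
    for pkg_dir in package_dirs:
        pkg_dir = Path(pkg_dir).resolve()
        archive = backup_dir / f"{pkg_dir.name}.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for py_file in sorted(pkg_dir.rglob("*.py")):
                arcname = py_file.relative_to(pkg_dir.parent).as_posix()
                zf.write(py_file, arcname)
    return backup_dir
```

A training script could then call something like `backup_code(save_path, script_path=__file__, package_dirs=[Path(elektronn3.__file__).parent])` right after creating `save_path`.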

An open question is what exactly the backup archive should contain. It's clear that the training script, the model code and the current elektronn3 source code (in case the user has modified it) should be in it, but what if the training script imports code from outside of elektronn3? It's probably also a good idea to include a summary of the system configuration (installed package versions, host name, name of the GPU that was used, etc.).
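For the system configuration part, a summary could be collected with the standard library alone. The following is only a sketch: the function names and the `system_summary.json` file name are made up for illustration, and the GPU name is left out because querying it needs PyTorch or a vendor tool.

```python
import json
import platform
import socket
import sys
from importlib import metadata
from pathlib import Path


def collect_system_summary():
    """Hypothetical helper: gather host and package info as a plain dict."""
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    )
    return {
        "hostname": socket.gethostname(),
        "platform": platform.platform(),
        "python": sys.version,
        "packages": packages,
    }


def write_system_summary(save_path):
    """Write the summary as JSON into save_path and return the file path."""
    save_path = Path(save_path)
    save_path.mkdir(parents=True, exist_ok=True)
    out = save_path / "system_summary.json"  # assumed file name
    out.write_text(json.dumps(collect_system_summary(), indent=2))
    return out
```

JSON is used here only because it is easy to diff between two runs; a plain text dump would work just as well.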

mdraw commented 6 years ago

Reference for getting relevant environment info: https://github.com/pytorch/pytorch/blob/d93d41b2ef0cc3a986aeb3db56935798beaa3f63/torch/utils/bottleneck/__main__.py#L88. This should be extended and added to our backup archives.

mdraw commented 6 years ago

We should use this instead of the code referenced in my previous comment: https://github.com/pytorch/pytorch/pull/6635.
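Current PyTorch releases expose an environment collection script as `torch.utils.collect_env`. Assuming that this is what the linked pull request turned into (I have not verified this against the PR itself), writing its report to the save_path could look roughly like this. The helper name `write_env_info` and the file name are hypothetical.

```python
from pathlib import Path


def write_env_info(save_path, filename="env_info.txt"):
    """Hypothetical helper: dump PyTorch's environment report into save_path."""
    save_path = Path(save_path)
    save_path.mkdir(parents=True, exist_ok=True)
    try:
        # Imported lazily so the sketch also runs without PyTorch installed.
        from torch.utils.collect_env import get_pretty_env_info
        info = get_pretty_env_info()
    except ImportError:
        info = "torch.utils.collect_env is not available in this environment."
    out = save_path / filename
    out.write_text(info + "\n", encoding="utf-8")
    return out
```

The same report can be printed from the command line with `python -m torch.utils.collect_env`.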