pwollstadt / IDTxl

The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory.
http://pwollstadt.github.io/IDTxl/
GNU General Public License v3.0

checkpointing: make code aware of changed GPU device on resume #71

Open mwibral opened 3 years ago

mwibral commented 3 years ago

I just noticed that resuming from a checkpoint with the current code assumes the process is running on the same GPU (number) as before. E.g., if the process started on GPU #0, after resume it will assume it is still running on GPU #0. However, if you work on a cluster and your resume command goes through the queuing system, the process may well run on GPU #1, #2, #3, etc. after resume.

Fix: Provide the GPU number as an additional argument to the resume function.
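
A minimal sketch of what the proposed fix could look like. This is not IDTxl's actual checkpoint code: the function name `load_settings_for_resume`, the JSON checkpoint layout, and the `gpuid` settings key are assumptions made for illustration. The idea is only that the resume step accepts an optional GPU number and lets it override whatever was stored in the checkpoint.

```python
import json
import os
import tempfile


def load_settings_for_resume(checkpoint_path, gpu_id=None):
    """Load saved settings and optionally override the stored GPU number.

    Hypothetical helper: assumes the checkpoint is a JSON file holding a
    settings dict, and that the GPU number lives under the key 'gpuid'.
    """
    with open(checkpoint_path) as f:
        settings = json.load(f)
    if gpu_id is not None:
        # The scheduler may have assigned a different device on resume,
        # so the caller's value takes precedence over the stored one.
        settings["gpuid"] = int(gpu_id)
    return settings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "checkpoint.json")
        # Simulate a checkpoint written while running on GPU #0.
        with open(path, "w") as f:
            json.dump({"gpuid": 0, "max_lag": 5}, f)
        # Resume on GPU #1 as assigned by the queuing system.
        print(load_settings_for_resume(path, gpu_id=1))
```

Leaving `gpu_id` as `None` keeps the stored value, so existing resume calls would behave as before.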