Using a custom name for model checkpoints is currently not well supported in EGG.
It can be achieved by adding a `CheckpointSaver` callback when creating the `Trainer` instance and passing `checkpoint_path=opts.checkpoint_dir` and `prefix=f'{custom_name}'` to that `CheckpointSaver`.
However, if `--preemptable` is set, another `CheckpointSaver` with the same `checkpoint_path` parameter will save checkpoints in the same `--checkpoint_dir` folder (see here), resulting in duplicate checkpoints with different names in the same folder.
I was thinking that `Trainer` could stop creating a `CheckpointSaver` by default whenever `--checkpoint_dir` is set, and create one only when `--checkpoint_dir` is set together with `--preemptable`; users would then define their own `CheckpointSaver` in their game. This could still result in duplicate checkpoints in the same folder if `--preemptable` is set AND users define their own `CheckpointSaver`.
Another possible solution is to add a check when initializing `Trainer`: if an instance of `CheckpointSaver` is already present in the list of callbacks, then regardless of the `--preemptable` flag another `CheckpointSaver` will NOT be created.
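A rough sketch of that check, to make the idea concrete. It uses stand-in classes rather than EGG's real ones so it runs on its own. `resolve_callbacks` is a hypothetical helper name, and the rule that a default saver is added whenever `checkpoint_dir` is set is my assumption about the surrounding `Trainer` logic, not a copy of the actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


class Callback:
    """Stand-in for EGG's callback base class (illustration only)."""


@dataclass
class CheckpointSaver(Callback):
    """Stand-in that only records the two arguments discussed above."""
    checkpoint_path: str
    prefix: str = ""


def resolve_callbacks(
    callbacks: Optional[List[Callback]],
    checkpoint_dir: Optional[str],
    preemptable: bool,
) -> List[Callback]:
    """Hypothetical helper: decide whether a default saver should be added.

    `preemptable` is accepted but deliberately ignored: a user-supplied
    CheckpointSaver always suppresses the default one.
    """
    result = list(callbacks or [])
    has_user_saver = any(isinstance(cb, CheckpointSaver) for cb in result)
    # Assumption: a default saver is wanted whenever a checkpoint dir is given.
    if checkpoint_dir is not None and not has_user_saver:
        result.append(CheckpointSaver(checkpoint_path=checkpoint_dir))
    return result


# A user-defined saver with a custom prefix is kept as the only saver,
# even when the preemptable flag is on.
user_saver = CheckpointSaver(checkpoint_path="ckpt", prefix="my_run")
with_user_saver = resolve_callbacks([user_saver], "ckpt", preemptable=True)

# Without a user-defined saver, the default one is still added.
without_user_saver = resolve_callbacks(None, "ckpt", preemptable=False)
```

With this in place, the folder would contain only checkpoints written by the user's saver whenever one is passed in, and the default behaviour would stay unchanged otherwise.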
I find the latter simpler and cleaner and am happy to do it if you agree @eugene-kharitonov