Open jennkimm opened 3 years ago
Use the --resume_iter argument, which is checked in solver.py:
if args.resume_iter > 0:
    self._load_checkpoint(args.resume_iter)
You also need to set the initial value of --lambda_ds, because of lines 95 and 143 of core/solver.py:
initial_lambda_ds = args.lambda_ds
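
To illustrate why the initial value matters when resuming, here is a rough sketch. The thread does not quote line 143, so this assumes (unconfirmed here) that the solver decays lambda_ds linearly by a fixed amount each iteration. The helper name `lambda_ds_at` and the `ds_iter` parameter are hypothetical and only serve the example; check them against your copy of core/solver.py before relying on the numbers.

```python
def lambda_ds_at(resume_iter, initial_lambda_ds, ds_iter):
    """Estimate the lambda_ds weight left after `resume_iter` iterations.

    Assumption (not confirmed in this thread): the training loop subtracts
    initial_lambda_ds / ds_iter from lambda_ds once per iteration and never
    lets the weight drop below zero.
    """
    per_step = initial_lambda_ds / ds_iter           # assumed per-iteration decay
    remaining = initial_lambda_ds - per_step * resume_iter
    return max(remaining, 0.0)                       # clamp at zero


if __name__ == "__main__":
    # Hypothetical numbers: resuming from the 20000-iteration checkpoint.
    value = lambda_ds_at(resume_iter=20000, initial_lambda_ds=1.0, ds_iter=100000)
    print(f"lambda_ds to pass when resuming: {value:.4f}")
```

If the decay really works this way, passing the original --lambda_ds together with --resume_iter would restart the weight from its undecayed value, which is why it may be worth setting it explicitly for the resumed run.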
I've been training this model, but it stops at the 30,000th iteration because the device runs out of memory.
Since we have a limited budget, we'd like to know if we can resume training from our 20,000-iteration checkpoint.