Open Zach-Attach opened 3 months ago
It appears that this does not happen when running a single brain/env combo in a single run of the script.
I'm interested in taking a deeper look at this as well. I'll keep this thread posted with what I find.
It looks like the issue can be avoided by turning off checkpoints, running the script with a single job at a time, and waiting 120 seconds between the start of each run. Part of the issue appears to be related to saving checkpoints, so I have made saving checkpoints optional.
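For anyone who wants to try the same workaround, here is a minimal sketch of the staggered, one-job-at-a-time launch described above. It is not the project's actual script: `run_staggered`, the `launch` callback, and the `save_checkpoints` flag are hypothetical names used only for illustration, and the 120-second default simply mirrors the delay mentioned in this comment.

```python
import time
from typing import Callable, Iterable, List, Tuple

# A job is a (brain, env) pair. These names are placeholders, not the project's API.
Job = Tuple[str, str]


def run_staggered(
    jobs: Iterable[Job],
    launch: Callable[[str, str, bool], object],
    delay_s: float = 120.0,
    save_checkpoints: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Launch jobs one at a time, waiting delay_s seconds between starts.

    launch is a hypothetical callback that starts a single brain/env run and
    receives the save_checkpoints flag. sleep is injectable so the delay can
    be replaced in tests.
    """
    results = []
    for index, (brain, env) in enumerate(jobs):
        if index > 0:
            # Wait before every start except the first one.
            sleep(delay_s)
        results.append(launch(brain, env, save_checkpoints))
    return results
```

Calling `run_staggered(jobs, launch)` with the defaults keeps checkpoints off and spaces the starts 120 seconds apart. Passing `save_checkpoints=True` re-enables them for comparison.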
Describe the bug: Jobs do not always complete, or do not record anything in the first place.
To Reproduce Steps to reproduce the behavior: