Closed robert-verkuil closed 3 years ago
Hey @robert-verkuil, actually one way to do it is
dora run -d -f{SIGNATURE}
with the signature taken from the first item on the grid. This will be equivalent to what you have been doing manually.
I usually keep an unused parameter in my config, which I call dummy, specifically to avoid collisions with the main XPs, so I would do something like
dora run -d -f{SIGNATURE} dummy=debug
and then you can just monitor this XP to see if things go as planned. You don't need the dummy parameter if you are just going to quickly run the XP and kill it as soon as the main one actually gets scheduled (this prevents the two from overwriting each other's checkpoints, etc.).
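The debug invocation above can be wrapped in a small shell function so it is easy to repeat. This is only a sketch: debug_cmd is a hypothetical helper, not part of Dora, the signature shown is a placeholder you would copy by hand from the first item of the grid, and dummy=debug assumes your config defines an otherwise unused dummy key. It prints the command rather than running it, so you can check it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of Dora): print the local debug command
# for a given XP signature, using the flags quoted in this thread.
debug_cmd() {
    sig="$1"    # signature copied by hand from the first item of the grid
    # dummy=debug assumes the config has an otherwise unused `dummy` key,
    # which keeps the debug XP from colliding with the main one.
    printf 'dora run -d -f%s dummy=debug\n' "$sig"
}

# Placeholder signature, for illustration only.
debug_cmd 0123abcd
```

Once the printed line looks right, copy it into the terminal to launch the debug XP.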
You can use the same trick if an XP has failed in a grid. Just take its signature and run
dora run -f{SIG} -d
and it will resume the experiment from the last checkpoint and let you debug it locally. Once you are happy with the fix, you can restart the failed XP in the grid with
dora grid grid_name -r
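The fix-and-retry loop for a failed XP comes down to two commands run in order. A minimal sketch, assuming a placeholder signature and grid name: resume_then_retry is a hypothetical helper, not a Dora command, and it only prints the two steps quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Dora): print the two steps for
# debugging a failed XP and then retrying it in its grid.
resume_then_retry() {
    sig="$1"     # signature of the failed XP, copied from the grid
    grid="$2"    # name of the grid the XP belongs to
    # Step 1: resume the failed XP locally from its last checkpoint.
    printf 'dora run -f%s -d\n' "$sig"
    # Step 2: once the fix works, restart the failed XP in the grid.
    printf 'dora grid %s -r\n' "$grid"
}

# Placeholder values, for illustration only.
resume_then_retry 0123abcd my_grid
```

Run the first printed command, fix the issue, and only then run the second.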
Closing the task, feel free to reopen if you think the solution I offered is not sufficient for your use case :)
Yes, thank you! Sorry for not responding. You were really helpful above. I've been using -f{SIG} and that's been good! The dummy parameter is a good call as well.
I've been using Dora recently and it's been great. One thing that would help my usage is an easy way to run, e.g., the first XP of a grid locally for debugging purposes. This would be helpful for large, complex sweeps, to quickly squash issues without waiting for XPs to be scheduled.
(So far, as a workaround, I've been printing launcher._argv and then doing dora run ${launcher._argv}.)