wistuba closed this issue 1 year ago
Thanks for the report, I'm rebuilding the docs manually now to see if that fixes the search
Looks like a regression was introduced into the docs that broke search, looking into this. As an immediate mitigation, you can search using this older docs build: https://syne-tune--587.org.readthedocs.build/en/587/
See for example this search result for SchedulerDecision: https://syne-tune--587.org.readthedocs.build/en/587/_apidoc/syne_tune.optimizer.scheduler.html#syne_tune.optimizer.scheduler.SchedulerDecision
Hello, `SchedulerDecision` is what is returned by `scheduler.on_trial_result`, which is triggered by a reported result being received. The value `STOP` means the running trial should be stopped.
Now, it could be that the trial has already finished: if it was in its final epoch, it may finish just after returning the last report. In this case, `status == Status.completed`, and we don't have to stop it.
So, the line you ask about is reached when the scheduler, in reaction to a reported result, asks to stop the trial, but the trial has not yet finished on its own. We then ask the backend to stop it.
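To make the condition concrete, here is a minimal sketch of the check described above. It is not the actual code in `tuner.py`: the two enums are simplified stand-ins that only reuse the names mentioned in this thread, and `should_stop_trial` is a hypothetical helper made up for illustration.

```python
from enum import Enum


class SchedulerDecision(Enum):
    # Simplified stand-in (assumption), not the real Syne Tune class
    CONTINUE = "CONTINUE"
    STOP = "STOP"


class Status(Enum):
    # Simplified stand-in (assumption); only the values named in this thread
    active = "active"
    stopping = "stopping"
    stopped = "stopped"
    completed = "completed"


def should_stop_trial(decision: SchedulerDecision, status: Status) -> bool:
    """Hypothetical helper: True if the backend must be asked to stop the trial.

    The scheduler asked for STOP, but the trial has not finished on its own.
    """
    return decision is SchedulerDecision.STOP and status is not Status.completed


# Scheduler says STOP while the trial is still running: ask the backend to stop it
print(should_stop_trial(SchedulerDecision.STOP, Status.active))
# Scheduler says STOP, but the trial ended right after its last report: nothing to do
print(should_stop_trial(SchedulerDecision.STOP, Status.completed))
```

The second call is the case from the previous paragraph: the final report triggers `STOP`, but the job is already `completed`, so no stop request is sent.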
`SchedulerDecision` is really simple, but `Status` is a bit more tricky, because there is `stopped` and `stopping`. David did this. I tend to ignore `stopping`; it may not really ever be used. For a SageMaker job, there is `stopping`, which is the state between `active` and `stopped`.
@wesk Why would search in the docs be broken?
Are you saying that a Python process will not be killed as soon as it reports for the final epoch? Or are you saying it won't be killed if it is already dead? (local backend)
The second, I think. We only observe the `Status.completed` value when the job really ends. And in that case, we do not have to stop it again. But David wrote this, so I am guessing a bit. But I am pretty sure.
@wesk This issue is now primarily about the search not working in our docs. It works with old sphinx dependencies. I am checking what happens locally when the dependencies are re-installed
OK. I can confirm that:
- When the docs for `main` are built with the "old" virtual env, the search works
- When the docs for `main` are built after creating a new venv, search fails (just see "Searching ..." forever)

I don't find any recent reports of search failing in sphinx. Dropping the ball here.
OK, broken search is fixed by #602 (thanks, Martin!), and docs improved in #603
Search on RTD is broken, so I have to ask here whether documentation exists.
I'm interested to understand when this line is reached: https://github.com/awslabs/syne-tune/blob/97cefe99686397b10a1e4a4e7e3ca6f66071cb95/syne_tune/tuner.py#L585
Is there documentation for `Status` and `SchedulerDecision`?