Open ZhiyuanChen opened 3 years ago
@ZhiyuanChen could you try the latest NNI version, i.e., v2.4?
@QuanluZhang Ta for the information. I just found that some machines were running 2.4 while others were running 2.3. I have upgraded all of them and made sure they are all running 2.4. I'll let you know whether it works.
This does not seem to be specific to remote mode. I just tried it in local mode, and it keeps running after receiving the final result.
Yes, it keeps running until the number of submitted trials exceeds maxTrialNumber or the experiment duration exceeds maxExperimentDuration.
What I mean is that the trial keeps running after the final result has been received (which suggests it has stopped and a new trial should start).
Could you check whether the trial process is still there, or whether the trial process has finished but the web UI still shows it as running? If it is the former, the problem is most likely in your trial code: your trial is blocked. If it is the latter, then it is a bug in NNI.
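One way to tell the two cases apart is to probe the trial's PID directly. The following is only a minimal sketch, assuming a POSIX system and that you already know the trial's PID (for example from `ps`). The helper name `trial_process_alive` is hypothetical and not part of NNI.

```python
import os


def trial_process_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only).

    Hypothetical helper, not an NNI API.
    """
    try:
        # Signal 0 sends nothing; it only checks that the PID exists
        # and that we are allowed to signal it.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the trial has exited
    except PermissionError:
        return True  # the process exists but belongs to another user
    # Note: a zombie (exited but not yet reaped) also counts as existing.
    return True
```

If this returns True long after the final result was reported, the trial code itself is probably blocked. If it returns False while the web UI still shows the trial as running, that points to the NNI side.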
@ZhiyuanChen - have you had a chance to try this out? Is the problem still occurring on your side?
After the trial reports its final result and exits, NNI does nothing. On the portal, the final result is received but the experiment is still marked as running. When stopped manually, it cleans up and then waits indefinitely.