Epistimio / orion

Asynchronous Distributed Hyperparameter Optimization.
https://orion.readthedocs.io

Improve error message when user script fails #709

Open bouthilx opened 2 years ago

bouthilx commented 2 years ago

The current error message is misleading: orion.core.worker.consumer.ExecutionError: Something went wrong. Check logs. Process returned with code -9 !

First, nothing in the message says that the user's script crashed; this should be stated clearly. Second, "Check logs" sounds as if Oríon were writing logs somewhere, which is not the case. The error message should state only that the user's script failed; the user will then know to look at the script's output for an error message.

On the same topic, is the error message from the script printed in the terminal? For example, if the user's script is written in Python, would the stack trace be printed? It should be.
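As a rough illustration of the kind of message meant here, a minimal sketch follows. The helper name `describe_failure` and the wording are hypothetical and not taken from Oríon's code; the only fact relied on is that, on POSIX, `subprocess` reports a return code of -N when the child was terminated by signal N, so -9 would correspond to SIGKILL.

```python
import signal


def describe_failure(returncode: int) -> str:
    """Hypothetical helper: build a message that names the user's script as the culprit."""
    if returncode < 0:
        # On POSIX, subprocess reports -N when the child was killed by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Unknown signal number on this platform: fall back to the raw number.
            name = f"signal {-returncode}"
        detail = f"was terminated by {name}"
    else:
        detail = f"exited with code {returncode}"
    return (
        f"The user script {detail}. "
        "Look at the script's own output for its error message."
    )


print(describe_failure(-9))
print(describe_failure(1))
```

A message along these lines avoids any mention of logs and points the user straight at the script's own output.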

bouthilx commented 2 years ago

The error message was changed in PR #684. We need to verify that the new message is useful and adapt it if necessary. Beyond that, the most important thing is to verify that the call to subprocess.Popen does not capture all of stdout/stderr, so that the script's stack trace stays visible in the user's terminal.
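To make the check concrete, here is a small sketch of the behaviour we want, assuming the script is launched with `subprocess.Popen`. It is not Oríon's consumer code: `run_script` and `ScriptFailed` are hypothetical names used only for illustration. Leaving `stdout` and `stderr` at their default of `None` lets the child inherit the parent's streams, so a Python traceback from the script shows up directly in the terminal.

```python
import subprocess
import sys


class ScriptFailed(RuntimeError):
    """Hypothetical exception for this sketch; not Oríon's ExecutionError."""


def run_script(cmd):
    """Run the user's command without capturing its output."""
    # stdout/stderr are left as None (the default), so the child writes
    # straight to the parent's terminal and its stack trace stays visible.
    process = subprocess.Popen(cmd)
    returncode = process.wait()
    if returncode != 0:
        raise ScriptFailed(
            f"The user script failed with return code {returncode}. "
            "See the script's output above for details."
        )
    return returncode


if __name__ == "__main__":
    try:
        # The child's traceback is printed by the child itself, not by us.
        run_script([sys.executable, "-c", "raise ValueError('boom')"])
    except ScriptFailed as exc:
        print(exc)
```

If the real call passes `stdout=subprocess.PIPE` or `stderr=subprocess.PIPE` without forwarding what it reads, the traceback would be swallowed, which is the situation to rule out.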