Azure / MachineLearningNotebooks

Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
https://docs.microsoft.com/azure/machine-learning/service/
MIT License

AzureMLCompute job failed. JobFailed: Submitted script failed with a non-zero exit code; see the driver log file for details. Reason: Job failed with non-zero exit Code #1360

Closed Carterbouley closed 3 years ago

Carterbouley commented 3 years ago

Simply running the notebook cartpole_sc, following the code exactly and changing only the compute name, throws the following error:

AzureMLCompute job failed. JobFailed: Submitted script failed with a non-zero exit code; see the driver log file for details. Reason: Job failed with non-zero exit Code.

The logs show:

[2021-02-23T15:32:05.5282148Z][Info]Starting reinforcement learning run with id CartPole-v0-SC_1614094323_113e09a5.
[2021-02-23T15:32:11.4530989Z][Info]Starting head node child run with id CartPole-v0-SC_1614094323_113e09a5_head.
[2021-02-23T15:33:21.2314948Z][Info]Some child runs have reached terminal state. All active child runs will be cancelled. The run Ids that reached terminal state are: CartPole-v0-SC_1614094323_113e09a5_head.

How can I get this example working? I don't understand why it fails with basically zero changes.

jeffrey-d-lipkowitz commented 3 years ago

I'm running into a similar issue.

gojira commented 3 years ago

Hi @Carterbouley, are you able to find the 70_driver_log.txt file in the Child run?
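
For anyone unsure where to look, here is a minimal sketch of pulling that log from the child runs with the Python SDK. It assumes `parent_run` is the submitted run object and that the log is stored under a name ending in 70_driver_log.txt (typically azureml-logs/70_driver_log.txt). The names `download_driver_logs`, `out_dir`, and `log_suffix` are hypothetical and used only for illustration.

```python
import os


def download_driver_logs(parent_run, out_dir="driver_logs",
                         log_suffix="70_driver_log.txt"):
    """Download the driver log of every child run and return the local paths.

    `parent_run` is assumed to behave like an azureml.core.Run: it must expose
    get_children(), and each child must expose id, get_file_names(), and
    download_file(name, output_file_path).
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for child in parent_run.get_children():
        for name in child.get_file_names():
            # Assumption: the driver log file name ends with log_suffix.
            if name.endswith(log_suffix):
                target = os.path.join(out_dir, f"{child.id}_{log_suffix}")
                child.download_file(name=name, output_file_path=target)
                saved.append(target)
    return saved
```

In the notebook this could be called on the run object returned by the experiment submission, for example `download_driver_logs(run)`. The Python traceback behind the non-zero exit code is usually near the end of the downloaded file.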

v-strudm-msft commented 3 years ago

Since there was no reply after our advice, we'll close this issue for now. If you still have a question, please reopen this issue. Thank you.