machine-learning-apps / ml-template-azure

Template for getting started with automated ML Ops on Azure Machine Learning
MIT License

HTTP404 after the PythonScriptStep is finished #16

Open amaphadk opened 3 years ago

amaphadk commented 3 years ago

Hi,

We submit a Python pipeline run through GitHub Actions, and the action detects a non-zero exit code even when the underlying Python script finishes with exit code 0. Even a step containing a single print statement finishes successfully, yet the action reports the run as failed.

For example, the Python step:

def main():
    print('AMAR Inside STEP 1 Choose Data Test')

Abbreviated Python code that builds the pipeline for submission. This is the script that the aml-run action calls to build the pipeline:

pipeline_steps = StepSequence(steps=[step_1_choose_data])
pipeline = Pipeline(workspace=workspace, steps=pipeline_steps)
pipeline.validate()
return pipeline

===

Individual step output

[2021-11-02T11:10:11.272611] The experiment completed successfully. Finalizing run...
Cleaning up all outstanding Run operations, waiting 900.0 seconds
3 items cleaning up...
Cleanup took 0.1468954086303711 seconds
[2021-11-02T11:10:11.547600] Finished context manager injector.
2021/11/02 11:10:13 Attempt 1 of http call to http://[REDACTED]/sendlogstoartifacts/status
2021/11/02 11:10:13 Send process info logs to master server succeeded
2021/11/02 11:10:13 Not exporting to RunHistory as the exporter is either stopped or there is no data. Stopped: false OriginalData: 3 FilteredData: 0.
2021/11/02 11:10:13 Process Exiting with Code: 0
2021/11/02 11:10:14 All App Insights Logs was sent successfully or the close timeout of 10 was reached

But the action output reports a failure:

Action output:

StepRun(STEP_1_Choose_Data) Execution Summary
==============================================
StepRun( STEP_1_Choose_Data ) Status: Finished
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/dotnetcore2/runtime.py", line 271, in attempt_get_deps
    blob_deps_to_file()
  File "/usr/local/lib/python3.8/site-packages/dotnetcore2/runtime.py", line 263, in blob_deps_to_file
    blob = request.urlopen(deps_url, context=ssl_context)
  File "/usr/local/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/local/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/code/main.py", line 240, in <module>
    main()
  File "/code/main.py", line 187, in main
    run.wait_for_completion(show_output=True)
  File "/usr/local/lib/python3.8/site-packages/azureml/pipeline/core/run.py", line 294, in wait_for_completion
    step_run.wait_for_completion(timeout_seconds=timeout_seconds - time_elapsed,
  File "/usr/local/lib/python3.8/site-packages/azureml/pipeline/core/run.py", line 736, in wait_for_completion
    return self._stream_run_output(timeout_seconds=timeout_seconds,
  File "/usr/local/lib/python3.8/site-packages/azureml/pipeline/core/run.py", line 827, in _stream_run_output
    print(final_details)
  File "/usr/local/lib/python3.8/site-packages/azureml/data/_loggerfactory.py", line 129, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/azureml/data/abstract_dataset.py", line 766, in __repr__
    steps = self._dataflow._get_steps()
  File "/usr/local/lib/python3.8/site-packages/azureml/data/_loggerfactory.py", line 129, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/azureml/data/abstract_dataset.py", line 218, in _dataflow
    dataprep().api._datastore_helper._set_auth_type(self._registration.workspace)
  File "/usr/local/lib/python3.8/site-packages/azureml/dataprep/api/_datastore_helper.py", line 185, in _set_auth_type
    get_engine_api().set_aml_auth(SetAmlAuthMessageArgument(auth_type, json.dumps(auth_value)))
  File "/usr/local/lib/python3.8/site-packages/azureml/dataprep/api/engineapi/api.py", line 19, in get_engine_api
    _engine_api = EngineAPI()
  File "/usr/local/lib/python3.8/site-packages/azureml/dataprep/api/engineapi/api.py", line 110, in __init__
    self._message_channel = launch_engine()
  File "/usr/local/lib/python3.8/site-packages/azureml/dataprep/api/engineapi/engine.py", line 333, in launch_engine
    dependencies_path = runtime.ensure_dependencies()
  File "/usr/local/lib/python3.8/site-packages/dotnetcore2/runtime.py", line 285, in ensure_dependencies
    if not attempt_get_deps():
  File "/usr/local/lib/python3.8/site-packages/dotnetcore2/runtime.py", line 279, in attempt_get_deps
    raise NotImplementedError(err_msg + '\n' + _unsupported_help_msg)
NotImplementedError: Linux distribution debian 11. does not have automatic support.
.NET Core 2.1 can still be used via dotnetcore2 if the required dependencies are installed.
Visit https://aka.ms/dotnet-install-linux for Linux distro specific .NET Core install instructions.
Follow your distro specific instructions to install `dotnet-runtime-*` and replace `*` with `2.1`.
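Reading the traceback, the step itself reports Status: Finished. The exception is raised afterwards, by print(final_details) inside _stream_run_output: printing the details calls a dataset's __repr__, which tries to launch the dataprep engine and fails on Debian 11. Below is a minimal, self-contained sketch of that mechanism as I understand it. LazyDataset, safe_repr and describe are hypothetical names used only for illustration, not part of the Azure ML SDK, and the "inputDatasets" key is an assumed placeholder.

```python
class LazyDataset:
    """Hypothetical stand-in for an object whose repr needs a missing runtime."""

    def __repr__(self):
        # Mirrors the traceback: building the repr needs an engine that cannot start.
        raise NotImplementedError(
            "Linux distribution debian 11. does not have automatic support."
        )


def safe_repr(value):
    """Return repr(value), or a placeholder string when __repr__ raises."""
    try:
        return repr(value)
    except Exception as exc:  # report the failure instead of propagating it
        return f"<unprintable {type(value).__name__}: {type(exc).__name__}>"


def describe(details):
    """Format a details dict so that one bad value cannot abort the print."""
    return {key: safe_repr(value) for key, value in details.items()}


# The run finished, but one value in the details cannot be printed.
final_details = {"status": "Finished", "inputDatasets": LazyDataset()}

try:
    print(final_details)  # the dict repr calls LazyDataset.__repr__, which raises
    print_failed = False
except NotImplementedError:
    print_failed = True

summary = describe(final_details)
print(summary)
```

If this reading is right, the non-zero exit code comes from the reporting code in the action's container and not from the pipeline step, which would explain why the step log ends with "Process Exiting with Code: 0".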

Happy to discuss details.