Open leonieroos opened 1 year ago
route to CXP team
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.
Author: | leonieroos |
---|---|
Assignees: | - |
Labels: | `Service Attention`, `Machine Learning`, `customer-reported`, `Auto-Assign` |
Milestone: | - |
Adding Service team to look into this.
@azureml-github Could you please look into this and provide an update?
Hi team, any news on this? Thank you
@wangchao1230 Can you help to triage this issue? Thank you.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @shbijlan.
Author: | leonieroos |
---|---|
Assignees: | - |
Labels: | `Service Attention`, `Machine Learning`, `customer-reported`, `ML-Pipelines`, `Auto-Assign` |
Milestone: | - |
@leonieroos Could you share your pipeline job name (id)?
A screenshot of the Job Overview panel will be helpful. Click the "Job overview" button on upper right of your current screenshot.
Hi @wangchao1230 job id: mango_turtle_zzbgpfx7r2
Thanks!
@leonieroos Hi, the syntax for splitting a one-line command across multiple lines in YAML should be:

    command: >-
      python aml_train.py
      --data ${{inputs.data}}
      --data_out ${{outputs.data_out}}
But in your YAML there is some leading whitespace before the arguments:

    command: >-
      python aml_train.py
        --data ${{inputs.data}}
        --data_out ${{outputs.data_out}}
Could you please try removing the leading whitespace before the arguments? In a folded block scalar (`>-`), lines indented deeper than the first line keep their line breaks, so they end up as `\n` in the command. You can refer to https://stackoverflow.com/questions/3790454/how-do-i-break-a-string-in-yaml-over-multiple-lines for more details; there is a table in the answer.
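To make the indentation point concrete, here is a minimal sketch (standard library only) that flags lines in the body of a folded `command: >-` block that are indented deeper than the first line. The helper name `more_indented_lines` and the sample lists are hypothetical, written for illustration; this is not part of the Azure ML CLI or SDK, and the exact indentation in the original pipeline file is assumed.

```python
def more_indented_lines(block_lines):
    """Return (line_number, text) for lines indented deeper than the first non-empty line.

    In a YAML folded scalar, such lines keep their line breaks instead of
    being joined to the previous line with a space.
    """
    base = None
    flagged = []
    for number, line in enumerate(block_lines, start=1):
        if not line.strip():
            continue  # blank lines do not set or exceed the base indentation
        indent = len(line) - len(line.lstrip(" "))
        if base is None:
            base = indent  # the first content line defines the block indentation
        elif indent > base:
            flagged.append((number, line.strip()))
    return flagged


# Body of the folded scalar with consistent indentation (folds into one line).
good = [
    "  python aml_train.py",
    "  --data ${{inputs.data}}",
    "  --data_out ${{outputs.data_out}}",
]

# Body with extra leading whitespace before the arguments (assumed layout).
bad = [
    "  python aml_train.py",
    "    --data ${{inputs.data}}",
    "    --data_out ${{outputs.data_out}}",
]

print(more_indented_lines(good))
print(more_indented_lines(bad))
```

An empty result means every line shares the block's indentation, so the whole command folds onto one line. Any flagged line would start on a new line in the resulting string.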
@leonieroos As for the UI not showing a status or error message, I am wondering whether it is a UI issue or a run history index refresh issue. Could you confirm whether the error message or status appears if you wait a few minutes and refresh the UI?
Hi @brynn-code, thank you for picking that up. I have changed it to exactly match the normal train step (which uses command instead of sweep, and works), and the same issue remains.
@wangchao1230, refreshing does not show anything different. However, the job I ran just now is stalled between the steps, and the pipeline job overview shows its status as not started. It has been stalled for 2 hours now:
It seems like I am missing a detail? The UI recognizes the sweep step as such in the canvas and shows the search space. When I leave out a variable from the component, it does raise an error about the component's command, so it seems to be seeing it as a command.
@leonieroos Could you please elaborate on 'changed it to match the normal train step'? The multi-line command issue is not related to the component type: whether the step is a command step or a sweep step, the 'command' field must be in the right format for execution.
So I changed the type from sweep to command in the pipeline, using the same component command, and that works once the search space is put back into inputs.
Dear team,
I have a pipeline with a sweep component that stopped working and gives an overall failure because the sweep step never initiates, which leaves me without an error message or logs.
The command with extension: az ml job create --file ./pipelines/pipeline_demandmodel_hp.yml
az version: { "azure-cli": "2.42.0", "azure-cli-core": "2.42.0", "azure-cli-telemetry": "1.0.8", "extensions": { "ml": "2.12.1" } } within the environment I have azure-ai-ml==1.1.0
I'm expecting the pipeline to produce child runs and trials for the parameters as it did a month ago, but instead it gets stuck, never initiating the sweep step at all, and after a while it will 'fail'. I tried with a registered data set as input as well as with the data passed on from the previous step (which completes with a green tick), and both have the same issue.
this is the sweep step:
#######
Component: