Closed kenanEkici closed 1 year ago
Can't repro this issue... @kenanEkici, could you please try submitting a pipeline job manually with force_rerun=True to see if it works? If it does, you could then schedule that submitted job to check whether the issue can be worked around.
Also, could you please let me know your SDK version?
The attribute works when submitting the pipeline manually. When you schedule the same pipeline, the attribute gets ignored and each run (except the first one) reuses cached output. I'm working with version 2.12.1.
Okay. Would it be convenient for you to share the schedule YAML and the pipeline job definition YAML? I need more info, as I'm not able to repro it.
Hi, I'm not able to share the artifacts. Instead, I've created a basic set of components and pipelines to see if I could reproduce my issue.
test_script.py
import argparse
import datetime
parser = argparse.ArgumentParser()
parser.add_argument("--test_input", type=str)
args = parser.parse_args()
test_input = args.test_input
print(test_input)
print(datetime.datetime.now())
component.yml
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
type: command
name: test_component
display_name: test_component
inputs:
  test_input:
    type: string
code: ./
environment: azureml:nec-env-dev:4
command: >-
  python test_script.py
  --test_input ${{inputs.test_input}}
pipeline.yml
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
display_name: test_pipeline
description: test_pipeline
experiment_name: test_experiment
inputs:
  test_input: "Helloworld"
settings:
  default_datastore: azureml:workspaceblobstore
  default_compute: azureml:cpucls-d-aml-weu-dac
  force_rerun: true
jobs:
  test_component:
    type: command
    component: ./component.yml
    inputs:
      test_input: ${{parent.inputs.test_input}}
test_schedule.yml
$schema: https://azuremlschemas.azureedge.net/latest/schedule.schema.json
name: test_schedule
display_name: test_schedule
description: test_schedule
trigger:
  type: cron
  expression: "*/20 * * * *"
  time_zone: "UTC"
create_job:
  job: ./pipeline.yml
First, I submitted the job manually:
az ml job create -f .\pipeline.yml
After the job finished on AML, I submitted it again:
az ml job create -f .\pipeline.yml
Result: the output was not reused, which is the expected behavior with force_rerun: true.
Then I scheduled this pipeline to run every 20 minutes using test_schedule.yml:
az ml schedule create -f .\test_schedule.yml
The scheduled pipeline run also reran fully and did not reuse the output of the previous run. This means the force_rerun attribute was taken into account, and I was not able to reproduce my own issue.
I will be gradually adding more attributes and source code from the original artifacts where I encountered the issue. I hope to identify what is causing the issue.
I'm closing this issue as I could not reproduce the bug.
@kenanEkici @brynn-code Hi, I would like to reopen this issue as I am facing the same problem.
sdk version: azure-ai-ml==1.10.0
I have a pipeline.yaml pipeline definition. When I load it with azure.ai.ml.load_component and run the following code:
from azure.ai.ml import Input, load_component

pipeline = load_component(source=pipeline_definition_path)

# Defining pipeline inputs
pipeline_job = pipeline(
    data_organizer_input_data=Input(path="azureml://datastores/test_datastore/paths/", mode="ro_mount"),
)
and then print out the pipeline and pipeline_job variables, you can see that although the settings section from pipeline.yaml is present in the pipeline variable, it is completely removed from the pipeline_job variable.
pipeline:
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
name: pipeline_test
display_name: Pipeline Test
description: Pipeline test
type: pipeline
inputs:
  data_organizer_input_data:
    type: uri_folder
    mode: ro_mount
jobs:
  data_organizer_job:
    type: command
    inputs:
      input_data:
        path: ${{parent.inputs.data_organizer_input_data}}
    component:
      $schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
      name: azureml_anonymous
      version: '1'
      display_name: Data Organizer
      description: Organizing the input Data Asset in a way that it can be fed to the model.
      type: command
      inputs:
        input_data:
          type: uri_folder
          mode: ro_mount
      outputs:
        output_data:
          type: uri_folder
          mode: rw_mount
      command: python data_organizer.py --input_data ${{inputs.input_data}} --output_data ${{outputs.output_data}}
      environment: azureml:my_env@latest
      code: <path_to_code>
      is_deterministic: true
    compute: azureml:my_compute_cluster
experiment_name: pipeline_test_experiment
settings:
  default_compute: azureml:my_OTHER_compute_cluster
  force_rerun: true
  continue_on_step_failure: false
pipeline_job:
type: pipeline
inputs:
  data_organizer_input_data:
    mode: ro_mount
    type: uri_folder
    path: azureml://datastores/test_datastore/paths/
component:
  $schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
  name: pipeline_test
  display_name: Pipeline Test
  description: Pipeline test
  type: pipeline
  inputs:
    data_organizer_input_data:
      type: uri_folder
      mode: ro_mount
  jobs:
    data_organizer_job:
      type: command
      inputs:
        input_data:
          path: ${{parent.inputs.data_organizer_input_data}}
      component:
        $schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
        name: azureml_anonymous
        version: '1'
        display_name: Data Organizer
        description: Organizing the input Data Asset in a way that it can be fed to the model.
        type: command
        inputs:
          input_data:
            type: uri_folder
            mode: ro_mount
        outputs:
          output_data:
            type: uri_folder
            mode: rw_mount
        command: python data_organizer.py --input_data ${{inputs.input_data}} --output_data ${{outputs.output_data}}
        environment: azureml:my_env@latest
        code: <path_to_code>
        is_deterministic: true
      compute: azureml:my_compute_cluster
Therefore, when I submit pipeline_job with ml_client.jobs.create_or_update(job=pipeline_job, ...), the created job uses the default values of the settings (force_rerun = False, continue_on_step_failure = True, etc.) instead of the values specified in pipeline.yaml. (Note: I am aware that we can configure settings via pipeline_job.settings.force_rerun = True, but I want my own settings (specified in pipeline.yaml) to be used by default when submitting a job, and to override them only when needed.)
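To make the dropped-settings behavior concrete, here is a minimal stdlib sketch. This is NOT the azure-ai-ml implementation, and the field list is illustrative; it just models what loading a pipelineJob YAML through a component loader effectively does: component-shaped fields survive, while job-only runtime fields such as settings are silently discarded.

```python
# Minimal sketch, NOT the azure-ai-ml implementation: loading a
# pipelineJob definition "as a component" keeps only component-shaped
# fields and silently drops job-only runtime fields such as `settings`.
JOB_ONLY_FIELDS = {"settings", "experiment_name", "compute"}  # illustrative list

def load_as_component(job_yaml: dict) -> dict:
    """Keep component fields, drop runtime-only fields."""
    return {k: v for k, v in job_yaml.items() if k not in JOB_ONLY_FIELDS}

job_def = {
    "type": "pipeline",
    "jobs": {"data_organizer_job": {"type": "command"}},
    "settings": {"force_rerun": True, "continue_on_step_failure": False},
}
component = load_as_component(job_def)
print("settings" in component)  # False: force_rerun never reaches the job
```

This matches what the printed dumps above show: the pipeline_job built from the loaded component no longer carries a settings section at all.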
The same issue occurs when I do a pipeline batch deployment.
$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: test-pipeline
description: Test
endpoint_name: test-endpoint
type: pipeline
component: ./pipeline.yaml
settings:
  continue_on_step_failure: False
  force_rerun: True
I am facing the same issue! Are there any news on that?
Still having the same issue! Any news?
Guys, just don't confuse the pipelineComponent and pipelineJob YAML schemas. Use the pipelineComponent schema if you want to register a pipeline that can later be triggered by its version (e.g. via a pipelineJob YAML), and use the pipelineJob schema if you want to create a pipeline job (i.e. a pipeline that actually executes), either from scratch or by referencing a registered pipeline.
Only the pipelineJob schema has runtime attributes such as force_rerun and continue_on_step_failure.
Also: for pipelineJob YAMLs use load_job() (from azure.ai.ml), and for pipelineComponent YAMLs use load_component().
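As a small sketch of that rule, you can tell the two apart by the $schema line of the YAML. The dispatch helper below is illustrative (not part of azure-ai-ml); load_job and load_component are the real azure.ai.ml function names.

```python
# Illustrative helper, not part of azure-ai-ml: map a YAML's $schema URL
# to the name of the azure.ai.ml loader you should call on that file.
def loader_for(schema_url: str) -> str:
    if schema_url.endswith("pipelineJob.schema.json"):
        return "load_job"        # runtime job: settings like force_rerun apply
    if schema_url.endswith("pipelineComponent.schema.json"):
        return "load_component"  # reusable component: no runtime settings
    raise ValueError(f"unrecognized schema: {schema_url}")

print(loader_for("https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json"))
# → load_job
```

The pipeline.yaml in this thread declares the pipelineJob schema, so per the advice above it should be loaded with load_job(), not load_component().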
Hi,
I have defined a pipeline through YAML ($schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json) and I am using the CLI to run and schedule the pipeline job in Azure Machine Learning studio. With consecutive runs, the pipeline does not rerun as the YAML definition and the input does not change. This behavior is expected.
However, I want to disable this such that the pipeline is forced to rerun. To achieve this, I have set the force_rerun attribute to True in my pipeline YAML definition.
Unfortunately, the pipeline keeps reusing the output of previous runs instead of being forced to rerun. It seems as though the attribute is simply not being considered. You can clearly see this in the Designer on Azure Machine Learning studio: the option to "Regenerate output" is greyed out, meaning it cannot be changed.![issue](https://user-images.githubusercontent.com/23141205/210814232-1486ff02-a223-4c87-bcfd-ba3701aecd0f.png)