hubmapconsortium / ingest-pipeline

Data ingest pipeline(s) for QA/metadata etl/post-processing
MIT License

Why isn't set_dataset_error firing for salmon_rnaseq_10x_sn? #617

Open. jswelling opened this issue 2 years ago

jswelling commented 2 years ago
Below is the detail dump (apparently the Airflow "Task Instance Details" page) for the stuck set_dataset_error run:

Dependencies Blocking Task From Getting Scheduled

Dependency | Reason
-- | --
Task Instance State | Task is in the 'skipped' state which is not a valid state for execution. The task must be cleared in order to be run.
Not Previously Skipped | Skipping because of previous XCom result from parent task maybe_keep_cwl2
Dagrun Running | Task instance's dagrun was not in the 'running' state but in the state 'success'.
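
The "Not Previously Skipped" row is the interesting one: set_dataset_error sits downstream of BranchPythonOperators, and a branch records the task id it followed as an XCom; any sibling task it did not follow is marked skipped, and that recorded decision keeps blocking the task even under trigger_rule='all_done'. A minimal sketch of the pattern (Airflow 1.10 imports; every name except maybe_keep_cwl2 and set_dataset_error is illustrative):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import BranchPythonOperator, PythonOperator

with DAG(
    "branch_skip_sketch",  # illustrative DAG id, not the real one
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,
) as dag:

    def choose_branch(**context):
        # On success the branch follows the next pipeline step; only on
        # failure does it pick set_dataset_error. The chosen task id is
        # written to XCom, and all other direct downstream tasks are skipped.
        step_ok = True  # placeholder for the real health check
        return "next_pipeline_step" if step_ok else "set_dataset_error"

    maybe_keep_cwl2 = BranchPythonOperator(
        task_id="maybe_keep_cwl2",
        python_callable=choose_branch,
        provide_context=True,
    )

    next_pipeline_step = DummyOperator(task_id="next_pipeline_step")

    set_dataset_error = PythonOperator(
        task_id="set_dataset_error",
        python_callable=lambda **kw: None,  # placeholder body
        provide_context=True,
        # 'all_done' tolerates failed/skipped upstreams, but it does not
        # override the skip that the branch's XCom records for this task.
        trigger_rule="all_done",
    )

    maybe_keep_cwl2 >> [next_pipeline_step, set_dataset_error]
```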
Task Instance Attributes

Attribute | Value
-- | --
dag_id | salmon_rnaseq_10x_sn
duration | None
end_date | 2022-06-25 19:14:49.488258+00:00
execution_date | 2022-06-24T20:08:20.535782+00:00
executor_config | {}
generate_command | <function TaskInstance.generate_command at 0x7f5d1a522d90>
hostname |  
is_premature | False
job_id | None
key | ('salmon_rnaseq_10x_sn', 'set_dataset_error', <Pendulum [2022-06-24T20:08:20.535782+00:00]>, 1)
log | <Logger airflow.task (INFO)>
log_filepath | /hive/users/hive/hubmap/hivevm193-prod/ingest-pipeline/src/ingest-pipeline/airflow/logs/salmon_rnaseq_10x_sn/set_dataset_error/2022-06-24T20:08:20.535782+00:00.log
log_url | http://hivevm193.psc.edu/admin/airflow/log?execution_date=2022-06-24T20%3A08%3A20.535782%2B00%3A00&task_id=set_dataset_error&dag_id=salmon_rnaseq_10x_sn
logger | <Logger airflow.task (INFO)>
mark_success_url | http://hivevm193.psc.edu/success?task_id=set_dataset_error&dag_id=salmon_rnaseq_10x_sn&execution_date=2022-06-24T20%3A08%3A20.535782%2B00%3A00&upstream=false&downstream=false
max_tries | 1
metadata | MetaData(bind=None)
next_try_number | 1
operator | PythonOperator
pid | None
pool | default_pool
pool_slots | 1
prev_attempted_tries | 0
previous_execution_date_success | 2022-06-24 19:59:43.918153+00:00
previous_start_date_success | 2022-06-25 18:17:02.466898+00:00
previous_ti | <TaskInstance: salmon_rnaseq_10x_sn.set_dataset_error 2022-06-24 19:59:43.918153+00:00 [skipped]>
previous_ti_success | <TaskInstance: salmon_rnaseq_10x_sn.set_dataset_error 2022-06-24 19:59:43.918153+00:00 [skipped]>
priority_weight | 3
queue | general_prod
queued_dttm | None
raw | False
run_as_user | None
start_date | 2022-06-25 19:14:49.488223+00:00
state | skipped
task | <Task(PythonOperator): set_dataset_error>
task_id | set_dataset_error
test_mode | False
try_number | 1
unixname | hive
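
One way to confirm which branch decision is on record for this run is to read the SkipMixin XCom directly on the scheduler host. A sketch, assuming 1.10's XCom.get_one and the 'skipmixin_key' key that SkipMixin uses:

```python
from airflow.models import XCom
from airflow.utils import timezone

# The skip decision that the "Not Previously Skipped" dep consults is
# stored under the SkipMixin XCom key ('skipmixin_key' in Airflow 1.10).
decision = XCom.get_one(
    execution_date=timezone.parse("2022-06-24T20:08:20.535782+00:00"),
    key="skipmixin_key",
    task_id="maybe_keep_cwl2",
    dag_id="salmon_rnaseq_10x_sn",
)
# Expected shape: {'followed': [...]} -- if 'set_dataset_error' is absent
# from the followed list, the dep keeps the task skipped.
print(decision)
```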
Task Attributes

Attribute | Value
-- | --
dag | <DAG: salmon_rnaseq_10x_sn>
dag_id | salmon_rnaseq_10x_sn
depends_on_past | False
deps | {<TIDep(Trigger Rule)>, <TIDep(Previous Dagrun State)>, <TIDep(Not Previously Skipped)>, <TIDep(Not In Retry Period)>}
do_xcom_push | True
downstream_list | [<Task(JoinOperator): join>]
downstream_task_ids | {'join'}
email | ['joel.welling@gmail.com']
email_on_failure | False
email_on_retry | False
end_date | None
execution_timeout | None
executor_config | {}
extra_links | []
global_operator_extra_link_dict | {}
inlets | []
lineage_data | None
log | <Logger airflow.task.operators (INFO)>
logger | <Logger airflow.task.operators (INFO)>
max_retry_delay | None
on_failure_callback | <function create_dataset_state_error_callback.<locals>.set_dataset_state_error at 0x7f5c8f796400>
on_retry_callback | None
on_success_callback | None
op_args | []
op_kwargs | {'dataset_uuid_callable': <function get_dataset_uuid at 0x7f5c9287a9d8>, 'ds_state': 'Error', 'message': 'An error occurred in salmon-rnaseq'}
operator_extra_link_dict | {}
operator_extra_links | ()
outlets | []
owner | hubmap
params | {}
pool | default_pool
pool_slots | 1
priority_weight | 1
priority_weight_total | 3
provide_context | True
queue | general_prod
resources | None
retries | 1
retry_delay | 0:01:00
retry_exponential_backoff | False
run_as_user | None
schedule_interval | None
shallow_copy_attrs | ('python_callable', 'op_kwargs')
sla | None
start_date | 2019-01-01T00:00:00+00:00
subdag | None
task_concurrency | None
task_id | set_dataset_error
task_type | PythonOperator
template_ext | []
template_fields | ('templates_dict', 'op_args', 'op_kwargs')
templates_dict | None
trigger_rule | all_done
ui_color | #ffefeb
ui_fgcolor | #000
upstream_list | [<Task(BranchPythonOperator): maybe_keep_cwl4>, <Task(BranchPythonOperator): maybe_keep_cwl3>, <Task(BranchPythonOperator): maybe_keep_cwl2>, <Task(BranchPythonOperator): maybe_keep_cwl1>]
upstream_task_ids | {'maybe_keep_cwl4', 'maybe_keep_cwl3', 'maybe_keep_cwl2', 'maybe_keep_cwl1'}
wait_for_downstream | False
weight_rule | downstream
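
For what it's worth, one reading of the attributes above (a hypothesis, not confirmed in this thread): set_dataset_error has four BranchPythonOperator parents, and the "Not Previously Skipped" dep is evaluated per parent, independently of trigger_rule. A condensed, runnable paraphrase of that check, modeled on not_previously_skipped_dep.py in Airflow 1.10.12 (not the verbatim source; it also ignores the 'skipped'-list variant):

```python
from typing import Callable, Dict, List, Optional

def blocking_branch_parent(
    task_id: str,
    branch_parent_ids: List[str],
    pull_skipmixin_xcom: Callable[[str], Optional[Dict]],
) -> Optional[str]:
    """Return the first branch parent whose recorded branch decision
    excludes task_id, or None if nothing blocks it.

    Modeled on NotPreviouslySkippedDep (Airflow 1.10.12): only the
    'skipmixin_key' XCom of *direct* branch parents is consulted, and the
    check runs independently of the task's trigger_rule.
    """
    for parent_id in branch_parent_ids:
        decision = pull_skipmixin_xcom(parent_id)
        if decision and "followed" in decision \
                and task_id not in decision["followed"]:
            # Mirrors the reason in the table at the top: "Skipping because
            # of previous XCom result from parent task <parent_id>".
            return parent_id
    return None

# Toy replay of a failure at the cwl3 stage: cwl1 and cwl2 succeeded and
# followed the success path (the 'pipeline_exec_*' ids are made up), cwl3
# failed and branched to set_dataset_error, and cwl4 never ran so it pushed
# no XCom. Parent order matches upstream_list in the dump above.
xcoms = {
    "maybe_keep_cwl1": {"followed": ["pipeline_exec_cwl2"]},
    "maybe_keep_cwl2": {"followed": ["pipeline_exec_cwl3"]},
    "maybe_keep_cwl3": {"followed": ["set_dataset_error"]},
}
print(blocking_branch_parent(
    "set_dataset_error",
    ["maybe_keep_cwl4", "maybe_keep_cwl3", "maybe_keep_cwl2", "maybe_keep_cwl1"],
    xcoms.get,
))  # -> 'maybe_keep_cwl2', matching the dependency table above
```

If that reading is right, it also predicts the asymmetry reported in the next comment: a failure at maybe_keep_cwl1 means the later branch operators never run and push no XCom, so nothing blocks set_dataset_error, whereas a failure at maybe_keep_cwl3 leaves the success-path XComs of cwl1 and cwl2 in place, and either one is enough to keep the task skipped.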
jswelling commented 2 years ago

It looks like set_dataset_error fires correctly when the workflow fails at the maybe_keep_cwl1 step, but fails to trigger when the workflow fails at the maybe_keep_cwl3 step. Contrast the column just to the left of the 'June' label with the columns to the right of 'June' in the screenshot below.
[Screenshot: Airflow per-run task-state columns for salmon_rnaseq_10x_sn around the 'June' label]

jswelling commented 2 years ago

This may be because the version of Airflow in use was 1.10.12, while our default is 1.10.15. Verify that the problem still exists now that 1.10.15 has been deployed on PROD.
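
A sketch of one way to re-check this after the upgrade, using the 1.10 TaskInstance API on the scheduler host (it re-evaluates the scheduling deps for the stuck run without touching it):

```python
from airflow.models import DagBag, TaskInstance
from airflow.utils import timezone

# Re-evaluate the scheduling deps for the stuck run under the upgraded
# Airflow; if the "Not Previously Skipped" dep still fails, the problem
# was not specific to 1.10.12.
dag = DagBag().get_dag("salmon_rnaseq_10x_sn")
task = dag.get_task("set_dataset_error")
ti = TaskInstance(task, timezone.parse("2022-06-24T20:08:20.535782+00:00"))
for dep_status in ti.get_failed_dep_statuses():
    print(dep_status)
```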