Open mwoods-familiaris opened 21 hours ago
PR: https://github.com/apache/airflow/pull/43106 will also fix this issue
Apache Airflow Provider(s)
databricks
Versions of Apache Airflow Providers
apache-airflow-providers-databricks==6.13.*
Apache Airflow version
2.10.2
Operating System
Debian GNU/Linux 12 (bookworm)
Deployment
Astronomer
Deployment details
No response
What happened
_get_databricks_task_id only cleanses the task id, ref:
https://github.com/apache/airflow/blob/a9242844706ca117f86d22092109939dd56435ee/providers/src/airflow/providers/databricks/plugins/databricks_workflow.py#L67
https://github.com/apache/airflow/blob/a9242844706ca117f86d22092109939dd56435ee/providers/src/airflow/providers/databricks/operators/databricks.py#L1077
However, the dag_id may also contain ".", so the replacement of "." with "__" should be applied to the whole string, not just the task id portion. Otherwise, periods in the DAG name result in errors, because the invalid characters are silently stripped by Databricks: the task key on the Databricks side is myairflowdagwithperiods__my_airflow_task rather than my.airflow.dag.with.periods__my_airflow_task.
What you think should happen instead
The replacement of "." with "__" should be applied to the whole task key / run name string, not just the task id portion.
How to reproduce
Use the affected operator(s), e.g. DatabricksNotebookOperator, on a DAG which contains "." in the dag_id.
Anything else
Every time
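To make the mismatch concrete, here is a minimal standalone sketch that does not import Airflow. The names current_task_key, proposed_task_key and as_seen_by_databricks are hypothetical and only mirror the key format described in this issue; they are not the provider's actual code. The period-stripping on the Databricks side is modelled purely from the behaviour reported above, so treat it as an assumption.

```python
def current_task_key(dag_id: str, task_id: str) -> str:
    # Behaviour described in this issue: only the task_id portion is cleansed.
    return f"{dag_id}__{task_id.replace('.', '__')}"


def proposed_task_key(dag_id: str, task_id: str) -> str:
    # Proposed behaviour: cleanse the whole key, dag_id included.
    return f"{dag_id}__{task_id}".replace(".", "__")


def as_seen_by_databricks(task_key: str) -> str:
    # Assumption taken from this report: Databricks silently drops periods.
    return task_key.replace(".", "")


dag_id = "my.airflow.dag.with.periods"
task_id = "my_airflow_task"

# Current behaviour: the key Airflow uses differs from the key Databricks stores.
sent = current_task_key(dag_id, task_id)
print(sent)                         # my.airflow.dag.with.periods__my_airflow_task
print(as_seen_by_databricks(sent))  # myairflowdagwithperiods__my_airflow_task

# Proposed behaviour: no periods remain, so both sides agree on the key.
fixed = proposed_task_key(dag_id, task_id)
print(fixed)                                  # my__airflow__dag__with__periods__my_airflow_task
print(as_seen_by_databricks(fixed) == fixed)  # True
```

With the whole string cleansed, the key contains no characters for Databricks to strip, so a lookup by task key would match on both sides.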
Are you willing to submit PR?
Code of Conduct