Closed eladkal closed 1 year ago
DbtCloudRunJobOperator raises:
[2023-09-08, 21:36:05 UTC] {base.py:152} ERROR - OpenLineage provider method failed to extract data from provider.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/openlineage/extractors/base.py", line 137, in _get_openlineage_facets
    facets: OperatorLineage = get_facets_method(*args)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/dbt/cloud/operators/dbt.py", line 229, in get_openlineage_facets_on_complete
    return generate_openlineage_events_from_dbt_cloud_run(operator=self, task_instance=task_instance)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/dbt/cloud/utils/openlineage.py", line 60, in generate_openlineage_events_from_dbt_cloud_run
    run_id=operator.run_id, account_id=operator.account_id, include_related=["run_steps,job"]
AttributeError: 'DbtCloudRunJobOperator' object has no attribute 'run_id'
I'll raise an issue and try to solve this.
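For readers unfamiliar with this kind of failure: one common way such an AttributeError arises is that an attribute is assigned only while the task runs, and a later hook reads it on an instance where that assignment never happened. The traceback alone does not establish that this is the cause here. The sketch below only illustrates the pattern and a defensive read; `JobOperatorSketch`, `describe_run`, and the run id value are hypothetical and are not part of the dbt Cloud provider.

```python
from typing import Optional


class JobOperatorSketch:
    """Hypothetical stand-in for an operator whose run_id exists only after execute()."""

    def __init__(self, account_id: int) -> None:
        self.account_id = account_id
        # run_id is deliberately not set here; it appears only in execute().

    def execute(self) -> int:
        # Made-up run id, standing in for whatever a real job trigger would return.
        self.run_id = 12345
        return self.run_id


def describe_run(operator: JobOperatorSketch) -> Optional[dict]:
    """Return run details, or None when the operator never recorded a run_id."""
    # getattr with a default avoids the AttributeError seen in the traceback.
    run_id = getattr(operator, "run_id", None)
    if run_id is None:
        return None
    return {"run_id": run_id, "account_id": operator.account_id}
```

Reading `operator.run_id` directly on a fresh `JobOperatorSketch` raises the same kind of AttributeError, while `describe_run` returns None until `execute()` has run.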
@task decorated operators, but I don't think that's a big deal since I don't think these kinds of tasks by themselves are that useful for lineage data, so I think this one is fine. @mobuchowski what do you think?
@RNHTTR what if you disable _PythonDecoratedOperator?
I checked all my 101 changes, and they are all present in the RC.
Tested https://github.com/apache/airflow/pull/33825, https://github.com/apache/airflow/pull/33822, https://github.com/apache/airflow/pull/34098
Will we be able to include documentation changes like https://github.com/apache/airflow/pull/34104, https://github.com/apache/airflow/pull/34103, https://github.com/apache/airflow/pull/34102, https://github.com/apache/airflow/pull/34101, https://github.com/apache/airflow/pull/34097, https://github.com/apache/airflow/pull/34096, https://github.com/apache/airflow/pull/34095, https://github.com/apache/airflow/pull/34094, https://github.com/apache/airflow/pull/34074, https://github.com/apache/airflow/pull/34073?
Tested my change https://github.com/apache/airflow/pull/34018 in the Google RC 10.8.0rc1. It works fine, but it also depends on common-sql provider 1.7.2rc1 for the change in the same PR. If the common-sql provider is not updated, it fails. How do we handle cross-provider dependency bumps during releases? Is this handled automatically, or do we need a manual minimum dependency bump in the Google RC so that it requires common-sql>=1.7.2?
it also has a dependency on common-sql provider 1.7.2.rc1 for the change in same PR. If the common-sql provider is not updated then it fails.
IMHO, if the operator is broken until common-sql is upgraded to the latest version, then we should consider it a breaking change and fix it. The min version of common-sql in the google provider is 1.3.1; bumping it to 1.7.2 could fix the issue, and I think that's safe as we still use the same major version.
Yep. We should bump min version of common-sql
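To make the effect of the proposed minimum concrete, here is a minimal sketch of the comparison a `common-sql>=1.7.2` pin implies. It uses a deliberately simplified version key (dotted numbers plus an optional `rcN` suffix), not full PEP 440 parsing; real tooling would normally rely on the `packaging` library instead. The helper names `version_key` and `satisfies_minimum` are hypothetical.

```python
import re
from typing import Tuple

# Simplified: dotted release numbers with an optional "rcN" suffix
# (written "1.7.2rc1" or "1.7.2.rc1"). This is NOT a full PEP 440 parser.
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:\.?rc(\d+))?$")


def version_key(version: str) -> Tuple[Tuple[int, ...], int, int]:
    """Build a sortable key in which rc builds sort before the final release."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    rc = match.group(2)
    if rc is None:
        return (release, 1, 0)  # final release
    return (release, 0, int(rc))  # release candidate


def satisfies_minimum(installed: str, minimum: str) -> bool:
    """True when the installed version is at least the required minimum."""
    return version_key(installed) >= version_key(minimum)
```

Under this ordering the current minimum 1.3.1 does not satisfy a 1.7.2 floor, and neither does 1.7.2rc1, because a release candidate sorts before its final release.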
Thank you @hussein-awala and @potiuk for your quick inputs and suggestions. I have created a PR now to bump the min version https://github.com/apache/airflow/pull/34257.
cc: @eladkal Sorry, I didn't realise this earlier. What would be the steps for releasing the Google RC now, as it may depend on PR https://github.com/apache/airflow/pull/34257?
Checked all my changes are in (mostly dependencies). All looks good.
dbt.cloud will be excluded from this wave due to issues found.
Please keep testing the rest of the providers.
The https://github.com/apache/airflow/pull/33959 bug does not affect the OpenLineage provider, just the dbt one, so we should only exclude that one. I will double-check the OL provider.
EDIT: https://github.com/apache/airflow/pull/34270 should fix this issue, maybe release from RC2?
33959 bug does not affect OpenLineage provider, just dbt one - we should only exclude this. I will doublecheck OL provider.
EDIT: #34270 should fix this issue, maybe release from RC2?
I will cut rc2 for dbt.cloud
Body
Issue title: Status of testing Providers that were prepared on September 08, 2023
I have a kind request for all the contributors to the latest provider packages release. Could you please help us to test the RC versions of the providers?
The guidelines on how to test providers can be found in Verify providers by contributors.
Let us know in a comment whether the issue is addressed.
These are the providers that require testing, as some substantial changes were introduced:
Provider airbyte: 3.3.2rc1
Provider alibaba: 2.5.3rc1
Provider amazon: 8.7.0rc1
  waiter_max_attempts default value in EcsRunTaskOperator (#33712): @mjsqu
  AppflowHook (#33881): @Taragolis
  aws.session_factory part of Amazon provider configuration documentation (#33960): @Taragolis
Provider apache.beam: 5.2.3rc1
Provider apache.drill: 2.4.4rc1
Provider apache.flink: 1.1.3rc1
Provider apache.hdfs: 4.1.1rc1
Provider apache.hive: 6.1.6rc1
  format in Airflow providers (#33752): @hussein-awala
Provider apache.impala: 1.1.3rc1
Provider apache.livy: 3.5.4rc1
Provider apache.pig: 4.1.2rc1
Provider apache.pinot: 4.1.4rc1
Provider apprise: 1.0.2rc1
Provider celery: 3.3.4rc1
Provider cncf.kubernetes: 7.5.1rc1
  cached_property for hook in SparkKubernetesSensor (#34106): @josh-fell
Provider common.sql: 1.7.2rc1
Provider databricks: 4.5.0rc1
  format in Airflow providers (#33752): @hussein-awala
Provider dbt.cloud: 3.3.0rc1
Provider docker: 3.7.5rc1
Provider elasticsearch: 5.0.2rc1
  format in Airflow providers (#33752): @hussein-awala
Provider exasol: 4.2.5rc1
Provider facebook: 3.2.2rc1
Provider ftp: 3.5.2rc1
Provider google: 10.8.0rc1
  BigQueryHook.get_pandas_df (#33819): @RyuSA
  format in Airflow providers (#33752): @hussein-awala
Provider hashicorp: 3.4.3rc1
Provider http: 4.5.2rc1
Provider imap: 3.3.2rc1
Provider influxdb: 2.2.3rc1
Provider jdbc: 4.0.2rc1
Provider jenkins: 3.3.2rc1
Provider microsoft.azure: 7.0.0rc1
  AzureDataFactoryPipelineRunStatusAsyncSensor class (#34036): @eladkal
  LocalToAzureDataLakeStorageOperator class (#34035): @eladkal
Provider microsoft.psrp: 2.3.2rc1
Provider mongo: 3.2.2rc1
Provider mysql: 5.3.1rc1
Provider neo4j: 3.3.3rc1
Provider openlineage: 1.1.0rc1
  get_custom_facets (#34122): @JDarDagran
Provider oracle: 3.7.4rc1
Provider pagerduty: 3.3.1rc1
Provider postgres: 5.6.1rc1
Provider presto: 5.1.4rc1
Provider redis: 3.3.2rc1
Provider salesforce: 5.4.3rc1
Provider samba: 4.2.2rc1
Provider sftp: 4.6.1rc1
Provider slack: 8.1.0rc1
Provider smtp: 1.3.2rc1
Provider snowflake: 5.0.1rc1
Provider ssh: 3.7.3rc1
Provider tableau: 4.2.2rc1
Provider trino: 5.3.1rc1
Provider vertica: 3.5.2rc1
Provider zendesk: 4.3.2rc1
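For anyone preparing a test environment from the list above, the following sketch turns lines of the form "Provider <id>: <version>" into pinned pip requirements. It assumes the usual naming of provider distributions on PyPI, `apache-airflow-providers-` followed by the provider id with dots replaced by hyphens. The function names `pip_requirement` and `requirements_from_listing` are hypothetical helpers, not part of Airflow.

```python
import re
from typing import List

# Matches lines such as "Provider dbt.cloud: 3.3.0rc1"; other lines are ignored.
_PROVIDER_LINE = re.compile(r"^Provider\s+([a-z0-9.]+):\s+(\S+)$")


def pip_requirement(provider_id: str, version: str) -> str:
    """Build a pinned requirement, e.g. for dbt.cloud at 3.3.0rc1."""
    package = "apache-airflow-providers-" + provider_id.replace(".", "-")
    return f"{package}=={version}"


def requirements_from_listing(listing: str) -> List[str]:
    """Collect one pinned requirement per provider line, in listing order."""
    requirements = []
    for line in listing.splitlines():
        match = _PROVIDER_LINE.match(line.strip())
        if match is not None:
            requirements.append(pip_requirement(match.group(1), match.group(2)))
    return requirements
```

Each resulting string can be passed to `pip install`; the exact pin matters because pip skips pre-release versions such as release candidates unless they are requested explicitly.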
All users involved in the PRs: @Taragolis @fabiogra @vijay-jangir @darkag @okayhooni @moiseenkov @RyuSA @potiuk @yermalov-here @eladkal @bkossakowska @hussein-awala @pankajastro @adam133 @kristopherkane @pankajkoti @melugoyal @pierrejeambrun @JDarDagran @GeoffroyDFox @RNHTTR @mjsqu @dstandish @josh-fell @wolfdn @Lee-W