apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Some hacks in Google provider #13351

Open mik-laj opened 3 years ago

mik-laj commented 3 years ago

Hello,

While updating the Google libraries, we found some bugs in the google-cloud-* libraries. Fortunately, we managed to work around some of them using various tricks and hacks. However, these workarounds may cause problems after future library updates; for example, a hack may not support new features that an update provides. We should report the problems to the library authors, record where hacks are used, and track the progress of the tickets so that we can remove the hacks in the future.

Each hack is marked with a comment that has the prefix HACK:, so you can find them easily, but here is also a list of these issues.

google-cloud-python:
https://github.com/apache/airflow/blob/9a1d3820d6f1373df790da8751f25e723f9ce037/airflow/providers/google/cloud/hooks/datacatalog.py#L1024
https://github.com/apache/airflow/blob/9a1d3820d6f1373df790da8751f25e723f9ce037/airflow/providers/google/cloud/hooks/datacatalog.py#L1101
https://github.com/apache/airflow/blob/9235af594c54d4a2991c494929deef37eee33ac6/airflow/providers/google/cloud/hooks/datacatalog.py#L232-L234 (unmerged PR: https://github.com/apache/airflow/pull/13224)
https://github.com/apache/airflow/blob/9235af594c54d4a2991c494929deef37eee33ac6/airflow/providers/google/cloud/hooks/datacatalog.py#L295-L297 (unmerged PR: https://github.com/apache/airflow/pull/13224)

Ticket: https://github.com/googleapis/python-datacatalog/issues/84

google-cloud-bigquery-datatransfer: https://github.com/apache/airflow/blob/10ffc7bfd58f3fa0eea54262f20aeffe1ab72592/airflow/providers/google/cloud/hooks/bigquery_dts.py#L82-L83 (unmerged PR: https://github.com/apache/airflow/pull/13337/files)

Ticket: https://github.com/googleapis/python-bigquery-datatransfer/issues/90
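Since every workaround carries a HACK: comment, the remaining ones can be listed with a short script. The following is only a minimal sketch: the function name `find_hacks`, the `HackMarker` tuple, and the assumption that each marker looks like `# HACK: ...` on a single line are illustrative and not part of Airflow.

```python
import re
from pathlib import Path
from typing import Iterator, NamedTuple

# Assumption: a hack is marked by a Python comment of the form "# HACK: <note>".
HACK_PATTERN = re.compile(r"#\s*HACK:\s*(.*)")


class HackMarker(NamedTuple):
    """One HACK: comment found in a source file (hypothetical helper type)."""

    path: Path
    line_number: int
    note: str


def find_hacks(root: Path) -> Iterator[HackMarker]:
    """Yield every HACK: comment in the .py files below root."""
    for path in sorted(root.rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            # Skip files that cannot be read as UTF-8 text.
            continue
        for line_number, line in enumerate(text.splitlines(), start=1):
            match = HACK_PATTERN.search(line)
            if match:
                yield HackMarker(path, line_number, match.group(1).strip())
```

Calling `find_hacks(Path("airflow/providers/google"))` from a checkout of the repository and printing each result as `path:line_number: note` would give a list that can be compared against the tickets above.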

Best regards, Kamil Breguła

eladkal commented 2 years ago

@pierrejeambrun maybe you'll be interested in trying to solve some of the hacks? (probably we can't solve all of them)

pierrejeambrun commented 2 years ago

Hello @eladkal,

Thank you for pointing me to this issue. I will take a look and see if we can remove some of them. :)

pierrejeambrun commented 2 years ago

Hello @eladkal,

I have taken a look at it and I do not see an easy way to fix this for now:

I think we should leave it as is for now, let me know what you think.

Best,

eladkal commented 2 years ago

Yeah, we have a lot of work to do on updating the Google provider; see https://github.com/apache/airflow/issues/22111#issuecomment-1063433467

You can send me a message on Slack if you need help finding another issue to work on.

github-actions[bot] commented 12 months ago

This issue has been automatically marked as stale because it has been open for 365 days without any activity. There have been several Airflow releases since the last activity on this issue. Please recheck the report against the latest Airflow version and let us know if the issue is still reproducible. The issue will be closed in the next 30 days if no further activity occurs from the issue author.

eladkal commented 12 months ago

We recently updated many of the Google library versions. I think it's a good time to revisit whether we can address this.