apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
40.54k stars 15.15k forks source link

Status of testing Providers that were prepared on May 14, 2025 #50599

Closed eladkal closed 3 weeks ago

eladkal commented 1 month ago

Body

I have a kind request for all the contributors to the latest provider distributions release. Could you please help us to test the RC versions of the providers?

The guidelines on how to test providers can be found in

Verify providers by contributors

Let us know in the comments, whether the issue is addressed.

These are providers that require testing as there were some substantial changes introduced:

Provider amazon: 9.8.0rc1

All users involved in the PRs: @ellisms @jscheffl @ashb @Dukastlik @gschuurman @dstandish @olegkachur-e @kacpermuda @dabla @pierrejeambrun @kandharvishnu @AutomationDev85 @majorosdonat @eladkal @potiuk @mobuchowski @sunank200 @yannla

Committer

pierrejeambrun commented 1 month ago

How can fab version be 2.0.1 ? PRs mentioned are targetting 2.0.3.

Also is it possible to not release fab 2.0.3 yet, so I have time to incorportate all the config changes that needs to impose a lower bound on airflow core. (core also has a lower bound on fab provider, maybe we cut it only before 3.0.2 ?)

eladkal commented 1 month ago

How can fab version be 2.0.1 ? PRs mentioned are targetting 2.0.3.

Also is it possible to not release fab 2.0.3 yet, so I have time to incorportate all the config changes that needs to impose a lower bound on airflow core. (core also has a lower bound on fab provider, maybe we cut it only before 3.0.2 ?)

The fact that contributors set versions is not ideal. We do that to make the CI green but ideally we should not. Release manager decide on the version and it's reviewed prior to release. In this case the PRs are more features than minor misc. There are 72 hours to test if you believe by then release shouldn't go through you can cast -1 vote.

pierrejeambrun commented 1 month ago

Are you going to manually edit "version_added" so it's targetted for 2.0.1 ?

My point was just to avoid going through another lower bound bump, but we can do that again.

eladkal commented 1 month ago

Are you going to manually edit "version_added" so it's targetted for 2.0.1 ?

My point was just to avoid going through another lower bound bump, but we can do that again.

The rc is 2.1.0 not 2.0.1 I changed the 2.0.3 settings to 2.1.0 when I prepared the release see https://github.com/apache/airflow/pull/50531

pierrejeambrun commented 1 month ago

Got it thanks for the details, sorry I'm not familiar with the provider release process, I just wanted to make sure things were updated.

eladkal commented 1 month ago

Got it thanks for the details, sorry I'm not familiar with the provider release process, I just wanted to make sure things were updated.

No worries. If there is any bug found / you wish to exclude fab to allow more testing time just cast -1 vote.

kacpermuda commented 1 month ago

All my PRs work well, also #49237 is ok.

sjyangkevin commented 4 weeks ago

Provider cohere: 1.5.0rc1

Test cases

Summary The PR works well.

If the implementation below is in the DAG code, since the get_conn is updated to a method #50470 , this code will result in error. co = CohereHook(conn_id=cohere_conn_id).get_conn Instead, the following will work. co = CohereHook(conn_id=cohere_conn_id).get_conn()

Not related to #50470, the system test (airflow/providers/cohere/tests/system/cohere/example_cohere_embedding_operator.py), the operator is having issue to push embedding to XCOM, resulting in TypeError: cannot serialize object of type <class 'cohere.types.embed_by_type_response_embeddings.EmbedByTypeResponseEmbeddings'>

Image

Please let me know if there is any fix needed.

potiuk commented 4 weeks ago

Got it thanks for the details, sorry I'm not familiar with the provider release process, I just wanted to make sure things were updated.

I think when contributor bumps the version, release manager should simply have a say - we rarely do that "quickly" so that there is no time for it. In this case we discussed it extensively in https://github.com/apache/airflow/pull/50208 but the problem is that we have not explicitly called @eladkal to make his call - we usually do in such cases.

And we should simply do it in the future. That was simply not a good idea to merge it before getting @eladkal 's OK.

We should simply continue doing it in the future.

potiuk commented 4 weeks ago

(And just to be clear - that was also my mistake - I usually remember calling @eladkal in such cases)l

eladkal commented 4 weeks ago

I marked it as feature release because it included Add ProxyFix Middleware for flask app builder which suggested it's a feature.

amoghrajesh commented 4 weeks ago

Not related to https://github.com/apache/airflow/pull/50470, the system test (airflow/providers/cohere/tests/system/cohere/example_cohere_embedding_operator.py), the operator is having issue to push embedding to XCOM, resulting in TypeError: cannot serialize object of type <class 'cohere.types.embed_by_type_response_embeddings.EmbedByTypeResponseEmbeddings'>

@sjyangkevin this is happening because for airflow 3, we use from airflow.serialization.serde import serialize module to serialize the data that is being sent to xcoms. The case seems to be that the EmbedByTypeResponseEmbeddings is a pydantic model which our serialize library doesn't know how to handle.

amoghrajesh commented 4 weeks ago

Tested my changes and ran a couple of other dags.

  1. https://github.com/apache/airflow/pull/49398 (dbt)
  2. https://github.com/apache/airflow/pull/49398 (microsoft)
  3. https://github.com/apache/airflow/pull/50521 for a CI fix
  4. https://github.com/apache/airflow/pull/50301 is for deprecation, works as expected. I see the deprecation
amoghrajesh commented 4 weeks ago

@sjyangkevin can you please confirm if the xcom push part worked in 2.10.x?

mvfc commented 4 weeks ago

tested all azure changes, all look good

gschuurman commented 4 weeks ago

49942 is not testable right now due to the airflow package requirement.

sjyangkevin commented 4 weeks ago

@sjyangkevin can you please confirm if the xcom push part worked in 2.10.x?

Thanks for the feedback. Will check it.

ellisms commented 4 weeks ago

50321 looks good.

potiuk commented 4 weeks ago

A comment for cncf.kubernetes -> we might want to consider rc2 with https://github.com/apache/airflow/pull/50651 in #41968 -> it turned out I broke cncf.kubernetes for Airlfow 2 and 10.4.3 is already broken, so technically speaking it's not a regression from 10.4.3 - however it makes the cncf.kubernetes unusable for Airflow 2, so might be good reason to include the fix.

DjVinnii commented 4 weeks ago

I just tested the KubernetesPodOperator for #50223. Unfortunately it looks like only configuring automount_service_account_token=False in the task works, but using the value from pod_template_file doesn't and still falls back to the default.

Update: It seems like the pod_template_file is used as a base and is overridden with the value from automount_service_account_token. Should this be automount_service_account_token: bool | None = None instead of automount_service_account_token: bool = True? And should create a fix for this?

simonprydden commented 4 weeks ago

anyone have access to pagerduty to test my change? #48919

guan404ming commented 4 weeks ago

Tested all my changes, all look good to me. Thanks @eladkal, this is a massive release.

potiuk commented 4 weeks ago

All good with my changes!

jscheffl commented 4 weeks ago

I explicitly tested the changes in Edge3 with our deployment in Bosch as well as I did a local breeze setup in Airflow 2.10.5 and current main - all seems to be working fine.

Wow, otherwise: massive release!

aditya-emmanuel commented 4 weeks ago

Tested out SqltoS3Operator:- https://github.com/apache/airflow/pull/50287 with different file formats and compression (csv,parquet,gzip). Changes look good. Thanks @james5418!

sjyangkevin commented 4 weeks ago

@sjyangkevin can you please confirm if the xcom push part worked in 2.10.x?

@amoghrajesh I iterated through multiple provider versions, the serialization issue was introduced after 1.4.0.

eladkal commented 3 weeks ago

It seems like the pod_template_file is used as a base and is overridden with the value from automount_service_account_token. Should this be automount_service_account_token: bool | None = None instead of automount_service_account_token: bool = True? And should create a fix for this?

@DjVinnii seems like a good idea. Can you raise PR?

eladkal commented 3 weeks ago

Thank you everyone. Providers are released. cncf.kubernetes is excluded from this wave.

I invite everyone to help improve providers for the next release, a list of open issues can be found here.

amoghrajesh commented 3 weeks ago

@sjyangkevin can you please confirm if the xcom push part worked in 2.10.x?

@amoghrajesh I iterated through multiple provider versions, the serialization issue was introduced after 1.4.0.

@sjyangkevin from the history of it, it looks like the cohere provider moved to pydantic based xcom response in 1.4.0? We use the serializers that we have built in https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/serialization/serde.py for serialization + deserialization of xcoms in airflow 3. Seems to me that the serializers arent able to encode the object right.

Feel free to create an issue, we can chat there. But pls confirm if the 1.4.0 release time frame changed the object type

sjyangkevin commented 3 weeks ago

@sjyangkevin can you please confirm if the xcom push part worked in 2.10.x?

@amoghrajesh I iterated through multiple provider versions, the serialization issue was introduced after 1.4.0.

@sjyangkevin from the history of it, it looks like the cohere provider moved to pydantic based xcom response in 1.4.0? We use the serializers that we have built in https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/serialization/serde.py for serialization + deserialization of xcoms in airflow 3. Seems to me that the serializers arent able to encode the object right.

Feel free to create an issue, we can chat there. But pls confirm if the 1.4.0 release time frame changed the object type

@amoghrajesh let me double check and will create an issue once confirm.

sjyangkevin commented 3 weeks ago

Not related to #50470, the system test (airflow/providers/cohere/tests/system/cohere/example_cohere_embedding_operator.py), the operator is having issue to push embedding to XCOM, resulting in TypeError: cannot serialize object of type <class 'cohere.types.embed_by_type_response_embeddings.EmbedByTypeResponseEmbeddings'>

@sjyangkevin this is happening because for airflow 3, we use from airflow.serialization.serde import serialize module to serialize the data that is being sent to xcoms. The case seems to be that the EmbedByTypeResponseEmbeddings is a pydantic model which our serialize library doesn't know how to handle.

Verify the issue and the return type is changed in Cohere 1.4.2. Issue created https://github.com/apache/airflow/issues/50867