apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Task]: Update the minor version of protobuf library in the upper bound prior to Beam release. #25590

Open AnandInguva opened 1 year ago

AnandInguva commented 1 year ago

What needs to happen?

If a Beam dependency has a flexible upper bound, users will download the most recent compatible version of that dependency at SDK installation time. Over time, the version used at job submission may become newer than the version installed in a released Beam container. Given that forward compatibility of the protobuf library is not guaranteed, the pipeline may fail.

To mitigate this, the protobuf library should be specified in install_requires with a tight upper bound that limits it to the most recently released minor version. But if we depend on an old version of a library, it will inconvenience users. Therefore, we should periodically update the upper bound we set, at least once per release cycle.
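
As a rough illustration of the "tight upper bound" rule described above, the sketch below derives the bound from the latest released version and checks whether an existing bound is still current. The function names are hypothetical and not part of Beam; it assumes plain `major.minor.patch` version strings.

```python
def tight_upper_bound(latest_release: str) -> str:
    """Return an exclusive bound that admits only the latest minor series.

    Assumes a plain "major.minor.patch" string (no rc/dev suffix in the
    first two components).
    """
    major, minor = (int(part) for part in latest_release.split(".")[:2])
    return f"<{major}.{minor + 1}.0"


def bound_is_current(upper_bound: str, latest_release: str) -> bool:
    """True if the configured bound already matches the latest release."""
    return upper_bound == tight_upper_bound(latest_release)


if __name__ == "__main__":
    # Bound derived from a 4.25.x release.
    print(tight_upper_bound("4.25.1"))
    # A bound left at <4.24.0 would be stale once 4.25.1 is out.
    print(bound_is_current("<4.24.0", "4.25.1"))
```

A release manager could run a check like this once per release cycle; it only compares strings and does not look at setup.py or PyPI itself.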

Issue Priority

Priority: 3 (nice-to-have improvement)

Issue Components

AnandInguva commented 1 year ago

@damccorm can you add the milestone 2.48.0 to catch this for the next release?

damccorm commented 1 year ago

The next release is 2.47.0 (the current one is 2.46.0)

tvalentyn commented 1 year ago

cc: @tvalentyn

damccorm commented 1 year ago

This is still up to date, moving to 2.48

Abacn commented 1 year ago

This is still up to date (4.23.3<4.24.0), moving to 2.50

lostluck commented 1 year ago

2.50 release manager here. This issue is currently tagged for the 2.50.0 release, which cuts in a week on August 9th.

Please complete work and get it into the main branch in that time, or move this issue to the 2.51 Milestone: https://github.com/apache/beam/milestone/15

riteshghorse commented 1 year ago

Already on latest version 4.23.4. Moving to 2.51.0 Milestone

damccorm commented 1 year ago

We're up to date as of now looking at https://pypi.org/project/protobuf/ and https://github.com/apache/beam/blob/master/sdks/python/setup.py#L300

This might change if protobuf gets a fix in for #28246 - but I'll at least move this blocker forward and we can address that in the separate issue
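
The manual check against the PyPI project page could also be scripted. The sketch below is one possible approach, assuming the PyPI JSON endpoint at `https://pypi.org/pypi/<project>/json` and its `info.version` field; the helper names are hypothetical. Note that `info.version` reports the latest release overall, so after a major version bump it would not report the newest 4.x release.

```python
import json
from urllib.request import urlopen


def latest_version(payload: dict) -> str:
    """Extract the latest version from a parsed PyPI JSON payload."""
    return payload["info"]["version"]


def fetch_latest(project: str) -> str:
    """Query PyPI for a project's latest version (requires network access)."""
    url = f"https://pypi.org/pypi/{project}/json"
    with urlopen(url, timeout=10) as response:
        return latest_version(json.load(response))


if __name__ == "__main__":
    print(fetch_latest("protobuf"))
```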

jrmccluskey commented 10 months ago

It looks like we're up to date with the upper bound at <4.26.0 and the latest release is 4.25.1.

lostluck commented 9 months ago

There's one week until the 2.54.0 cut, and this issue is tagged for that release. If possible and necessary, please complete the work before then, or move this to the 2.55.0 Release Milestone.

Abacn commented 7 months ago

4.25.3 is the latest 4.x release as of the 2.55.0 cut day, and it is <4.26.0. The next release needs to be aware that the upcoming protobuf release bumps the major version, 4.25 -> 5.26, indicating a breaking change.

Note: protobuf currently encodes its version in the minor version number for all language implementations; the major version is reserved for language-specific breaking changes.
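
To make the versioning note concrete, here is a minimal sketch that splits a Python protobuf version into its language-specific major number and the shared release part, following the scheme described in the note. The function names are hypothetical.

```python
def split_protobuf_version(version: str) -> tuple[int, str]:
    """Split "5.26.0" into (language major, shared release part).

    Per the note above, the first component is language specific and the
    remainder tracks the protobuf release shared across languages.
    """
    major, shared = version.split(".", 1)
    return int(major), shared


def is_language_breaking(old: str, new: str) -> bool:
    """True if the language-specific major version changed."""
    return split_protobuf_version(old)[0] != split_protobuf_version(new)[0]


if __name__ == "__main__":
    print(split_protobuf_version("5.26.0"))
    print(is_language_breaking("4.25.3", "5.26.0"))
```

Under this reading, 4.25 -> 5.26 is a single step in the shared release number (25 to 26) combined with a Python-specific breaking change (4 to 5).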

riteshghorse commented 7 months ago

+1. I'm working with @tvalentyn on a workaround for that.

damccorm commented 6 months ago

We still need to wait for google-api-core to release a new version. The latest version of google-api-core depends on protobuf<5, which would result in a dependency conflict if we updated the protobuf version for Beam.

apache-beam[gcp,test] 2.55.0.dev0 depends on protobuf==5.26.0rc3
    google-api-core 2.17.1 depends on protobuf!=3.20.0, !=3.20.1, !=4.21.0, !=4.21.1, !=4.21.2, !=4.21.3, !=4.21.4, !=4.21.5, <5.0.0.dev0 and >=3.19.5

kennknowles commented 4 months ago

Is there any outstanding work on the protobuf major version bump?

liferoad commented 4 months ago

https://pypi.org/pypi/google-api-core/2.19.0/json

"requires_dist": [
      "googleapis-common-protos<2.0.dev0,>=1.56.2",
      "protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0.dev0,>=3.19.5",
      "proto-plus<2.0.0dev,>=1.22.3",
      "google-auth<3.0.dev0,>=2.14.1",
      "requests<3.0.0.dev0,>=2.18.0",
      "grpcio<2.0dev,>=1.33.2; extra == \"grpc\"",
      "grpcio-status<2.0.dev0,>=1.33.2; extra == \"grpc\"",
      "grpcio<2.0dev,>=1.49.1; python_version >= \"3.11\" and extra == \"grpc\"",
      "grpcio-status<2.0.dev0,>=1.49.1; python_version >= \"3.11\" and extra == \"grpc\"",
      "grpcio-gcp<1.0.dev0,>=0.2.2; extra == \"grpcgcp\"",
      "grpcio-gcp<1.0.dev0,>=0.2.2; extra == \"grpcio-gcp\""
    ],

The new version requires protobuf < 5.0.0.dev0
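
The conflict can be reproduced with a small check of a candidate protobuf version against the google-api-core constraint quoted above. This is a simplified sketch, not a resolver: it compares only the leading numeric release segment and ignores rc/dev suffixes, so it does not implement full PEP 440 ordering. All names are hypothetical.

```python
import re

# Bounds and exclusions copied from the requires_dist entry quoted above.
LOWER = "3.19.5"
UPPER = "5.0.0.dev0"
EXCLUDED = ("3.20.0", "3.20.1", "4.21.0", "4.21.1",
            "4.21.2", "4.21.3", "4.21.4", "4.21.5")


def release_tuple(version: str) -> tuple[int, ...]:
    """Leading numeric release segment, e.g. "5.26.0rc3" -> (5, 26, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def satisfies(version: str, lower: str = LOWER, upper: str = UPPER,
              excluded: tuple[str, ...] = EXCLUDED) -> bool:
    """Check lower <= version < upper and version not in the excluded set."""
    candidate = release_tuple(version)
    in_range = release_tuple(lower) <= candidate < release_tuple(upper)
    return in_range and candidate not in {release_tuple(e) for e in excluded}


if __name__ == "__main__":
    print(satisfies("5.26.0rc3"))  # the version Beam's dev build pinned
    print(satisfies("4.25.3"))
```

Here 5.26.0rc3 falls outside the `<5.0.0.dev0` bound, which is why the two requirements cannot be installed together until google-api-core relaxes its upper bound.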

liferoad commented 4 months ago

https://github.com/apache/beam/pull/30556 more work needed in the future.

jrmccluskey commented 3 months ago

https://pypi.org/pypi/google-api-core/2.19.1/json

Looks like the google-api-core package has been updated and now requires protobuf < 6.0.0.dev0. Do we want to prioritize this upgrade for 2.58.0, targeting the same upper bound?

jrmccluskey commented 3 months ago

Followed up: google-api-core 2.19.1 supports protobuf 5, but other GCP dependencies (specifically google-cloud-aiplatform) do not. The workaround for the breaking timestamp change introduced in 5.26.0, explored in #30556, can go in before we formally support protobuf 5.x, but it will need a bit more effort to test as a result. I'll be coming back to work on this after the 2.58 release.

lostluck commented 2 months ago

Do we need to do anything further on this for the 2.59.0 release, or should I push it to 2.60.0?

damccorm commented 2 months ago

It is fine to push to 2.60

Abacn commented 1 month ago

protobuf 4.25.x still receives updates, the latest being 4.25.5 on Sept 18, 2024. Moving to 2.61.0