tensorflow / addons

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons
Apache License 2.0
1.69k stars 611 forks source link

[WIP] Build against tf2.14 bazel strategy #2855

Closed seanpmorgan closed 1 year ago

seanpmorgan commented 1 year ago

Description

bhack commented 1 year ago

We have the same issue when I've introduced this in the release PR. It seems that pytest is not installed in the same python env where we are executing tests with bazel.

trevor-m commented 1 year ago

Fixes #2849

seanpmorgan commented 1 year ago

@angerson we (and other SIGs / downstream consumers) have had issues with the the hermetic python bazel rule. Can you or someone in SIG build take a look at this failing test when time allows:

  1. Our workspace install_deps call does not properly install the 3 requirements files that we have specified, since we error on ModuleNotFoundError: No module named 'pytest'
  2. We need to run a py_binary() for configure.py using the hermetic python after the install deps are available instead of using the system python install as is down in the test above.

We tried a few things on the last release... but we want to get ahead of this for upcoming releases when the new workspace strategy may be required (I see that we're still publishing multiple py docker images for future tf releases though).

georgiyekkert commented 1 year ago

@seanpmorgan Why do you pull tensorflow via http_archive and also via pip? You started doing it not so long ago - https://github.com/tensorflow/addons/commit/1c3a2b2b0c5ca902e056427ec943e326cf6b70c1 Looking at the code, you should install it only via pip.

seanpmorgan commented 1 year ago

@seanpmorgan Why do you pull tensorflow via http_archive and also via pip? You started doing it not so long ago - 1c3a2b2 Looking at the code, you should install it only via pip.

https://github.com/tensorflow/addons/blob/master/WORKSPACE#L18-L32

Because we load the 4 bazel workspace files that are provided since we compile custom CPU and CUDA ops. IIRC when on TF2.9 there was rev in CUDA version and our previous local toolchain broke. We started loading the workspace files from tensorflow to build the same way.

georgiyekkert commented 1 year ago

@seanpmorgan Why do you pull tensorflow via http_archive and also via pip? You started doing it not so long ago - 1c3a2b2 Looking at the code, you should install it only via pip.

https://github.com/tensorflow/addons/blob/master/WORKSPACE#L18-L32

Because we load the 4 bazel workspace files that are provided since we compile custom CPU and CUDA ops. IIRC when on TF2.9 there was rev in CUDA version and our previous local toolchain broke. We started loading the workspace files from tensorflow to build the same way.

There are 3 ways to fix it:

  1. Remove pulling tensorflow via http_archive - you are bringing a lot more than you need when you load all TF dependencies. Load only what you need. IMO this is the correct way
  2. Since EOL is planned in May 2024, we can patch tensorflow, so it keeps using non-hermetic Python. You can use the same patch as TF-federated - https://github.com/tensorflow/federated/commit/8580528acd9227ff1fe84c9ab95e6a483f2b5dfe#diff-4d440744ea1e145db1f807bb811e815b7e0526db8cc2dfdbba441ef80ceb9e81. This is the easiest way
  3. Use the hermetic python in your build. Currently you have errors like "ModuleNotFoundError: No module named 'pytest'" because you are not loading "pip" deps correctly, see https://rules-python.readthedocs.io/en/latest/pypi-dependencies.html#using-third-party-packages-as-dependencies. Each "pip" dependency must be specified explicitly in the BUILD file where needed. E.g. in tensorflow https://github.com/tensorflow/tensorflow/blob/a17ef94e5144fb0fe8447d7ae2e1b32f367a584c/tensorflow/python/distribute/failure_handling/BUILD#L50 https://github.com/tensorflow/tensorflow/blob/a17ef94e5144fb0fe8447d7ae2e1b32f367a584c/tensorflow/python/distribute/failure_handling/failure_handling_util.py#L19. Since you are pulling TF via both http_archive and pip, making configure.py work the same way as it is now might be tricky. Also https://github.com/bazelbuild/rules_python/blob/0.26.0/docs/pip.md#pip_parse requires a locked/compiled requirements file
bhack commented 1 year ago

@georgiyekkert Can you send a PR on this repo directly or over this PR's source branch? Thanks

seanpmorgan commented 1 year ago

https://github.com/tensorflow/addons/pull/2856