aspect-build / rules_py

More compatible Bazel rules for running Python tools and building Python projects
Apache License 2.0
73 stars 22 forks source link

[Bug]: sys.path weirdness: local files/dirs shadow third-party packages #175

Open georgevreilly opened 11 months ago

georgevreilly commented 11 months ago

What happened?

I'm migrating a large codebase from rules_python to rules_py. One problem that I've had to address multiple times is that directory names and filenames are now shadowing third-party packages.

When using rules_py, a number of tests fail with a callstack like this:

Traceback (most recent call last):
  File "/mnt/tmpfs/BAZEL_OUTPUT/sandbox/processwrapper-sandbox/33663/execroot/blah_blah/bazel-out/k8-fastbuild/bin/src/python/pytask/foo/bar/quux.runfiles/blah_blah/tools/build_rules/py/private/pytest_loader.py", line 6, in <module>
    import pytest
...
  File "/mnt/tmpfs/BAZEL_OUTPUT/sandbox/processwrapper-sandbox/33663/execroot/blah_blah/bazel-out/k8-fastbuild/bin/src/python/pytask/foo/bar/quux.runfiles/py_deps_main/attrs/attrs/attr/_make.py", line 5, in <module>
    import copy
  File "<frozen zipimport>", line 259, in load_module
  File "/mnt/tmpfs/BAZEL_OUTPUT/execroot/blah_blah/external/python_interpreter/lib/python38.zip/copy.py", line 60, in <module>
  File "//mnt/tmpfs/BAZEL_OUTPUT/sandbox/processwrapper-sandbox/33663/execroot/blah_blah/bazel-out/k8-fastbuild/bin/src/python/pytask/foo/bar/quux.runfiles/blah_blah/tasks/tasks/org.py", line 3, in <module>

If you look at the copy.py source, it's actually trying to do from org.python.core import PyStringMap. This is Jython support; a handled ImportError is raised in CPython. I worked around this by renaming org.py.

sys.path is substantially longer in rules_py. There are 1193 entries for the above test, versus 484 for rules_python. As far as I can tell, the primary difference is that every directory in our source tree that we import from seems to be in sys.path when using rules_py.

Version

Development (host) and target OS/architectures: Linux x86_64, Ubuntu 20.02 Output of bazel --version: bazel 6.2.0 Version of the Aspect rules, or other relevant rules from your WORKSPACE or MODULE.bazel file: 0.3.0 Language(s) and/or frameworks involved: Python, Scala, Java, ...

How to reproduce

No response

Any other information?

No response

alexeagle commented 10 months ago

Thanks for the feedback @georgevreilly ! We've finally found some funding to get started on a 1.0 release for rules_py and hope to reach that around BazelCon. Is your employer interested in sponsoring the work? (If not, excellent contributions like issue reports and verifying the fixes are much appreciated, of course)

alexeagle commented 10 months ago

I think this is a consequence of a fundamental design decision. rules_python puts the name of your repository on the sys.path and therefore your local top-level folders were namespaced by that, and didn't collide with third-parties. However, that was a Bazel-only convention, not how idiomatic Python is written. (it also causes problems with bzlmod where "the name of your repository" depends on the context where the name is accessed, see https://bazelbuild.slack.com/archives/C014RARENH0/p1695268845206649)

Most clients I've seen have a repeated folder structure for their top-level, so something like my-repo/my-repo/path/to/file so that import my-repo.path.to works correctly under Python outside of Bazel. I think you'll have to do a one-time refactor like this.

If you decide to do this, note that one option is to use git filter-repo to "relocate" your commits under that folder (see https://blog.aspect.dev/otherrepo-to-monorepo) so that you don't pollute the blame layer with a giant file move.