bazelbuild / rules_python

Bazel Python Rules
https://rules-python.readthedocs.io
Apache License 2.0
531 stars 541 forks source link

[gazelle] Support relative python imports #2203

Open dougthor42 opened 2 months ago

dougthor42 commented 2 months ago

🚀 feature request

Relevant Rules

Description

Sometimes a python code base uses relative imports:

from .baz import Baz
from .same_package_other_module import cat_in_the_hat
from ..one_level_up import thing1
from ...two_levels_up import thing2
from .. import moduleA

While I don't necessarily agree with this style, it's something that Python supports (see PEP 328) and thus I believe that Gazelle should support it too.

Right now, Gazelle simply ignores these imports. Here's an example:

example/
    __init__.py
    a.py
    b.py
    BUILD.bazel
# example/a.py
from . import b
del b
# example/b.py
pass
# example/BUILD.bazel
# as generated by Gazelle with "file" generation mode
load("@rules_python//python:defs.bzl", "py_library")

py_library(
    name = "a",
    srcs = ["a.py"],
    visibility = ["//:__subpackages__"],
)

py_library(
    name = "b",
    srcs = ["b.py"],
    visibility = ["//:__subpackages__"],
)

If we change a.py to use an absolute import:

# example/a.py
from example import b
del b

Gazelle will add the dep:

# example/BUILD.bazel
# as generated by Gazelle with "file" generation mode
load("@rules_python//python:defs.bzl", "py_library")

py_library(
    name = "a",
    srcs = ["a.py"],
    visibility = ["//:__subpackages__"],
    deps = [":b"],
)

...

Describe the solution you'd like

I'm somewhat familiar with the Gazelle code base. I believe it will be pretty easy (for the most part - see Issues below) below) to generate the full target path for the current file being processed, and then adjust the target based on how many dots . are in the relative import.

Because of the Issues, I'd imagine that this would be an experimental, opt-in feature for a while. It would be guarded by a directive:

# gazelle:experimental_allow_relative_imports true

Here are some examples of the logic, assuming a slightly more complex dir structure:

toplevel/
    __init__.py
    MODULE.bazel
    BUILD.bazel
    foo/
        __init__.py
        BUILD.bazel/
        bar.py
    baz.py
    example/
        __init__.py
        a.py
        b.py
        BUILD.bazel
# relative import of another module "b.py" in same package "example"
# from . import b
read "from . import b" in file "example/a.py"
what is our current target? "//example:a"
We have 1 dot, so trim off 1 component ":a". target_stem = "//example"
Add the import "b". dep target name = "//example:b"
# relative import of a different packages's module
# from ..foo import bar
read "from ..foo import bar" in file "example/a.py"
current target? "//example:a"
We have 2 dots, so trim off 2 components "example:a". target_stem = "//"
Add the name from the dots and from the import. dep_target_name = "//foo:bar
# relative import of the parent packages's module "baz"
# from .. import baz
read "from .. import baz" in file "example/a.py"
current target? "//example:a"
We have 2 dots, so trim off 2 components "example:a". target_stem = "//"
There's no name after the dots, so the thing we're importing is a module (probably).
dep_target_name = "//:baz"

The above examples assume file-level generation. More thought will be needed to support package-level generation.

Issues

It's difficult to know if from ..foo import bar should be //:foo or //foo:bar. Is bar a class/function? Or is bar another module? In the example above, it's a python module, but that's not always the case.

In the from .. import baz case, it's possible that baz is an identifier defined in the parent package's __init__.py, and thus the dep target would be //:__init__.

Describe alternatives you've considered

I tried convincing the developers to use absolute imports, but they just weren't having it :rofl:

So I also tried using the "package" generation mode, but the issue was that unit test files also used relative imports and would not include the dep.

The current workaround is to add # gazelle:include_dep //example:b annotations, but those are prone to diverging from the actual code and can be tedious to write for large projects.

aignas commented 2 months ago

Yeah, the issues you've mentioned is a real pain. That said, I think gazelle should be able to parse the python files for symbols and have an internal mapping somewhere of what symbols are in which targets.

To have something that works and is still experimental, you could implement it in stages:

  1. Allow using relative imports in the BUILD.bazel file in the gazelle: resolve directive - this may be a good way to test things manually, or have an escape hatch when the auto-detection does not work.
  2. Implement the relative import handling without handling the __init__ targets.
  3. If you went with the package generation mode, at least you would not need to handle the __init__ targets and then the semantics could be easier as you might get away without supporting the __init__ like targets.
  4. Somehow attempt to first parse the python files within the current workspace to understand what symbols are there - you only need to get the def and class and global variables. How hard could that be... famous last words...

What other things would we need to have around this issue?