bazelbuild / rules_python

Bazel Python Rules
https://rules-python.readthedocs.io
Apache License 2.0
523 stars 539 forks source link

py_common API #1647

Open UebelAndre opened 2 years ago

UebelAndre commented 2 years ago

Description of the feature request:

Similar to cc_common is there a py_common available?

What underlying problem are you trying to solve with this feature?

I would like to have some rules which generate additional actions but would otherwise be interchangeable with a py_library. I've done similar things before using the cc_common API but cannot find one for python.

Which operating system are you running Bazel on?

Any

What is the output of bazel info release?

release 5.1.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

comius commented 2 years ago

cc @rickeylev

py_common is not available at the moment. We're doing rewrite on Python rules to Starlark, starting with the internal ones and then the ones native in Bazel. Implementing py_common now would interfere with the process, however I see no problem distilling it from Starlark implementation when it's finished.

As such I marked this as P4, but will upgrade it to P2 once Starlarkification is done.

UebelAndre commented 2 years ago

@comius is there a tracking issue for porting the python rules to starlark?

UebelAndre commented 2 years ago

@comius @rickeylev friendly ping. Is there a tracking issue for the port to starlark?

rickeylev commented 2 years ago

Sorry, didn't see this.

No, I don't think there's a GitHub issue tracking it, yet. I'd create one now but am on vacation for a bit. Feel free to create one and assign it to me, though.

Could you tell more about the problem you're trying to solve, and features you want? Internally, we have 3-5+ cases that could be described as "create a drop in replacement for py library", but each has some small differences that need different "plugin" points. So I'm curious to know which part of rule behavior your logic needs to intercept/customize/keep/drop

On Mon, Jul 4, 2022, 10:21 AM UebelAndre @.***> wrote:

@comius https://github.com/comius @rickeylev https://github.com/rickeylev friendly ping. Is there a tracking issue for the port to starlark?

— Reply to this email directly, view it on GitHub https://github.com/bazelbuild/rules_python/issues/1647, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIEXQ6SZVUZOCCUFVSAFYLTVSMMSFANCNFSM5WSSU56Q . You are receiving this because you were mentioned.Message ID: @.***>

UebelAndre commented 2 years ago

Could you tell more about the problem you're trying to solve, and features you want? Internally, we have 3-5+ cases that could be described as "create a drop in replacement for py library", but each has some small differences that need different "plugin" points. So I'm curious to know which part of rule behavior your logic needs to intercept/customize/keep/drop

I think the case that would be most impactful to me is being able to take my cc_binary-like rule and update it to provide an interface to python targets I have without needing to make a tree of macros as some of the attributes are hard to manage when select is used for some attributes. I'd come across the desire to do this a couple of times before writing this issue. In most cases I have a web of macros that could easily be condensed to a single rule.

@rickeylev would you be able to enumerate the cases you're aware of?

rickeylev commented 2 years ago

fyi, i created https://github.com/bazelbuild/bazel/issues/15897 as a tracking issue for the starlark rewrite

rickeylev commented 1 year ago

A quick brain dump of some known points of customization:

rickeylev commented 1 year ago

From Phillip Shrader over slack, the sort of customization they do at Aurora:

Our current approach is: bazel-generated wrapper calls into a shell script we wrote that sets up some hermeticity things and eventually calls path/to/python3 -SsB path/to/hermetic_python3.py actual/file/of/interest.py . And then path/to/hermetic_python3.py does a few things like clearing sys.path[0] and a few other hermeticity things and then eventually delegates to actual/file/of/interest.py . So another potential interest of mine is to maybe combine the bazel generated wrapper with our shell script to reduce the number of steps involved in setting up all the hermetic stuff.

github-actions[bot] commented 9 months ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

UebelAndre commented 9 months ago

Cc @rickeylev this is still desired but maybe the issue can be moved to rules_puthon? Now that Bazel 7 is out and we can use the starlark rules there?

UebelAndre commented 5 months ago

@rickeylev do you know if this would be something that's easy to do? I still find myself wanting to make custom rules that build python binaries. Having an interface for that would be fantastic

UebelAndre commented 2 months ago

@rickeylev would it be acceptable to have an experimental flag that gates the use of some py_common struct that would contain a single function, py_executable_base_impl, for now?

https://github.com/bazelbuild/rules_python/blob/084b877c98b580839ceab2b071b02fc6768f3de6/python/private/common/py_executable.bzl#L140-L156

This would cover my primary use for py_common which is simply writing Bazel rules that equate to py_binary or py_test.

The only changes to py_executable_base_impl (or whatever would be a part of the interface) is that ctx.attrs and ctx.files are pulled in as a separate parameters so other rules have the chance to address missing fields without actually needing to add attributes to the rule. This approach would have some splash damage where access directly from ctx would need to be updated but the changes should be otherwise pretty straight forward.

UebelAndre commented 2 months ago

@aignas do you have any thoughts here?

arrdem commented 2 months ago

This is to some degree a dupe of my https://github.com/bazelbuild/rules_python/issues/1860; interested in whatever the outcome here is.

UebelAndre commented 1 month ago

I would be willing to help drive some MVP but I'm concerned about potentially wasting time without prior design sign-off from the maintainers.

aignas commented 1 month ago

Playaing devil's advocate here - why would we have to support this API when bazel is working on subrule support? ~Is it not possible to use the py_binary and friends together with the subrule?~

Aside from that, I am personally not super familiar with that part of the code, so cannot comment on implications or the maintenance burden in the long run. The request in general looks reasonable, but I would love to not expose anything that is not needed and the semantics arg should not be exposed, IMHO.

IMHO using subrule as described in https://docs.google.com/document/d/1RbNC88QieKvBEwir7iV5zZU08AaMlOzxhVkPnmKDedQ/edit#heading=h.9v5sndhcy9u and decomposing the py_binary into parts would be ideal.

UebelAndre commented 1 month ago

Playaing devil's advocate here - why would we have to support this API when bazel is working on subrule support? ~Is it not possible to use the py_binary and friends together with the subrule?~

If subrule (which I've not heard about until just now) exists then I can probably do what I want which is to write a rule which creates a py_binary or py_test and there wouldn't be a need for this interface. That being said though, what is time timeline for subrule? Adding py_common would be ideal if there was an unbounded timeline. This is behavior that I and others I know have wanted for a while and the lack of it had lead to very clunky/error-prone macros that ought to be rules.

Aside from that, I am personally not super familiar with that part of the code, so cannot comment on implications or the maintenance burden in the long run. The request in general looks reasonable, but I would love to not expose anything that is not needed and the semantics arg should not be exposed, IMHO.

Do you know who would have this context and could you ping them on this issue? Again, I ultimately don't care what form the implementation takes as long as the result is some publicly accessible API for creating python binaries in custom rules (and is available within the span of a month or two).

rickeylev commented 1 month ago

I'm +1 on creating some sort of py_common API. It might be easier to discuss the API of it given some more concrete use cases.

subrule timeline

I think it's supposed to be part of Bazel 8? Not sure on it wrt Bazel 7.

exposing semantics

Semantics shouldn't be exposed. It's purely a mechanism to make it somewhat easier for Google to internally patch the code. It's also not a very stable API.

The def py_executable_base_impl(ctx, *, semantics, is_test, inherited_environment = []): signature

This signature is pretty internal, so exposing it as-is would be a bad idea. Exposing something without the semantics, is_test, and inherited_environment args seems pretty reasonable, though. For reference:

re: API considerations:

The main consideration I have is: I don't want otherwise internal refactorings or non-breaking feature additions to be constrained because of a lower-level of API access. Some examples being:

What this loosely translates to is: exposing py_executable_base_impl isn't sufficient. What has to be exposed is something like create_executable_rule(), so that all the args passed to rule() can be properly setup.

If there aren't any modifications of the existing attributes/rule-args desired, then exposing something like create_executable_rule() should be mostly straight forward. Maybe the implementation function you pass is called with (ctx, struct_of_stuff) ?

If there are modifications to the existing rule-args, then, well, that might be a bit more messy, but I'm thinking a builder-pattern esq dict-of-args might work OK.

If you look around python/private/common, you can see a broke out various things in an attempt to be composable, and even the tests in tests/base_rules are somewhat setup in a "conformance test" type of way. But, I never got around to refactors to replace semantics and implement pytype type checking rules using a py_common type of approach.

UebelAndre commented 1 month ago

I would be happy to try to factor out an interface so consumers of rules_python can also create executables. To provide a concrete example I've created https://github.com/UebelAndre/rules_pytest which has been a tricky set of rules to get right due to the lack of a py_common interface that lets me create a python executable. I'm hoping to do something like the following to convert the series of rules and macro into one single rule which does all the necessary configuration for pytest.

https://github.com/UebelAndre/rules_pytest/compare/main...py_common

The mock interface I was playing with looks like

py_exec_info = py_common.create_executable(
        ctx = ctx,
        toolchain = py_toolchain,
        main = ctx.file._main,
        legacy_create_init = False,
        deps = [pytest_toolchain.pytest] + ctx.attr.deps,
        is_test = True,
    )

where the function signature might look something like

def py_common_create_executable(
    ctx,
    toolchain,
    main,
    legacy_create_init,
    deps,
    is_test,
):
    """Create a python executable

    Args:
        ctx (ctx): The current rule's context object
        toolchain (py_toolchain): The python toolchain
        main (File): The source file to use as the entry point.
        legacy_create_init (bool): Whether to implicitly create empty `__init__.py` files/
        deps (list): A list of python dependencies.
        is_test (bool): Indicates if the executable is a test executable.

    Returns:
        struct: A struct containiging:
            - executable: The executable created.
            - runfiles: All associated runfiles for the executable.
    """

Would you say this is in the ballpark of possibilities? What changes would you make? I think if we can figure out the interface then the internals can change quickly and without impacting users.

rickeylev commented 1 week ago

Small update in this area:

We have a design for exposing analysis-time APIs. #2252 is the first installment of this to add a merge_py_infos() function.

The loading-phase side of things isn't addressed yet, though. i.e. how to define a rule with necessary attributes (toolchains, fragments, implicit attributes, exec groups -- all the stuff that rule() needs in order for later ctx.whatever to work). Some of this is mitigated by the analysis-time API design I mention above, but not all.

The basic idea I'm thinking for this is a function that returns a mutable dict of the args that would be passed to rule(). This allows the caller to modify them any way they please, while allowing them to automatically get any "defaults" that a py rule requires. (This idea is inspired by how Bazel internally defined rules in Java: a base Builder class would define the rule, then sub-classes would call various override/add/remove functions to modify it before building it into a rule definition.)

So, something like:

load(":pyrule.bzl", "pyrule")

def create_my_py_library():
  kwargs = pyrule.library_rule_kwargs()
  kwargs["attrs"]["srcs"]["aspects"].append(my_aspect)
  kwargs["toolchains"].append("//extra:toolchain")
  kwargs["attrs"]["deps"]["default"] = "//some:target"
  kwargs["provides"].append(MyInfo)
  kwargs["cfg"] = wrap_with_my_transition(kwargs.get("cfg"))
  kwargs["attrs"]["imports"]["mandatory"] = True
  kwargs["exec_groups"]["py_precompile"]["exec_compatible_with"] = ["@platforms//:linux"]

  # convert the mutable attr dicts to attr objects, etc
  kwargs = py_rule.finalize(kwargs)
  return rule(
    implementation = my_impl,
    **py_rule.finalize(kwargs)
  )

my_py_library = create_my_py_library()