pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0
3.33k stars 638 forks source link

"Visibility" support (akin to Bazel's) #13393

Closed joshua-cannon-techlabs closed 2 years ago

joshua-cannon-techlabs commented 3 years ago

Is your feature request related to a problem? Please describe.

In a monorepo you will like have N services (likely containing one-or-more underlying packages) as well as M "common" packages. Logically speaking, any piece of code should be able to dip it's hand into at most 2 top-level packages: it's own and the common code. Having code from service X depend on code from service Y would be a mistake and shouldn't be mechanically allowed.

Describe the solution you'd like Some way of specifying visibility in a BUILD file (with the default as "public" to maintain backwards-compatibility).

Initial implementation can be scoped to only supporting 2 values: "public" and "package".

For simplicity, if visibility was inherited from the nearest-declared-ancestor-dir that would limit the amount of visibility declarations to N BUILD files (one for each service, the M common packages can remain with the public default).

Additionally, this could be hidden behind a plugin to make it truly opt-in.

Describe alternatives you've considered A plugin which enforces this at test/lint/check-time instead of visibility declarations in BUILD files.

Additional context N/A

asherf commented 3 years ago

+1 for this.

yoav-orca commented 3 years ago

This would be a great value for our codebase well! It's something that's easy to introduce by mistake and wastes a lot of time when not handled correctly

tdimiduk commented 2 years ago

One thing that occurs to me would be to use something like a .gitignore model. Might as well re-use the syntax since there are already tools to parse it.

I think it would probably work fairly well to use a similar model as .gitignore that each file describes what a directory and it's subdirectories are allowed to depend on. There could be value to a .pants_allow or a .pants_disallow since in different cases allow or block lists would be more convenient. There might be a good way to allow both and just error if there are inconsistencies.

This would not be as fine grained as something in the build files, but could work pretty well without needing to define much new syntax

stuhood commented 2 years ago

This also relates to some of the proposals in #13621.

yoav-orca commented 2 years ago

I think this is a really important feature not only for governance but for stability. An example from today - we had an issue where a developer introduced a python dependency on another microservice in a code path that wasn't covered in CI. These services are not deployed together, which blew up of course. Visibility rules would have easily prevented this issue.

stuhood commented 2 years ago

Visibility rules would have easily prevented this issue.

So the idea would be that ~every target in every service would declare that it was only visible to that service?

Finding a way to do this without a tremendous amount of boilerplate will be a challenge. Maybe related to #13767.

benjyw commented 2 years ago

I think this is a really important feature not only for governance but for stability. An example from today - we had an issue where a developer introduced a python dependency on another microservice in a code path that wasn't covered in CI. These services are not deployed together, which blew up of course. Visibility rules would have easily prevented this issue.

Out of curiosity, why didn't dep inference pull in the dependency at the code level?

yoav-orca commented 2 years ago

We can't use pants to package and even if we did the service doesn't have the permissions that the other service has.

benjyw commented 2 years ago

We can't use pants to package and even if we did the service doesn't have the permissions that the other service has.

So how would this feature have helped? If Pants isn't doing the packaging then it can't enforce deps at packaging time. I guess you'd want Pants to check dependency hygiene when you run "lint" or something like that?

joshua-cannon-techlabs commented 2 years ago

Probably when tests were run, right? The test process would die on an import error, conceivably.

benjyw commented 2 years ago

Yeah that makes sense

joshua-cannon-techlabs commented 2 years ago

On second thought, it might be good to not have to get that far before seeing the error. There might be cases where the code isn't tested, but conceivably some pants operation would run on it where the dep resolution would error. So I presume the error should be raised in dep resolution, then it's on the user to ensure Pants tries to resolve the deps one way or another.

Eric-Arellano commented 2 years ago

So I presume the error should be raised in dep resolution, then it's on the user to ensure Pants tries to resolve the deps one way or another.

Agreed. I think this would happen when calculating dependencies. Which happens when doing things like check (MyPy), test, and package. In CI, we could also tell people to run ./pants dependencies :: I suppose to validate?

I doubt we want to "always no matter what" run the check. ./pants --version or ./pants --changed-since=HEAD fmt should avoid the cost of considering dependencies.

joshua-cannon-techlabs commented 2 years ago

Another data point, dependency inference is where the "unowned dependency error" is raised. https://github.com/pantsbuild/pants/blob/6ccc196937d60208a8dbb626d838938955598786/src/python/pants/backend/python/dependency_inference/rules.py#L211

paiforsyth commented 2 years ago

Here is a tool that implements something like this for Python projects https://import-linter.readthedocs.io/en/stable/index.html. Could be useful to consider the types of import contracts it supports.

benjyw commented 2 years ago

The contracts concept is useful!

thejcannon commented 2 years ago

This makes me think. How would people feel about this being a lint plugin :thinking:

(Thinking about it more, "unowned dependency error" could've been a lint plugin as well. It was never discussed)

stuhood commented 2 years ago

Hey folks: #15373 added a ValidateDependenciesRequest @union (which will be in 2.12.x), which can be used on a target-by-target basis to validate dependencies. Someone interested in visibility could likely use that @union and a plugin Field to experiment with an API. Unlike a lint plugin (which would also work, just at a different point in the build), the validation would be applied whenever dependencies are computed.

Eric-Arellano commented 2 years ago

the validation would be applied whenever dependencies are computed.

Meaning ./pants dependencies :: will always trigger it, and often ./pants lint :: and usually ./pants check test package :: will.

cognifloyd commented 2 years ago

visibility can be thought of as a dependees assertion.

What would you call a dependencies assertion? (restrict which modules can be imported by a target and its children targets) Using import-linter as inspiration, maybe that would be called a dependencies_contract?

Eric-Arellano commented 2 years ago

visibility can be thought of as a dependees assertion.

What would you call a dependencies assertion? (restrict which modules can be imported by a target and its children targets)

I think you're right that these are different things.

From what I understand, most have been asking for the visibility / dependees assertion. What is your motivation for the dependencies assertion?

cognifloyd commented 2 years ago

I have a list of 9 python modules (top-level modules I should say, there are actually quite a few other python modules in the monorepo) that are closely related. One of them is a client module (an api client library) that gets distributed via pypi for 3rd party use, and it can also be used by anything in the code base. But, it must be very lightweight in terms of dependencies, and it must not import any of the other modules (the server-side code). So, I need a way to make sure that that module is a self-contained package (except for a few 3rd party packages).

With visibility I would have to go backwards and make sure that every other module in the monorepo is <private> with exceptions for various other modules, but never the client lib. Essentially, that is spreading the boilerplate all over the place such that someone is very likely to not include it everywhere. So, a dependencies focused assertion would be more useful in this case than a dependees assertion like visibility.

kaos commented 2 years ago

Actually, I think the __defaults__ PR could be for you here @cognifloyd #15836 (if you have multiple targets that all should have this limited view of your repo, being lightweight)

Example:

# src/client/lightweight/BUILD

# Explicitly exclude any dependencies from these sources..
__defaults__(all=dict(dependencies=["!src/lib::", "!src/common/stuff::", ...]))

# The normal target stuff...

If it is just one target, then that dependencies ejection could simply be added directly to that target.

I think this ought to work.

But then again, we may need to improve any error handling here, when the inference may add a dependency that have been explicitly excluded, we should say so rather than just let things fail later. (If that's the case, I've not tried this out)

benjyw commented 2 years ago

I think different repos are going to want very different ways of expression allowed/disallowed dependencies. Some will be on the dependency side, some on the dependee side, some will allowlist, some will blocklist, etc.

So I think it's best to implement this at two levels:

1) A base API that exposes a very simple "is this dep allowed" pluggable rule and enforces it.

2) A higher-level feature built on top of 1) , in which we make some opinionated choices about how to decide whether a dep is allowed. E.g., this is the level where we say that "visibility" is the mechanism.

That way anyone with custom needs that don't align with the choices and restrictions imposed by 2) can drop down to 1), write their own in-repo rule and plug it in.

kaos commented 2 years ago

A base API that exposes a very simple "is this dep allowed" pluggable rule and enforces it.

For the base API on the dependencies side there is the ValidateDependenciesRequest: https://github.com/pantsbuild/pants/blob/04cc7b4bb31066f5020b5d2fe9ff623ddc63dd2d/src/python/pants/engine/internals/graph.py#L1196-L1205

which is the enabler for the visibility implementation taking shape in #15803, which then would be bullet 2 above.

For the dependee side I think that using dep excludes !!some:tgt may go a long way, esp if we support any specs and not just literal specs for those, paired with __defaults__ you are pretty close to the inversed direction of the visibility PR. (and possibly some additional checks to see if inferred deps have been explicitly excluded and log accordingly)

# visibility
__defaults__(all=dict(visibility=["src/lib/a::", "src/lib/b::"]))

# vs dep excludes
__defaults__(all=dict(dependencies=["!!src/common::"]))

N.b. I may have mixed up the dependee vs dependencies side above, it's not super clear to me which term refers to which side.

Also, the visibility check is implemented slightly backwards, as it goes from a target, and then checks if it is allowed according to each dependency's visibility rules.

benjyw commented 2 years ago

Got it. so ValidateDependenciesRequest is pluggable? Seems very sensible to me.

kaos commented 2 years ago

so ValidateDependenciesRequest is pluggable?

Yah, correct: https://github.com/pantsbuild/pants/blob/04cc7b4bb31066f5020b5d2fe9ff623ddc63dd2d/src/python/pants/engine/target.py#L2612-L2628

kaos commented 2 years ago

Oh, I got to make a shout out to ./pants help here! :D

$ ./pants help ValidateDependenciesRequest

`pants.engine.target.ValidateDependenciesRequest` api type
----------------------------------------------------------

A request to validate dependencies after they have been computed.

An implementing rule should raise an exception if dependencies are invalid.

activated by : pants.engine.target
union members: ExplorerValidateDependenciesRequest
               PythonValidateDependenciesRequest

Include API types and rules dependency information by running `./pants help-advanced pants.engine.target.ValidateDependenciesRequest`.
benjyw commented 2 years ago

Absolutely epic work from @kaos here! We've finally closed one of our most requested issues!! Now just need docs and a blog post.