Closed joshua-cannon-techlabs closed 2 years ago
+1 for this.
This would be a great value for our codebase well! It's something that's easy to introduce by mistake and wastes a lot of time when not handled correctly
One thing that occurs to me would be to use something like a .gitignore
model. Might as well re-use the syntax since there are already tools to parse it.
I think it would probably work fairly well to use a similar model as .gitignore
that each file describes what a directory and it's subdirectories are allowed to depend on. There could be value to a .pants_allow
or a .pants_disallow
since in different cases allow or block lists would be more convenient. There might be a good way to allow both and just error if there are inconsistencies.
This would not be as fine grained as something in the build files, but could work pretty well without needing to define much new syntax
This also relates to some of the proposals in #13621.
I think this is a really important feature not only for governance but for stability. An example from today - we had an issue where a developer introduced a python dependency on another microservice in a code path that wasn't covered in CI. These services are not deployed together, which blew up of course. Visibility rules would have easily prevented this issue.
Visibility rules would have easily prevented this issue.
So the idea would be that ~every target in every service would declare that it was only visible to that service?
Finding a way to do this without a tremendous amount of boilerplate will be a challenge. Maybe related to #13767.
I think this is a really important feature not only for governance but for stability. An example from today - we had an issue where a developer introduced a python dependency on another microservice in a code path that wasn't covered in CI. These services are not deployed together, which blew up of course. Visibility rules would have easily prevented this issue.
Out of curiosity, why didn't dep inference pull in the dependency at the code level?
We can't use pants to package and even if we did the service doesn't have the permissions that the other service has.
We can't use pants to package and even if we did the service doesn't have the permissions that the other service has.
So how would this feature have helped? If Pants isn't doing the packaging then it can't enforce deps at packaging time. I guess you'd want Pants to check dependency hygiene when you run "lint" or something like that?
Probably when tests were run, right? The test process would die on an import error, conceivably.
Yeah that makes sense
On second thought, it might be good to not have to get that far before seeing the error. There might be cases where the code isn't tested, but conceivably some pants operation would run on it where the dep resolution would error. So I presume the error should be raised in dep resolution, then it's on the user to ensure Pants tries to resolve the deps one way or another.
So I presume the error should be raised in dep resolution, then it's on the user to ensure Pants tries to resolve the deps one way or another.
Agreed. I think this would happen when calculating dependencies
. Which happens when doing things like check
(MyPy), test
, and package
. In CI, we could also tell people to run ./pants dependencies ::
I suppose to validate?
I doubt we want to "always no matter what" run the check. ./pants --version
or ./pants --changed-since=HEAD fmt
should avoid the cost of considering dependencies.
Another data point, dependency inference is where the "unowned dependency error" is raised. https://github.com/pantsbuild/pants/blob/6ccc196937d60208a8dbb626d838938955598786/src/python/pants/backend/python/dependency_inference/rules.py#L211
Here is a tool that implements something like this for Python projects https://import-linter.readthedocs.io/en/stable/index.html. Could be useful to consider the types of import contracts it supports.
The contracts concept is useful!
This makes me think. How would people feel about this being a lint
plugin :thinking:
(Thinking about it more, "unowned dependency error" could've been a lint
plugin as well. It was never discussed)
Hey folks: #15373 added a ValidateDependenciesRequest
@union
(which will be in 2.12.x
), which can be used on a target-by-target basis to validate dependencies. Someone interested in visibility could likely use that @union
and a plugin Field
to experiment with an API. Unlike a lint
plugin (which would also work, just at a different point in the build), the validation would be applied whenever dependencies are computed.
the validation would be applied whenever dependencies are computed.
Meaning ./pants dependencies ::
will always trigger it, and often ./pants lint ::
and usually ./pants check test package ::
will.
visibility
can be thought of as a dependees
assertion.
What would you call a dependencies
assertion? (restrict which modules can be imported by a target and its children targets)
Using import-linter as inspiration, maybe that would be called a dependencies_contract
?
visibility can be thought of as a dependees assertion.
What would you call a dependencies assertion? (restrict which modules can be imported by a target and its children targets)
I think you're right that these are different things.
From what I understand, most have been asking for the visibility
/ dependees assertion. What is your motivation for the dependencies
assertion?
I have a list of 9 python modules (top-level modules I should say, there are actually quite a few other python modules in the monorepo) that are closely related. One of them is a client module (an api client library) that gets distributed via pypi for 3rd party use, and it can also be used by anything in the code base. But, it must be very lightweight in terms of dependencies, and it must not import any of the other modules (the server-side code). So, I need a way to make sure that that module is a self-contained package (except for a few 3rd party packages).
With visibility
I would have to go backwards and make sure that every other module in the monorepo is <private>
with exceptions for various other modules, but never the client lib. Essentially, that is spreading the boilerplate all over the place such that someone is very likely to not include it everywhere. So, a dependencies
focused assertion would be more useful in this case than a dependees
assertion like visibility
.
Actually, I think the __defaults__
PR could be for you here @cognifloyd #15836 (if you have multiple targets that all should have this limited view of your repo, being lightweight)
Example:
# src/client/lightweight/BUILD
# Explicitly exclude any dependencies from these sources..
__defaults__(all=dict(dependencies=["!src/lib::", "!src/common/stuff::", ...]))
# The normal target stuff...
If it is just one target, then that dependencies ejection could simply be added directly to that target.
I think this ought to work.
But then again, we may need to improve any error handling here, when the inference may add a dependency that have been explicitly excluded, we should say so rather than just let things fail later. (If that's the case, I've not tried this out)
I think different repos are going to want very different ways of expression allowed/disallowed dependencies. Some will be on the dependency side, some on the dependee side, some will allowlist, some will blocklist, etc.
So I think it's best to implement this at two levels:
1) A base API that exposes a very simple "is this dep allowed" pluggable rule and enforces it.
2) A higher-level feature built on top of 1) , in which we make some opinionated choices about how to decide whether a dep is allowed. E.g., this is the level where we say that "visibility" is the mechanism.
That way anyone with custom needs that don't align with the choices and restrictions imposed by 2) can drop down to 1), write their own in-repo rule and plug it in.
A base API that exposes a very simple "is this dep allowed" pluggable rule and enforces it.
For the base API on the dependencies
side there is the ValidateDependenciesRequest
:
https://github.com/pantsbuild/pants/blob/04cc7b4bb31066f5020b5d2fe9ff623ddc63dd2d/src/python/pants/engine/internals/graph.py#L1196-L1205
which is the enabler for the visibility implementation taking shape in #15803, which then would be bullet 2 above.
For the dependee
side I think that using dep excludes !!some:tgt
may go a long way, esp if we support any specs and not just literal specs for those, paired with __defaults__
you are pretty close to the inversed direction of the visibility PR. (and possibly some additional checks to see if inferred deps have been explicitly excluded and log accordingly)
# visibility
__defaults__(all=dict(visibility=["src/lib/a::", "src/lib/b::"]))
# vs dep excludes
__defaults__(all=dict(dependencies=["!!src/common::"]))
N.b. I may have mixed up the dependee
vs dependencies
side above, it's not super clear to me which term refers to which side.
Also, the visibility check is implemented slightly backwards, as it goes from a target, and then checks if it is allowed according to each dependency's visibility rules.
Got it. so ValidateDependenciesRequest is pluggable? Seems very sensible to me.
so
ValidateDependenciesRequest
is pluggable?
Oh, I got to make a shout out to ./pants help
here! :D
$ ./pants help ValidateDependenciesRequest
`pants.engine.target.ValidateDependenciesRequest` api type
----------------------------------------------------------
A request to validate dependencies after they have been computed.
An implementing rule should raise an exception if dependencies are invalid.
activated by : pants.engine.target
union members: ExplorerValidateDependenciesRequest
PythonValidateDependenciesRequest
Include API types and rules dependency information by running `./pants help-advanced pants.engine.target.ValidateDependenciesRequest`.
Absolutely epic work from @kaos here! We've finally closed one of our most requested issues!! Now just need docs and a blog post.
Is your feature request related to a problem? Please describe.
In a monorepo you will like have N services (likely containing one-or-more underlying packages) as well as M "common" packages. Logically speaking, any piece of code should be able to dip it's hand into at most 2 top-level packages: it's own and the common code. Having code from service X depend on code from service Y would be a mistake and shouldn't be mechanically allowed.
Describe the solution you'd like Some way of specifying visibility in a
BUILD
file (with the default as "public" to maintain backwards-compatibility).Initial implementation can be scoped to only supporting 2 values: "public" and "package".
For simplicity, if visibility was inherited from the nearest-declared-ancestor-dir that would limit the amount of visibility declarations to N
BUILD
files (one for each service, the M common packages can remain with the public default).Additionally, this could be hidden behind a plugin to make it truly opt-in.
Describe alternatives you've considered A plugin which enforces this at
test
/lint
/check
-time instead of visibility declarations inBUILD
files.Additional context N/A