Open knz opened 3 years ago
We should have two separate vendoring trees: one for tools, and one for CockroachDB.
This doesn't fully resolve the first "problem to solve", right? Two different tools that we use for the build could each depend on different, incompatible versions of the same package.
It's unclear (to me) how to achieve something similar with Bazel - someone with more experience should extend the issue with more details.
We currently use Gazelle for this purpose, and we would continue doing so. We currently have a single go.mod
file that Gazelle translates to the DEPS.bzl
file -- presumably, with the new model we would have two go.mod
files (or equivalent) that we'll have Gazelle translate into two different bzl
macros. Then we'll just make sure that everything is wired up appropriately so all the build tools only point to build dependencies, and vice-versea for the runtime dependencies.
Typically Gazelle translates importpath
s into repository names (like github.com/bufbuild/buf
=> @com_github_bufbuild_buf
) and those repository names are what's used in the BUILD
files. The main problem would be that you have collisions between the two sets of dependencies -- we could distinguish between them with a prefix or some other such hack (e.g., @crdbbuild_com_github_bufbuild_buf
) and we can validate the change by making sure that all the build-time stuff only depends recursively on repositories named @crdbbuild_*
. I don't think Gazelle supports this use case, but it would probably be reasonably easy to hack in, either as an upstream patch or some other hack.
This doesn't fully resolve the first "problem to solve", right? Two different tools that we use for the build could each depend on different, incompatible versions of the same package.
This is true - you really need something separate for every tool.
For Golang tools, I'd highly recommend the "tmpdir" hack. For reference, we do it extensively in https://github.com/bufbuild/buf/tree/master/make/go (see the dep_.*
files, most of them are Golang tools).
This doesn't fully resolve the first "problem to solve", right? Two different tools that we use for the build could each depend on different, incompatible versions of the same package.
Yes you're absolutely right. I was looking at this from the perspective of crdb but the problem would remain for any pairwise combination of tools.
That said, separate vendoring would also largely trim down the size of the cockroach-gen repo, which is why I thought of it first. Maybe we can do both, separate vendoring for tools and also sub-vendoring for different tools.
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
tldr Currently we mistakenly intermingle the build-time tool dependencies and the run-time dependencies inside the same
vendor
tree. We need to split them.Problems to solve
This causes the following problems:
if a build-time tool announces a dep compatibility at version X, and CockroachDB needs version Y with Y>X, we are mistakenly forcing the build-time tool to use version Y which may change its behavior.
This is the problem identified by the @bufbuild team in this comment.
if CockroachDB announces a run-time dep at version X, and a build-time tool announces a dep at version Y with Y>X, we are mistakenly letting the tool force our hand on the run-time dep.
This is the problem that caused massive wreckage of the master branch two weeks ago ( #67245 #67255 #67248 #67246 #67237): a tool was requiring a broken version of gRPC and we mistakenly bumped the gRPC version in CockroachDB when we upgraded the tool.
when a build-time tool is updated and this forces some dependency upgrades, the dep upgrade does not receive the same level of scrutiny (e.g. from the security team) as when we update a dependency for CockroachDB's benefit. Even though they happen to be the same currently. This is a security risk.
our security compliance roadmap requires us to clearly identify build-time and run-time dependencies separately. Currently this is difficult to do.
Urgency
The following pressure points exist to make us want to fix this soon-ish:
we want to update
buf
to improve our support for protobuf in CockroachDB. Currently further upgrades ofbuf
are impossible due to the runtime-incompatible gRPC dependency.more and more prospective customers require compliance reports that identify run-time and build-time dependencies separately.
How to go about it
We should have two separate vendoring trees: one for tools, and one for CockroachDB.
In the current system with a
vendoring
repo, we'd want to operate as follows:cockroach-tools
repo with its own vendor sub-treecockroach
repo depend on thecockroach-tools
repo as a sub-modulecockroach
, recurse into thecockroach-tools
tree to build all the tool deps prior to starting the build insidecockroach
.It's unclear (to me) how to achieve something similar with Bazel - someone with more experience should extend the issue with more details.
Epic CRDB-10447
Jira issue: CRDB-8559