bazeltools / bazel-deps

Generate bazel dependencies for maven artifacts
MIT License

Dealing with version clashes on transitive dependencies #289

Open pbsf opened 4 years ago

pbsf commented 4 years ago

I work on a monorepo that contains several Java, Scala, and Kotlin projects, all within a single WORKSPACE. Using bazel-deps worked well for a while, but as the repository grew, it has become hard to deal with version clashes among transitive dependencies.

To address that, I attempted to use a separate dependencies.yml for each project, but it didn't work out: the order in which we imported each project/repository in the WORKSPACE affected how conflicts were resolved. Did I do something wrong, or is it working as intended?

Is it possible to have several versions of a single dependency in the dependencies.yml file, and to force which version we want for each Bazel target? It looks like rules_jvm_external supports this.

Creating WORKSPACE files for each project is looking like the way to go, but we wanted to avoid that as much as possible.

ianoc-stripe commented 4 years ago

@pbsf generally, in a monorepo you only want one version of any external dependency getting pulled in, or you can have problems in a case like this:

DepA v1 -> Lib A -> Lib B \
                           --- Lib C
DepA v2 -> Lib D -> Lib E /

Something like that. If you are going to mix versions, you need entirely separate trees, and you need to enforce that separation forever. Otherwise, conflicting external dependencies can show up as runtime failures in anything that depends on C in this graph. So it's intentional that everything is currently expected to be loaded as one tree.

(FWIW, I believe you could customize/isolate the trees by making a custom bzl loader file that isolates the binds/target file behavior to match. However, I would definitely not recommend this.)
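To make the diamond in the diagram above concrete, here is a minimal Python sketch that walks a dependency graph and reports which targets can reach more than one version of the same external artifact. It reads the arrows in the diagram as "is consumed by", so Lib C sits at the bottom of both chains. The graph, the `name@version` labels, and the helper names are all hypothetical and only for illustration; this is not how bazel-deps itself models dependencies.

```python
from collections import defaultdict

# Hypothetical graph mirroring the diagram: each target maps to what it depends on.
# External artifacts are written as "name@version".
DEPS = {
    "LibC": ["LibB", "LibE"],
    "LibB": ["LibA"],
    "LibE": ["LibD"],
    "LibA": ["DepA@1"],
    "LibD": ["DepA@2"],
}


def external_versions(target, deps, seen=None):
    """Return {artifact: {versions}} for everything reachable from target."""
    seen = set() if seen is None else seen
    found = defaultdict(set)
    if target in seen:
        return found
    seen.add(target)
    for dep in deps.get(target, []):
        if "@" in dep:
            name, version = dep.split("@", 1)
            found[name].add(version)
        else:
            for name, versions in external_versions(dep, deps, seen).items():
                found[name] |= versions
    return found


def conflicts(target, deps):
    """Artifacts that target reaches in more than one version."""
    reachable = external_versions(target, deps)
    return {name: sorted(vs) for name, vs in reachable.items() if len(vs) > 1}


print(conflicts("LibC", DEPS))  # LibC sees both versions of DepA
print(conflicts("LibB", DEPS))  # LibB alone only sees one version
```

Lib B and Lib E are each fine in isolation; the clash only appears at Lib C and at anything that depends on it, which is why mixed versions would need fully separate trees.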

johnynek commented 4 years ago

Yes, to second what Ian said, you might look at https://github.com/johnynek/bazel_jar_jar, which you can use to shade some of the jars you import if you want to have multiple versions of a jar.

This isn't very ergonomically integrated with bazel-deps, so you need to handle it on a case-by-case basis.

In my experience, you want to work very hard to get a single version that works for everything in the monorepo. Otherwise, unless you have some static system to keep duplicate items off the classpath, you will get weird runtime bugs.

johnynek commented 4 years ago

With rules_jvm_external, I don't see that they give you any tools to prevent two classpaths from having the same class files, but maybe there is a general Bazel solution now, since in principle the same problem can arise just from people giving two classes the same name in a single repo.
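As a rough illustration of the kind of static check mentioned above, the following Python sketch scans a list of jars and reports any `.class` entry that appears in more than one of them. It uses only the standard `zipfile` module. The function name and the idea of feeding it a target's runtime classpath are assumptions for illustration, not something bazel-deps or rules_jvm_external provides.

```python
import zipfile
from collections import defaultdict


def duplicate_classes(jar_paths):
    """Map each .class entry found in more than one jar to the jars containing it."""
    owners = defaultdict(list)
    for jar in jar_paths:
        with zipfile.ZipFile(jar) as zf:
            # set() guards against a jar that lists the same entry twice.
            for entry in sorted(set(zf.namelist())):
                # module-info.class is a per-module descriptor that many jars
                # legitimately carry, so it is skipped here.
                if entry.endswith(".class") and not entry.endswith("module-info.class"):
                    owners[entry].append(jar)
    return {entry: jars for entry, jars in owners.items() if len(jars) > 1}
```

Called with the jars on one target's runtime classpath, a non-empty result would point at the classes most likely to cause the runtime surprises discussed in this thread. Shaded copies would not be flagged, because their entries live under a different package path.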

pbsf commented 4 years ago

Thanks for the input. The runtime issues you guys mentioned are the ones we are dealing with atm.

We already use bazel_jar_jar in a few cases. It might indeed be the way to go. The bigger problem is that our dependencies.yml has 80 conflicting dependencies, with versionConflictPolicy=highest. It is now too hard to fix these on a case-by-case basis. Most of these conflicts are not causing runtime errors.

We are thinking about creating a new dependencies.yml file with versionConflictPolicy=fail/fixed, loading it earlier than the legacy one in the WORKSPACE, and fixing the conflicts in the new file as they arise. Does that seem like a reasonable migration approach? Would you recommend fail or fixed as the conflict policy?
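For readers comparing the three policy names, here is a simplified Python model of how they might differ. It is a sketch under stated assumptions, not bazel-deps's actual resolver: it assumes `highest` picks the largest requested version, `fail` rejects any artifact requested in more than one version, and `fixed` accepts a conflict only when a version has been pinned explicitly. The artifact names, the `declared` parameter, and the numeric-only version comparison are all hypothetical; the bazel-deps README is the place to confirm the real semantics.

```python
def version_key(version):
    # Assumes purely numeric dotted versions such as "1.10.0"; real Maven
    # versions with qualifiers ("-jre", "-SNAPSHOT") need a fuller comparator.
    return tuple(int(part) for part in version.split("."))


def resolve(requested, policy, declared=None):
    """Pick one version per artifact under a simplified conflict policy.

    requested: {artifact: [versions requested somewhere in the graph]}
    declared:  {artifact: version} pinned explicitly (consulted by "fixed" only)
    """
    declared = declared or {}
    chosen = {}
    for artifact, versions in requested.items():
        unique = set(versions)
        if policy == "fixed" and artifact in declared:
            chosen[artifact] = declared[artifact]
        elif len(unique) == 1:
            chosen[artifact] = next(iter(unique))
        elif policy == "highest":
            chosen[artifact] = max(unique, key=version_key)
        else:
            # "fail", or "fixed" without an explicit pin for this artifact.
            raise ValueError(f"{artifact}: conflicting versions {sorted(unique)}")
    return chosen


requested = {
    "com.example:lib": ["1.2.0", "1.10.0"],  # hypothetical conflict
    "com.example:other": ["2.0"],
}
print(resolve(requested, "highest"))
print(resolve(requested, "fixed", declared={"com.example:lib": "1.2.0"}))
```

In this model, `highest` silently hides the conflict, `fail` surfaces every one of them, and `fixed` surfaces only those that nobody has made an explicit decision about, which is the distinction that matters for an incremental migration.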

ianoc-stripe commented 4 years ago

Once you have gotten into that state, it's pretty hard to get out of. Our main internal control in the areas I care about is that we heavily restrict the PRs that can touch the workspace.bzl file, and I basically read the conflicts whenever it changes to decide whether they seem OK. A version conflict policy of fail is probably pretty good for most things, though some libraries like finagle or guava can be a bit of a wild card even then.

I would probably try fail/fixed first to see how well you can manage it. Overall, though, whether or not you use it, this is probably going to boil down to judging the stability of transitive deps by eye, bumping things, and either requesting new publishes or dropping external libraries that aren't updated or that need older versions of things. It's all pretty hassle-filled in any big monorepo I've worked on. Reviews or settings can mostly stop conflicts from getting into the repo, but the stricter you are about what goes into the repo, the more effort an upgrade can take. (To be clear, that is the only stable way to do the upgrade, but it's time-consuming. For example, upgrading Lib A may depend on a newer C, which means you need to bump B, which has a knock-on effect on libraries E, F, G, and so on.) It really does pay to have a shallow dependency tree, and I would shade something like Spark where you can, to slim down the dependencies there.