Composable builds! Import external builds

oyvindberg commented 1 year ago

One of the earliest ideas which surfaced around the concept of a light-weight json-like build was that builds are now composable, and you can combine them however you want

share templates within your organization

imports:
  myorg: https://myorg/shared-templates/stable-sha
projects:
  a:
    extends: myorg.mytemplate
  scripts:
    dependsOn: myorg.scripts

this way you can standardize on dependency versions and all the shenanigans bigger orgs tend to do. You could also share scripts this way.

This would be meant to be checked in permanently.

import external builds into your build

use case: mount multiple builds into your IDE so you can update them together seamlessly. This is incredibly useful for polyrepo programming.

Maybe something like this:

imports:
  circe: https://github.com/circe/circe.git
projects:
  a:
    dependsOn: circe.circe-core

This would depend either on circe already having a bleep.yaml, or on bleep running the automatic import.

It's somewhat impractical to apply these changes to the build and ensure that they are not checked in, so I think it can be improved somewhat with

build variants based on build rewrites

With the proposed build variants idea you can have your "normal" build with normal dependencies, and a build rewrite which rewrites dependencies into imports. Then you can use that build variant to compile, mount in IDE (bleep setup-ide --build-variant inlined-deps jvm3) and so on.

Given this build file

build-variants:
  inlined-deps: scripts/InlineDeps
projects: 
  a:
    dependencies: io.circe::circe-core:1.4.0
    scala:
      version: 3.2.1
  scripts:
    dependencies: build.bleep::bleep-core:${BLEEP_VERSION}
    scala:
      version: 3.2.1

And this build rewrite (with hypothetical syntax):

package scripts
object InlineDeps extends bleep.InlineDepsRewrite {
  Map("io.circe" -> "https://github.com/circe/circe.git")
}

That build rewrite would then generate a build like this:

build-variants:
  inlined-deps: scripts/InlineDeps
imports:
  circe: https://github.com/circe/circe.git
projects: 
  a:
    dependsOn: circe.circe-core
    scala:
      version: 3.2.1
  scripts:
    dependencies: build.bleep::bleep-core:${BLEEP_VERSION}
    scala:
      version: 3.2.1

And suddenly you can have an amazing experience working on hundreds of git repos simultaneously by commenting in/out build inlines as needed. These build variants would also be convenient to check in.

Baccata commented 1 year ago

The premise is very interesting. Here are some links to prior art:

Working on polyrepos poses the problem of git workflow and release trains: coding on several repos at once in a unified build is great DX, but it's equally important to facilitate the git (and release) workflow that goes with it, as solving the coding problem is only one side of the coin.

oyvindberg commented 1 year ago

Thanks for the pointers! I think deployment is so specific to each organization that it's out of scope for a build tool. It would be nice if we provided some building blocks so you can write a script using the information found in the bleep build, but that's likely it.

This reminds me of another idea I have floated with my colleagues; I'll do a write-up and link it shortly.

oyvindberg commented 1 year ago

@Baccata see #269

Baccata commented 1 year ago

we provided some building blocks so you can write a script using the information found in the bleep build, but that's likely it.

It may be enough. Anyhow, there's no point in me arguing in favour of some OOTB solution without providing a working and compelling POC 😄

benhutchison commented 2 months ago

I'd really like to see support for composable builds. Effectively, being able to import another Bleep build by relative path, and the projects, settings and config in that build become subprojects by reference.

This is the feature that would let me move on to Bleep. I tend to define lots of small modules, containing code that's shared between different projects. These modules don't really belong to one project as such, they have an identity/role of their own and are used/referenced by multiple. Of course, they can be treated just as binary dependencies, but it's really convenient to be able to make coordinated changes across several modules and have it all building live.

I'm currently doing this with SBT and it works (just barely; it's rather slow). IntelliJ supports it quite well. Unfortunately, it breaks VSCode's assumption that a build lives entirely within a single folder tree.

oyvindberg commented 2 months ago

Yes, we need this in one form or another - it's too good of a feature to pass up.

First I think we should try to come up with primary use cases and workflow. For instance I'm wondering if composed builds should be "permanent" - in the sense that you check in cross-repo builds, and you need to check out the repositories in a given structure for instance.

I think what I always had in mind was that this would be an ad-hoc thing, where you set up a multi-repo bleep build for the purpose of doing sweeping changes.

I think implementing the core functionality in bleep shouldn't be that hard, but figuring out how to make a very good DX out of it likely is.

Feel free to dream up some scenarios here 👍

benhutchison commented 2 months ago

Yes, we need this in one form or another - it's too good of a feature to pass up.

Really pleased to hear you're keen on some form of composable builds 😁

First I think we should try to come up with primary use cases and workflow.

I'll start with a principle I think is worthy: the information about how to build a particular module of code should be stored close to the code.

Critical data in a build file is

name of the artifact to build
the code's dependencies
the set of "platforms" that it is compatible with (Scala 2/3, JS, native etc)
where the code is located (although that can be convention-based and only included when non-default values used)

All of that data seems to be amenable to living in modules and being composed together, and not necessarily in a top-level, global definition.

So I'd love it if Bleep projects could support dependencies on binary artifacts, or on relative paths to source modules that include Bleep build metadata. Bleep would download the former and build the latter.

I expect the way I'd use it is to switch out binary dependencies for Bleep sub-projects/modules while those subprojects are under active development, and depend on binaries again once they have hardened/stabilised.

For instance I'm wondering if composed builds should be "permanent" - in the sense that you check in cross-repo builds, and you need to check out the repositories in a given structure for instance.

If paths to subprojects are relative, then it ought to be possible to check-in several distinct modules into one repo with paths resolving in-repo, or use a 1 repo == 1 module approach and eg lay them out as sibling directories. Developer can choose, both work.

I think what I always had in mind was that this would be an ad-hoc thing, where you set up a multi-repo bleep build for the purpose of doing sweeping changes.

One likely use case would be when working with existing open-source libraries. Start with a binary dep, but discover you need to add something to the library to support your application. Switch to a project dependency while preparing a PR to the library, so the changes in the app and library can be crafted together. Once the PR is merged and the lib published, switch back to the binary dependency.

oyvindberg commented 2 months ago

Right, this use-case where you want to mount external dependencies in your build is what I primarily have in mind as well. It would basically be a mapping from maven groupId/organization to a git repository.

This would be set up either in the build file itself or outside it (I'm not really sure yet), and generally not checked in (though nothing would stop you).

Let's sketch one approach which sounds deceptively simple to implement:

bleep loads build file

build file contains something like this:

import-build:
  io.circe: https://github.com/circe/circe.git

bleep clones circe repo to .bleep/external/circe
bleep verifies that there is a bleep.yaml file in the cloned repo. might run an sbt import otherwise
bleep loads circe build file, and likely does some structured renaming to avoid name conflicts
bleep transforms build file, so that all io.circe.xxx dependencies are replaced by a dependsOn the corresponding circe project
bleep uses coursier to resolve all dependencies
bleep does another transform to double check that we didn't inherit the relevant io.circe artifacts transitively from somewhere else
good to go.
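The rewrite step in that list (replace all io.circe.xxx dependencies with a dependsOn on the corresponding circe project) could be sketched roughly as follows. This is only a minimal sketch on a simplified, hypothetical model: Dep, Project, Build, ImportSource and inline are illustrative names, not bleep's API, and cloning, renaming to avoid conflicts, coursier resolution and the transitive double-check are all left out.

```scala
// Hypothetical, simplified model of a build. None of these types are bleep's real API.
object InlineDepsSketch {
  // A binary dependency such as io.circe::circe-core:1.4.0
  final case class Dep(org: String, name: String, version: String)

  final case class Project(dependencies: List[Dep], dependsOn: List[String])

  // imports: alias -> git URL or relative path of the imported build
  final case class Build(imports: Map[String, String], projects: Map[String, Project])

  // Where the sources for one Maven organization live, and the alias to mount them under
  final case class ImportSource(alias: String, location: String)

  // Replace every binary dependency whose organization is listed in importBuild
  // with a dependsOn reference of the form "<alias>.<artifact name>".
  def inline(build: Build, importBuild: Map[String, ImportSource]): Build = {
    val rewritten = build.projects.map { case (name, project) =>
      val (inlined, kept) =
        project.dependencies.partition(d => importBuild.contains(d.org))
      val refs = inlined.map(d => s"${importBuild(d.org).alias}.${d.name}")
      name -> project.copy(
        dependencies = kept,
        dependsOn = (project.dependsOn ++ refs).distinct
      )
    }
    // Only add imports for organizations that some project actually used
    val usedOrgs = build.projects.values.flatMap(_.dependencies).map(_.org).toSet
    val newImports = importBuild.collect {
      case (org, source) if usedOrgs.contains(org) => source.alias -> source.location
    }
    build.copy(imports = build.imports ++ newImports, projects = rewritten)
  }
}
```

Mapping artifact names directly to project names is an assumption; a real implementation would also have to deal with cross-built artifact names and crossIds.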

There are likely some details where cache invalidation, file watching and so on will need to be aware of this. crossIds are really (too) free-form in bleep, so there would likely be problems if one build uses something non-standard.

oyvindberg commented 2 months ago

let's say that the import-build structure also accepts relative paths (relative to build directory) instead of git repositories, so you're free to control these things yourself as well.

thoughts on this proposal @benhutchison ?

benhutchison commented 2 months ago

Hi @oyvindberg, I think we're close but there are probably some differences in thinking revealed by language that are worth noting..

To understand where I'm coming from, I have about 40 modules, each of which is its own distinct SBT build. It started as one large project, which became unwieldy, so I broke it into small pieces, initially in one mega sbt build. But when a second application needed a lot of the same code, I started sharing modules across them, and converted the modules to individual builds that can be referenced from both. And now a third application is starting to import some of them.

where you want to mount external dependencies in your build

I tend to think of builds as being fundamentally plural (every module has its own), but they do coalesce into the "build" of the thing being worked on, which pulls all of its dependency builds into it. If instead of opening my top-level app in my IDE, I open a module, it pulls its dependency builds in just the same way.

would basically be a mapping from maven groupId/organization to a git repository

That is a very common, perhaps the most common, mapping from an artifact to a repo. However, I feel it can be stripped back slightly to make fewer assumptions: it's just a directory containing a Bleep build file and sources. Often the directory sits at the root of a git repo, but if it didn't (many in one git repo, or not checked into git yet) it should still work.

bleep clones circe repo to .bleep/external/circe

That doesn't quite align with my use case where I already have modules checked out into lots of dirs. Nonetheless seems a useful feature for the use case you have in mind.

bleep does another transform to double check that we didn't inherit the relevant io.circe artifacts transitively from somewhere else

So many of my modules have the same dependencies over and over. At the bottom are eg some modules that extend libraries like Cats that I tend to use everywhere. I hadn't really thought about it before, but on reflection SBT handles this quite gracefully. The dependee module wrapping Cats gets loaded just once AFAICT, even when multiple loaded module builds refer to it. Must be smart enough to realise all the paths resolve to the same thing.

It's all built on project.dependsOn(ProjectRef(URI, moduleName)) in SBT's API, which allows a project to depend upon another project at a relative path.
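To illustrate the "loaded just once" behaviour for a bleep equivalent, here is a minimal sketch of one way to recognise that several relative paths point at the same module. It is not how SBT does it (that is only observed from the outside above); ModuleRef, canonical and distinctModules are hypothetical names.

```scala
import java.nio.file.Path

// Hypothetical sketch: deduplicate module references by normalized path.
object ModuleDedupSketch {
  // A reference to a module, as declared in the build living in fromBuildDir
  final case class ModuleRef(fromBuildDir: Path, relativePath: String)

  // Resolve the relative path against the referencing build's directory and
  // remove "." and ".." segments. Symlinks are NOT resolved here; that would
  // need Path.toRealPath, which requires the directory to exist on disk.
  def canonical(ref: ModuleRef): Path =
    ref.fromBuildDir.resolve(ref.relativePath).normalize()

  // Each distinct module directory appears once, in first-seen order
  def distinctModules(refs: List[ModuleRef]): List[Path] =
    refs.map(canonical).distinct
}
```

With this, two applications that both reference ../cats-ext from sibling directories end up pointing at a single module directory, so its build would only need to be loaded once.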

let's say that the import-build structure also accepts relative paths (relative to build directory) instead of git repositories, so you're free to control these things yourself as well.

Yes, relative paths are the key capability I'd need to migrate my "DAGgy community" of modules across from SBT.

Right now SBT takes over a minute to load my top-level build files, so all these nested builds come at a cost. And the number of SBT "keys" that it creates is excessive eg [info] resolving key references (51205 settings) .... It'd be nice to make that bleeping faster.

oyvindberg commented 2 months ago

Thanks for the feedback, I'll iterate and answer what you wrote a bit later

For now I just wanted to comment that that world would appear so much simpler with all those modules in a monorepo with one bleep build, and:

bleep setup-ide to mount just the projects you want to work on (including transitive, of course) in the IDE
#269 (not implemented yet, but would be simple) to determine which applications need to be redeployed after a given (set of) commits.
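The "including transitive" part of the first point amounts to a closure over the dependsOn graph. A minimal sketch, assuming the build is reduced to a map from project name to its direct dependsOn names (Graph and withTransitive are hypothetical names, not bleep's API):

```scala
// Hypothetical sketch: which projects need mounting for a given selection.
object TransitiveSketch {
  // project name -> names of the projects it directly depends on
  type Graph = Map[String, Set[String]]

  // The selected projects plus everything they depend on, directly or transitively
  def withTransitive(graph: Graph, selected: Set[String]): Set[String] = {
    @scala.annotation.tailrec
    def go(todo: List[String], seen: Set[String]): Set[String] =
      todo match {
        case Nil                        => seen
        case head :: tail if seen(head) => go(tail, seen) // already visited
        case head :: tail =>
          go(graph.getOrElse(head, Set.empty[String]).toList ++ tail, seen + head)
      }
    go(selected.toList, Set.empty)
  }
}
```

Because visited projects are tracked in seen, shared dependencies are only expanded once and the traversal terminates even if the graph accidentally contains a cycle.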

We're thinking very much alike about this, but I'll have to process what you mean/want by all those builds

benhutchison commented 2 months ago

Writing on the train so briefly..

Putting all modules into one monorepo is similar to where I started. But it doesn't seem to scale gracefully. What is the organising principle?

Now some of the modules are shared across 3 different applications, others two, others just the original.

Parts I may use just as binary dependencies, and/or publish as open source.

Are 40 modules ok in one monorepo? What about 60, 80 etc? When it's no longer ok, where do you go next?

Alternatively... I'm finding this works well when small, scales fairly gracefully (except that SBT creaks and groans) and just feels good:

Keep info about how to build a module within the module's directory. Declare dependencies between modules, which can be either binary or source builds. Lay the modules out as seems best, knowing I can reorg them later without affecting build structure.

KristianAN commented 2 months ago

Re: Sharing templates. It would be very nice if it was possible to embed e.g. scalafix, scalafmt and similar configurations into the bleep template somehow. It's not entirely in the scope of the tool, but also would not be too hard to embed into the config

Something like

externalConfig:
  fileName: .scalafix.conf
  config: |
    multiline config goes here

Then have bleep generate the file based on this.

Add the actual config files to .gitignore, so that bleep.yaml is the source of truth.
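The generation step could be as small as the following sketch. It assumes the externalConfig entries have already been parsed out of bleep.yaml; ExternalConfig and generate are hypothetical names, not something bleep provides today.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical sketch: materialize embedded tool configs next to the build file.
object ExternalConfigSketch {
  // One embedded config: the file to generate and its full contents
  final case class ExternalConfig(fileName: String, config: String)

  // Writes each config relative to buildDir, creating or overwriting the file,
  // and returns the paths that were written
  def generate(buildDir: Path, configs: List[ExternalConfig]): List[Path] =
    configs.map { c =>
      val target = buildDir.resolve(c.fileName)
      Files.write(target, c.config.getBytes(StandardCharsets.UTF_8))
    }
}
```

Overwriting unconditionally matches the idea that the generated files are disposable and bleep.yaml is authoritative.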

Should be simple enough, but the real question is scope.

oyvindberg / bleep