crystal-lang / shards

Dependency manager for the Crystal language

Monorepos: multiple shards per repository #635

Open ysbaddaden opened 5 months ago

ysbaddaden commented 5 months ago

I'd like to start a discussion on monorepos. It is common for larger projects to maintain a number of shards or to split an overall framework into smaller pieces that can work individually as well as they can work together.

There are workarounds (example) that accommodate Shards' one-shard-per-repo assumption by leveraging git submodules or subtrees, but they still need a repo for each shard and still need to synchronize from the main repo to each individual one.

Maybe shards could help?

ysbaddaden commented 5 months ago

For the installation side, maybe Shards could have a root: sub/folder argument to use as the root of a shard? The default would be to use the root of the repository. For example:

name: framework

dependencies:
  framework-router:
    github: some/framework
    root: shards/router

  framework-views:
    github: some/framework
    root: shards/views

That doesn't help with local and inter-shard dependencies; would those need a shard.override.yml?

Edit: as stated below, an unresolved issue is that the version is detected by shards as a git tag, so all individual shards in the monorepo must have the same version number!
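To make the root: idea concrete, here is a minimal Python sketch of how a resolver might locate a sub-shard's shard.yml inside a cloned repository. It is not how Shards works today: the root key is only the proposal above, and shard_spec_path, checkout_dir and the rejection of ".." segments are assumptions made for illustration.

```python
from pathlib import Path, PurePosixPath


def shard_spec_path(checkout_dir, root=None):
    """Return where shard.yml would be expected for a dependency.

    `root` mirrors the hypothetical `root:` key; None means the
    repository root, which is the current behaviour.
    """
    base = Path(checkout_dir)
    if root is None:
        return base / "shard.yml"

    sub = PurePosixPath(root)
    # Assumption: a shard root must stay inside the checkout.
    if sub.is_absolute() or ".." in sub.parts:
        raise ValueError(f"root must be a relative path inside the repo: {root!r}")
    return base.joinpath(*sub.parts) / "shard.yml"
```

With the example above, both framework-router and framework-views would resolve to the same checkout of some/framework and differ only in the sub-folder searched for shard.yml. The version problem noted in the edit is untouched by this: both would still see the same git tags.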

ysbaddaden commented 5 months ago

Another workaround is to have a single shard with different modules under src, treat them as if they were distinct shards (i.e. beware of requires), then have users only require the modules they need.

Then:

require "framework/router"
require "framework/views"

Pros: no need for special cross-shard dependencies, a single repository to pull, no local dependencies to fix, a single shard.yml to handle for releases.

Cons: you need to pull all the dependencies for the whole framework, which may be annoying & pointless if you only need a module that doesn't need them (damn). A solution is to limit external dependencies (or to not have any) :grin:

Edit: another drawback, as stated below, is that all modules will have the same version and will get a new version number even when they have no changes.

straight-shoota commented 5 months ago

I understand an important reason for multiple shards is that they can evolve independently, with their own dependencies and releases. The monorepo development ensures individual components stay in sync, but they can move at different speeds. If components are supposed to be usable independently, their versions should not be interlocked.

So the biggest challenge is probably version discovery. Shards pulls versions out of git tags, and those are per repository. This won't work for multiple subprojects with independent versions within one repository.

Namespaced version tags could be an option for that (tags foo/bar:v1.1.0, foo/baz:v1.0.1). I haven't put much thought into this, but it feels messy.
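As a rough illustration of per-shard version discovery from namespaced tags, the Python sketch below groups tags by shard path and sorts each shard's versions. It is only a sketch under assumptions: group_versions is a hypothetical helper, versions are assumed to be plain MAJOR.MINOR.PATCH, and the separator is "@" rather than ":" because git does not allow a colon in ref names.

```python
import re
from collections import defaultdict


def group_versions(tags, sep="@"):
    """Group tags shaped like '<shard path><sep>v<X.Y.Z>' by shard path."""
    pattern = re.compile(
        r"^(?P<shard>.+)" + re.escape(sep) + r"v(?P<version>\d+\.\d+\.\d+)$"
    )
    versions = defaultdict(list)
    for tag in tags:
        match = pattern.match(tag)
        if match is None:
            # Plain tags such as "v2.0.0" belong to no sub-shard here.
            continue
        versions[match["shard"]].append(match["version"])

    # Sort numerically so that 1.10.0 comes after 1.2.0.
    for shard in versions:
        versions[shard].sort(key=lambda v: tuple(int(p) for p in v.split(".")))
    return dict(versions)
```

Given the tags foo/bar@v1.1.0, foo/bar@v1.0.0 and foo/baz@v1.0.1, this yields one version list per shard, so foo/bar and foo/baz can move at different speeds within the same repository.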

straight-shoota commented 5 months ago

I've had another idea in my head for a while now, which could potentially provide a solution for version discovery in a monorepo. But its main purpose is unrelated, so I'll have to write it down in a separate issue to keep this one focused.

Blacksmoke16 commented 5 months ago

I can speak a bit on my experience with this.

> but they still need a repo for each shard and still need to synchronize from the main repo to each individual one.

I set up Terraform to manage the Athena GH org, so creating/configuring a new component repo is as simple as adding another entry to https://github.com/athena-framework/infrastructure/blob/master/terraform/github/components.tf, which uses a template repo as the source. After applying the change, I just have to go in, rename the types/files, and update the README and such.

I also added some tooling to the monorepo itself, so I can run a command and have it subtree a repo, then I just have to add it to the main sync script and that's it. It takes some time to set up initially, but after that adding new components is not too bad.

> I understand an important reason for multiple shards is that they can evolve independently

Correct, this is why I went with the many-repo approach vs a monorepo where the user just has to do something like require "framework/router" or require "framework/shard". I wanted each component to be versioned on its own, which needs its own repo anyway to tag releases and such.

Symfony, on the other hand, does things a bit differently by tagging all components with the same tag, skipping components that had no changes since the latest tag. Composer has https://getcomposer.org/doc/04-schema.md#replace, which allows the monorepo itself to be used as a dependency that satisfies all the specific components.
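The replace idea can be sketched in a few lines of Python: a package that declares it replaces others is treated as satisfying requirements on those names. This is an illustration only, not Composer's resolver and not a Shards feature; providers, missing and the package names are hypothetical, and version constraints are ignored entirely.

```python
def providers(installed):
    """Map each satisfiable package name to the installed package covering it.

    `installed` maps a package name to its metadata; an optional
    "replace" list names the packages it stands in for.
    """
    covered = {}
    for name, meta in installed.items():
        covered[name] = name  # a package always satisfies itself
        for replaced in meta.get("replace", []):
            # Keep an explicitly installed package if one is already recorded.
            covered.setdefault(replaced, name)
    return covered


def missing(requirements, installed):
    """Return the required names that nothing installed satisfies."""
    covered = providers(installed)
    return [req for req in requirements if req not in covered]
```

With a monorepo package framework that replaces framework-router and framework-views, a project requiring framework-router would be satisfied by installing framework alone.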

Kinda unrelated to this issue exactly, but the only thing I can think of now that would be helpful for monorepos/multiple shard projects would be if shards had something like https://getcomposer.org/doc/04-schema.md#suggest, where you could have a component suggest using other components for additional functionality in a more robust manner.