RFC: Use Lerna with monorepo for core packages

justingreenberg commented 8 years ago

as the feathers ecosystem continues to evolve, it is becoming increasingly difficult for both users and maintainers to track dependencies, coordinate tandem releases, keep docs in sync, etc.

it is also difficult to differentiate at a glance between core feathers and external plugins in the feathers organization

tl;dr: i recently used https://github.com/lerna/lerna (used by babel, react, angular, meteor etc) to migrate a large back office project with many custom npm modules into a "mono-repo", and it was incredibly simple. it automatically creates symlinks, i cannot express how much simpler maintenance is and how much time this has saved our developers....

feathers/       # https://github.com/feathersjs/feathers
    /scripts/    # manage releases, etc
    /packages/   # core feathers modules
        /feathers
        /feathers-client
        /feathers-commons
        /feathers-hooks
        /feathers-query-filters
        /feathers-rest
        /feathers-service-tests
        /feathers-socket-commons
        /feathers-authentication
        /feathers-authentication-client
        /feathers-authentication-jwt
        /feathers-authentication-local
        /feathers-authentication-oauth1
        /feathers-authentication-oauth2
        /feathers-authentication-permissions
        /feathers-authentication-popups
        /feathers-knex
        /feathers-localstorage
        /feathers-memory
        /feathers-mongodb
        /feathers-mongoose
        /feathers-nedb
        /feathers-sequelize
        /feathers-batch
        /feathers-bootstrap
        /feathers-configuration
        /feathers-hooks-common
        /feathers-socket-commons
        /feathers-swagger

or it could further be broken down by functionality...

feathers/       # https://github.com/feathersjs/feathers
    scripts/    # manage releases, etc
    packages/   # core feathers modules
        core/
            feathers
            feathers-client
            feathers-commons
            feathers-hooks
            feathers-query-filters
            feathers-rest
            feathers-service-tests
            feathers-socket-commons
        authentication/
            feathers-authentication
            feathers-authentication-client
            feathers-authentication-jwt
            feathers-authentication-local
            feathers-authentication-oauth1
            feathers-authentication-oauth2
            feathers-authentication-permissions
            feathers-authentication-popups
        storage-adapters/
            feathers-knex
            feathers-localstorage
            feathers-memory
            feathers-mongodb
            feathers-mongoose
            feathers-nedb
            feathers-sequelize
        plugins/
            feathers-batch
            feathers-bootstrap
            feathers-configuration
            feathers-hooks-common
            feathers-socket-commons
            feathers-swagger

https://github.com/lerna/lerna#about

let me know what you think :)

marshallswain commented 8 years ago

We've talked about doing it. It sure would help out with issue tracking. Maybe we'd be able to use GitHub projects in place of ZenHub. That would be nice for non-Chrome browsers and all mobile devices. The second organization you posted, with everything arranged by functionality, is really nice.

KidkArolis commented 8 years ago

@marshallswain consider https://github.com/blog/2272-introducing-projects-for-organizations

daffl commented 8 years ago

I suggested in https://github.com/feathersjs/feathers-hooks-common/issues/31 to split up all common hooks into their own repositories and test drive it.

I'm definitely open to it and it will probably make working on v3 easier (I think @ekryski can say a thing or two about the pain of linking all those separate repos when working on the new auth). An important thing to consider is how this integrates with the tools we are using, e.g.

Code coverage and reporting
Travis CI (do you just run all tests all the time? A bonus would be that we could create some real integration tests)

Also, do we just delete the old repositories then? It would be nice not having so many in the org.

justingreenberg commented 8 years ago

> Travis CI (do you just run all tests all the time? A bonus would be that we could create some real integration tests)

exactly, so package integration for ci is typically orchestrated as either bash scripts (in a task or scripts directory) or makefile in root... eg:

https://github.com/babel/babel/blob/master/scripts/test.sh
https://github.com/facebookincubator/create-react-app/blob/master/tasks/e2e.sh

create-react-app uses an e2e.sh script to coordinate their CI flow which i think would be an awesome solution for feathers...

alternatively, i suppose you could install lerna on the CI server and use it to run npm scripts using https://github.com/lerna/lerna#run in the project root, or scoped to a specific package using e.g. lerna run --scope feathers-hooks test
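To make the "do you just run all tests all the time?" question concrete, here is a minimal sketch of one alternative: derive the affected packages from the changed file paths and run scoped tests only for those. The helper names `affectedPackages` and `scopedTestCommands` are hypothetical, the sketch assumes the flat `packages/<name>/` layout proposed above, and it only builds command strings in the `lerna run --scope` form quoted in this comment rather than executing anything.

```javascript
// Hypothetical helper: map changed file paths to the packages that own them.
// Assumes the flat layout proposed above: packages/<package-name>/...
function affectedPackages(changedFiles) {
  const names = new Set();
  for (const file of changedFiles) {
    const parts = file.split('/');
    // Only paths of the form packages/<name>/<something> belong to a package.
    if (parts.length >= 3 && parts[0] === 'packages') {
      names.add(parts[1]);
    }
  }
  return [...names].sort();
}

// Build one scoped test command per affected package, using the
// `lerna run --scope <name> test` form mentioned in the comment above.
function scopedTestCommands(changedFiles) {
  return affectedPackages(changedFiles).map(
    (name) => `lerna run --scope ${name} test`
  );
}

// Example change set (file names are made up for illustration).
const commands = scopedTestCommands([
  'packages/feathers-hooks/src/index.js',
  'packages/feathers-hooks/test/index.test.js',
  'packages/feathers-rest/src/wrappers.js',
  'README.md'
]);
console.log(commands.join('\n'));
```

Files outside `packages/` (such as the root README) select no package here; a real setup would probably fall back to running everything when root-level tooling changes.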

> Also, do we just delete the old repositories then? It would be nice not having so many in the org.

yes! it's handled by lerna using lerna import which copies commit logs, very cool. another benefit for an ecosystem such as feathers is that lerna manages common root devDependencies, which speeds up installs for development/CI and also guarantees that versions are in sync to minimize the dependency graph for downstream consumers

eddyystop commented 8 years ago

I replied in https://github.com/feathersjs/feathers-hooks-common/issues/31 that

There are 26 hooks.
5 are 15-40 LOC (incl comment & blank lines)
15 are 5-15 LOC
3 are 1-5 LOC
The remaining 3 are not large. I lost track of their sizes while counting.

I don't see creating 26 repos, 21 of them having less than 15 lines of code, including comment and blank lines.

justingreenberg commented 8 years ago

@eddyystop i agree, breaking up feathers-hooks-common may not be the best test case.. if the idea was to test drive, maybe feathers-authentication@1.0.0 core, plugins, strategies etc would be a logical group with sufficient meat and interdependencies to justify monorepo

daffl commented 8 years ago

It's not about the LOC. The problem is having a breaking change in one LOC in one hook and still having everybody go through the whole migration from one major version to another (which means updating the version in your dependencies, reading the migration guide to see if something relevant changed and then re-running all your tests) even if they only use one of the other 25 hooks that didn't change at all.

The point of using Lerna is not having to create 26 repos (it's still in one) but still having 26 separate modules for it on npm. So you can make a breaking change in feathers-hooks-populate but anybody who is just using feathers-hooks-iff won't have to worry about it.
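The migration argument can be sketched in a few lines. This is only an illustrative model, not anything Lerna or npm provides: `needsMigration` is a hypothetical helper, a release is modelled as `{ name, breaking }`, and the module names `feathers-hooks-populate` and `feathers-hooks-iff` are the hypothetical split modules named in this comment.

```javascript
// Hypothetical model: a release is { name, breaking } and a consumer is
// described by the list of package names it depends on.
function needsMigration(consumerDependencies, releases) {
  return releases.some(
    (release) => release.breaking && consumerDependencies.includes(release.name)
  );
}

// Bundled: every hook ships inside feathers-hooks-common, so a breaking
// change to the populate hook is a breaking release of the whole bundle.
const bundledReleases = [{ name: 'feathers-hooks-common', breaking: true }];

// Split: the same change only produces a breaking release of one module.
const splitReleases = [{ name: 'feathers-hooks-populate', breaking: true }];

// An app that only ever used the iff hook.
const bundledApp = needsMigration(['feathers-hooks-common'], bundledReleases);
const splitApp = needsMigration(['feathers-hooks-iff'], splitReleases);

console.log('bundled consumer must migrate:', bundledApp);
console.log('split consumer must migrate:', splitApp);
```

With the bundle, the app that only uses one unchanged hook still has to go through the major-version migration; with separate modules published from one repository it does not.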

KidkArolis commented 8 years ago

Personally, I'd find it more annoying to install hooks individually.. I wouldn't mind updating major hooks version if I need new behaviour.

But yeah, I see both sides..

ekryski commented 8 years ago

@daffl's point is a fair one. It's not about lines of code but more about functionality and the scenario he put forward is going to happen in the near future. That being said:

Thoughts on Hooks

I'm on the fence about breaking up feathers-hooks-common into their own repos. At this point I think that's starting to be overkill, and the maintenance cost on our end (and even for developers) far outweighs the benefit for server side usage. However, hooks can be used client side (and should be), so being able to install them individually (like Lodash) can reduce build size (not sure if you can do that with Lerna).

Now, the current hooks repo is pretty small (~30k not gzipped or minified). It's nowhere near the size of lodash so I think, if grouping is primarily about functionality and not lines of code, for now we should keep them all together in the same repo. Using Lerna might actually make it easier to publish micro versions per hook to handle your case there @daffl.

Thoughts on Lerna/Monorepo

In general I agree with @justingreenberg. Thanks for the suggestion 😄. I personally would prefer to only have 1 Sublime window open and not have to symlink everything manually. With the new version of auth it was a real challenge publishing everything (and still is until it is out of beta) because if there is a breaking change that affects all the modules, tests will break until they are all published (or we have to do some hack workaround where we point to the master branch temporarily). I too like the structure you suggested. However, I have some reservations (at least for now) because this actually has a lot of implications and it might just be a "make work" project at this point in time. I don't foresee us adding a lot more new repos. Maybe just 5-6 more and some will go away.

I think if we were to go with Lerna we would for sure have to use independent mode. React and React Native are both monorepo solutions (not using Lerna) and that is why they need to release so often and why they release so many breaking changes (which sucks). It also makes it much harder to resolve conflicts when there are lots of contributors. They are actually in the process of decoupling a bit more...
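A rough sketch of why the versioning mode matters. This is a simplified model and not Lerna's implementation: `planRelease` and `bumpMajor` are hypothetical helpers, the version numbers are made up, and "fixed" here means the simplest locked scheme where every package shares one version line. (In Lerna itself, independent mode is selected with `"version": "independent"` in lerna.json.)

```javascript
// Increment the major component of a "major.minor.patch" version string.
function bumpMajor(version) {
  const [major] = version.split('.').map(Number);
  return `${major + 1}.0.0`;
}

// Hypothetical release planner, not Lerna's implementation.
// packages: { name: version }, changed: names that contain a breaking change.
function planRelease(packages, changed, mode) {
  if (changed.length === 0) return { ...packages };
  const next = {};
  if (mode === 'fixed') {
    // Locked versioning: all packages share one version line, so a breaking
    // change anywhere moves every package to the next major version.
    const highestMajor = Math.max(
      ...Object.values(packages).map((v) => Number(v.split('.')[0]))
    );
    for (const name of Object.keys(packages)) {
      next[name] = `${highestMajor + 1}.0.0`;
    }
  } else {
    // Independent versioning: only the changed packages are bumped.
    for (const [name, version] of Object.entries(packages)) {
      next[name] = changed.includes(name) ? bumpMajor(version) : version;
    }
  }
  return next;
}

// Made-up starting versions for three of the auth modules.
const authPackages = {
  'feathers-authentication': '1.0.0',
  'feathers-authentication-jwt': '1.0.0',
  'feathers-authentication-local': '1.0.0'
};
const breakingIn = ['feathers-authentication-jwt'];

const fixedPlan = planRelease(authPackages, breakingIn, 'fixed');
const independentPlan = planRelease(authPackages, breakingIn, 'independent');
console.log('fixed:', fixedPlan);
console.log('independent:', independentPlan);
```

In the locked plan all three modules get a new major version even though only one changed, which is the "so many breaking changes" effect described above; in the independent plan only the changed module moves.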

The benefits of a monorepo to us would be:

Easier for devs to see pieces we maintain and what is involved in Feathers
Easier for us to work on and publish related modules
Less work to ensure that the tests and test coverage are added when adding a new "repo"
Less work for us to manage permissions for publishing, etc. among team members
Less repos to keep up to date (ie. docs, issue tags, etc)
Easier for more people to report issues and for us to manage and prioritize
Major releases would better coincide with release posts

The downsides:

Outside contributors have to be granted access to the entire repo instead of just the modules they care about. This is going to make it harder for people to contribute on new stuff and slow down iteration on new plugins because we'll need to scrutinize releases/PRs more heavily.
We have to change our tooling to accommodate running tests faster
We may have to change how we do releases
We might not be able to release as quick (sounds like we will but this is a risk right now)
A repo with less code is more approachable for new devs instead of getting bogged down in the entire ecosystem
Code Climate and test coverage score would be brought down every time a new module is in flux and we would likely have a lot of fluctuation (basically making the score useless).
We will end up with a lot more merge conflicts and rebasing.

Other considerations beyond @daffl's questions:

Where do examples live now? Each module has its own example, do we move them to their own repo to reduce size? Move them top level or leave them where they are?
How do the .travis.yml files look now? We have to spin up different databases in order to run tests and setting up all of them in one build would slow down build times.
How does this impact front-end builds and front-end build sizes?

> A bonus would be that we could create some real integration tests

^ @daffl we can do that now. We could do that in another repo 😉. I'm already doing integration tests in feathers-authentication and feathers-authentication-client.

I'm hesitant to introduce a new tool right now (since we just made a bunch of tooling changes) but let's definitely evaluate it for the Buzzard release in the new year. That said, I'm also wondering "Is this the most important thing to work on?". I think ecosystem discovery can be better improved by documentation (which we started) and tooling can be added to make things more efficient on our side around updating repos. @corymsmith already made some scripts to do this.

I want to make sure we know how much of a negative impact this is going to have so let's do a time boxed trial run. 1 person on @feathersjs/core-team take 1 day to try and convert to Lerna and see how it works and report back on all the concerns/risks in here. Any volunteers? If it doesn't happen in the next couple months I'm going to assume that there are higher priority issues.

daffl commented 7 years ago

I didn't see @ekryski's reply but those are all great points. Another one I just thought of is Greenkeeper. I did some quick research and I don't think it can send PR's against repositories with multiple package.json files in it. Seeing as how helpful it has been so far it would be a show stopper not to have it.

justingreenberg commented 7 years ago

@ekryski these are all good points, especially with respect to cost-benefit

i recently stumbled across https://github.com/knitjs/knit which is comparable to lerna but is built on yarn and seems to address some of these issues:

single root package.json
- shared dependencies remain in sync
- @daffl works with greenkeeper :)
builds would be much faster than lerna which uses npm (both travis and locally)
built-in tasks (publish, release, etc) are exposed, so it looks more flexible and easy to extend

of course knit is not nearly as battle tested and i haven't used it yet, but wanted to provide this reference as an alternative to lerna. i will try it out in my project after the holidays and let you know how it goes!

frank-dspeed commented 7 years ago

@daffl look into renovate, it's greenkeeper's core; it supports monorepos with multiple package.json files out of the box

daffl commented 6 years ago

Looks like Greenkeeper now supports grouping package.json files in subfolders (see https://github.com/greenkeeperio/greenkeeper/issues/139#issuecomment-381932855) so this might be worth revisiting.

Another thing to have a look at would be to contact GitHub support to see if we can set up redirects from the old modules to the existing monorepo repository.

petermikitsh commented 6 years ago

This comment by @DesignByOnyx highlights an important limitation of the distributed repo model. With distributed repos, you might be accepting PR's that could create integration issues in other feathers projects. But it can be difficult for a contributor to be aware of that.

I've seen this exact same problem in other projects with a feathers-like multi-repo setup. You end up making patch releases of one module, just to test it with a different module that depends on it. It's quite a bit of overhead.

By moving to a monorepo model, you can test code changes across all repos together and increase confidence in your PR's.

claustres commented 6 years ago

My two cents on this because we are also studying a mono repo solution on our side for a framework having about seven independent repos right now. In the past we faced the same problems for two other frameworks used in about a dozen projects, which is to say that a perfect solution probably does not exist :-(

First we used the mono repo solution. It was great to ensure integration, we even had a complete sample application with end-to-end testing. The biggest challenge was configuration management, i.e. tags. Whenever you needed to tag a module, e.g. after a patch, you had to tag the whole repo, so either the version numbers in your modules became uncorrelated from the tag numbers in Git or you had to increment all module version numbers at the same time. This might be a solution for a coherent set of modules like FeathersJS core but not for the ecosystem, which needs more flexibility IMHO.

We have also used independent repos and a tooling suite, mostly based on https://source.android.com/setup/develop/repo, to ease management (we also tried other things like submodules, which were not adequate to keep everything in sync). It was easier to work independently on modules so that if a module needed to be tagged nothing else was affected. We also had a complete sample application with end-to-end testing managed as an independent repo. It was easier for a beginner because they could start on a module then jump to the app to integrate. The problem is that you need specific tools, which increases the learning curve. Such tools might not exist at all, e.g. to handle linking a set of modules with yarn/npm.

My current thinking is that we should use tools where they have the best fit. Git is good at version management so we should use it for that, which almost implies keeping separate repos. NPM is just another way to distribute Git artefacts. Indeed, the fact that PRs could create integration issues is not really related to the single or multi repos problem but to the fact that no integration or end-to-end tests exist. Last but not least, I think the future will gradually evolve towards microservices, so maybe we should instead make each module a stand-alone app (at least with some deployment configuration) that could be deployed and integration-tested in a testing platform.

I don't know lerna well enough to say if it can handle some of the issues I mentioned, let me know.

claustres commented 6 years ago

I would add that examples like react and babel do not seem really similar because these are frameworks or tools where you plug everything into the same process at the end, without deployment issues since these are frontend-only things. First, Feathers has a client and server part that makes it a less "integrated" environment. Second, you can create independent services with Feathers that are deployed in a monolithic app or independently in multiple apps (aka microservices).

frank-dspeed commented 6 years ago

@claustres have a look at metarepos, that's what i am going for: i keep individual repos and then a main integration mono repo via git submodules.

frank-dspeed commented 6 years ago

@daffl this can be closed because of the new Feathers versions.

claustres commented 6 years ago

@frank-dspeed It seems interesting, do you only use submodules in "read-only" mode for integration purposes? Indeed in the past I evaluated it as well so that developers could directly work in an integrated environment while being able to commit their changes to independent modules linked using submodules. However, due to the way git handles submodules (i.e. detached HEAD state by default) it was not really usable; synchronizing submodules back and forth is not so easy.

frank-dspeed commented 6 years ago

@claustres the integration repo holds automation and releasing scripts called via git hooks

daffl commented 6 years ago

Started spiking out the move to Lerna v3 and I think it's the way to go:

Automatic linking makes cross-repo development much more efficient
Lower barrier of entry for contributors
Ability to run integration tests and code coverage and quality reports over all modules in the @feathersjs namespace
Big timesaver only having to manage a single repository on third party service like Travis CI and Codeclimate
Easier way to write scripts to make sure that all modules have the same structure

Some notes around tooling:

Greenkeeper now supports monorepos
The changelog generator has to be changed to https://github.com/lerna/lerna-changelog

Repositories to import:

[x] @feathersjs/feathers
[x] @feathersjs/commons
[x] @feathersjs/errors
[x] @feathersjs/express
[x] @feathersjs/transport-commons
[x] @feathersjs/socketio
[x] @feathersjs/primus
[x] @feathersjs/rest-client
[x] @feathersjs/socketio-client
[x] @feathersjs/primus-client
[x] @feathersjs/configuration
[x] @feathersjs/cli
[x] generator-feathers
[x] generator-feathers-plugin
[x] @feathersjs/authentication
[x] @feathersjs/authentication-local
[x] @feathersjs/authentication-jwt
[x] @feathersjs/authentication-oauth1
[x] @feathersjs/authentication-oauth2
[x] @feathersjs/authentication-client

claustres commented 6 years ago

I am pretty sure that writing down some reporting here on how things go (weaknesses, advantages, ...) would be really valuable as a real-world example of a migration from multirepos to monorepo on a "large" and popular framework.

daffl commented 6 years ago

This has now been completed and all Feathers core modules have been released from the new monorepository at. I wrote up some of the advantages in the FeathersJS summer summary.

In general and also thanks to the help of @bertho-zero the transition has been pretty smooth and the latest Lerna made importing the projects very straightforward. The only thing I haven't figured out yet is if Lerna can figure out dependencies between projects and update the version numbers before publishing (the issue might be some circular dependencies which will be fixed during the implementation of the new version of the authentication plugin). There are also some additional small improvements that should be made for running all tests but besides that, I am so far quite happy with this change and its advantages.
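To illustrate why circular dependencies get in the way of "figure out dependencies and update versions before publishing", here is a minimal depth-first topological sort that either returns an order in which each package comes after its local dependencies or reports the cycle it ran into. It is a generic sketch, not Lerna's algorithm; `publishOrder` is a hypothetical helper and the dependency edges in both example graphs are assumptions made up for illustration, not taken from the real package.json files.

```javascript
// Hypothetical sketch, not Lerna's algorithm: order packages so that each one
// comes after its local dependencies, and report a cycle if one exists.
// graph: { packageName: [names of local packages it depends on] }
function publishOrder(graph) {
  const order = [];
  const state = new Map(); // name -> 'visiting' | 'done'

  function visit(name, path) {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      // We came back to a package that is still on the current path.
      const cycle = [...path.slice(path.indexOf(name)), name];
      throw new Error(`Circular dependency: ${cycle.join(' -> ')}`);
    }
    state.set(name, 'visiting');
    for (const dep of graph[name] || []) {
      visit(dep, [...path, name]);
    }
    state.set(name, 'done');
    order.push(name);
  }

  for (const name of Object.keys(graph)) visit(name, []);
  return order;
}

// Illustrative edges only (assumed, not read from the real manifests).
const acyclicGraph = {
  '@feathersjs/express': ['@feathersjs/feathers', '@feathersjs/errors'],
  '@feathersjs/feathers': ['@feathersjs/commons'],
  '@feathersjs/errors': [],
  '@feathersjs/commons': []
};
const acyclicOrder = publishOrder(acyclicGraph);
console.log(acyclicOrder.join('\n'));

// A made-up cycle between two auth modules: no valid order exists.
const cyclicGraph = {
  '@feathersjs/authentication': ['@feathersjs/authentication-client'],
  '@feathersjs/authentication-client': ['@feathersjs/authentication']
};
let cycleMessage = null;
try {
  publishOrder(cyclicGraph);
} catch (error) {
  cycleMessage = error.message;
}
console.log(cycleMessage);
```

Once two packages depend on each other there is no order in which both can see the other's new version first, so the cycle has to be broken (for example by turning one side into a devDependency or peer dependency) before an automated version update can be computed.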

lock[bot] commented 5 years ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue with a link to this issue for related bugs.

feathersjs / feathers

RFC: Use Lerna with monorepo for core packages #462
