strongloop / loopback

LoopBack makes it easy to build modern applications that require complex integrations.
http://loopback.io

LoopBack Next #1380

Closed ritch closed 7 years ago

ritch commented 9 years ago
  1. As a LoopBack user, I want a stable release that I can use in production and that will keep receiving (important) bug fixes, ideally for the whole lifetime of my application. (I.e. an LTS release.)
  2. As a LoopBack user developing a new application, I want to start with a recent LoopBack version that has all the new bells and whistles, but which is at the same time stable enough for day-to-day use and which will reasonably soon mature into a stable release (ideally LTS).
  3. As a LoopBack contributor submitting a bug fix, I'd like to get the bug fixed quickly in all supported versions, so that I can upgrade my production servers with the fixed version.
  4. As a LoopBack contributor submitting a backwards-compatible feature, I'd like to get my contribution released in a stable-ish version which I can use in my day-to-day development and which will mature into a stable release before I deploy the app to production.
  5. As a LoopBack contributor submitting a breaking change, I'd like to get my change landed reasonably soon. I don't want to wait one year before a next major version is scheduled.
  6. As a LoopBack maintainer, I need to ensure that both stable (see 1) and stable-ish (see 2) branches meet our quality standards, so that the number of bugs is as low as feasible and the codebase allows us to quickly build new stuff without being slowed down by sloppy code, with confidence that the changes won't introduce regressions.
  7. Everybody wants predictable release dates and predictable quality.
  8. As an advanced user of LoopBack, I'd like to try out new or experimental features without much effort so I can take advantage of new improvements to LoopBack in existing applications.
altsang commented 9 years ago

@ritch I think the right ideas are here, but I wasn't envisioning this to be a whole separate repo, just another branch in the current loopback repo

bajtos commented 9 years ago

I'd like to hear the opinions of @piscisaureus and @bnoordhuis too; they have quite some experience with this stuff from working on the Node.js core.

bnoordhuis commented 9 years ago

A single big repository is a good idea. Splitting everything into little standalone repositories is only appealing until you start making big changes, then it becomes a total pain.

@ritch I would think about a strategy of how to do releases and how (or if) to back-port changes. Is -next going to be an eternal dev version? Do you branch off a stable branch every now and then?

ritch commented 9 years ago

@ritch I think the right ideas are here, but I wasn't envisioning this to be a whole separate repo, just another branch in the current loopback repo

I don't think that is going to do all that we really need. Though it would make it easier to pull in the latest.

I would think about a strategy of how to do releases

I would like to follow our current approach with a few minor tweaks:

how (or if) to back-port changes

I guess I didn't think about making this the dev repo but that is an interesting idea. This would mean we would only contribute bug fixes to the strongloop/loopback repo and the obvious way to do this would be by back porting them when they are fixed in loopback-next.

That idea would really need buy in from @bajtos @raymondfeng and @fabien since it would mean that they should start making all contributions to this new repo and help back port patches (which will be somewhat painful, but worth it IMO).

ritch commented 9 years ago

@bajtos I'm interested in your take on the review process for a fast moving repo. Could we get away with no review at all? No. But what can we do to ensure progress is made very quickly. Perhaps only ensure features are not flawed (should be discussed before PR) and accidental bugs are not introduced as well as automated checks are passed (including linting, but perhaps adding code coverage for tests?).

altsang commented 9 years ago

what I had envisioned was that this wouldn't be the next version of LoopBack - that it would be an experimental branch where select(?) features are integrated and tested out by users who want the latest and greatest and then we would integrate those features back to the stable branch in a series of release candidates or milestones release and that would eventually become the next version of LB stable

altsang commented 9 years ago

@bajtos I'm interested in your take on the review process for a fast moving repo. Could we get away with no review at all? No. But what can we do to ensure progress is made very quickly. Perhaps only ensure features are not flawed (should be discussed before PR) and accidental bugs are not introduced as well as automated checks are passed (including linting, but perhaps adding code coverage for tests?).

BTW we have that user story assigned to @raymondfeng to come up with minimum acceptance requirements for both stable and fast moving branches, @bajtos can contribute to that effort once he has something drafted

ritch commented 9 years ago

what I had envisioned was that this wouldn't be the next version of LoopBack - that it would be an experimental branch where select(?) features are integrated and tested out by users who want the latest and greatest and then we would integrate those features back to the stable branch in a series of release candidates or milestones release and that would eventually become the next version of LB stable

what I had envisioned was that this wouldn't be the next version of LoopBack

I think I know what you mean here, but keep in mind no matter what it would be a new version in some way. That means we have to adhere to semantic versioning. Semantic versioning dictates that a breaking change cannot be made within the same major version. This is why our master branch is slow moving and why it is difficult (or impossible) to add experimental features. This is compounded by the fact that this same problem exists for the dependencies (juggler, remoting, boot, etc).

Perhaps we should first focus on the goals we need to meet and then we can collectively determine the best way to meet them. @altsang from your comment I can gather a few (feel free to add more):

rmg commented 9 years ago

A single big repository is a good idea. Splitting everything into little standalone repositories is only appealing until you start making big changes, then it becomes a total pain.

I'm going to have to disagree with @bnoordhuis here. I agree that it can be painful when needing to make large changes across multiple repos (I was there!), but I think dealing with it by merging multiple modules into a giant repo is shooting yourself in the foot in the long run. You lose sensitivity to the kind of coupling that was the real source of the pain in the first place, and I don't think it is reasonable to rely on discipline to keep that sort of problem in check.

altsang commented 9 years ago

to be clear I wasn't thinking of one single big repository consisting of...

and the others, just a branch in each with the same name

I think @ritch has the right idea to focus on goals. I'll think and add some per your comment. Based on the discussion and difference of opinion I would say one of them would be ...

"As a LoopBack contributor, I would like an easy way to integrate my changes to the "fast moving" version of LoopBack so that I know that my contributions can be tried out without affecting the stable release."

ritch commented 9 years ago

@altsang added your story to the issue

bajtos commented 9 years ago

I'll post a longer reply hopefully tomorrow. For now:

Perhaps we should first focus on the goals we need to meet and then we can collectively determine the best way to meet them.

:+1: :heavy_multiplication_x: :100:

I feel the proposal in the issue description and a large part of the subsequent discussion is focused on WHAT and HOW we are going to change, but I am missing WHY we are making these changes in the first place. What problems (user stories from the POV of a LoopBack contributor, a LoopBack maintainer, a LoopBack user) are we trying to solve/address here?

bajtos commented 9 years ago

BTW let's add @clarkorz and @STRML to the discussion too.

bajtos commented 9 years ago

First of all, thanks @ritch for starting this discussion, I think it's very valuable.

Secondly, the long text below expresses my opinions and feelings, which are often subjective and may even be wrong sometimes. Please treat it as a source of ideas (and concerns) for further discussion, not as something set in stone.


Here are a few stories that feel important to me:

  1. As a LoopBack user, I want a stable release branch that I can use in production and that will keep receiving (important) bug fixes, ideally for the whole lifetime of my application. (I.e. an LTS release.)
  2. As a LoopBack user developing a new application, I want to start with a recent LoopBack version that has all the new bells and whistles, but which is at the same time stable enough for day-to-day use and which will reasonably soon mature into a stable release (ideally LTS).
  3. As a LoopBack contributor submitting a bug fix, I'd like to get the bug fixed quickly in all supported versions, so that I can upgrade my production servers with the fixed version.
  4. As a LoopBack contributor submitting a backwards-compatible feature, I'd like to get my contribution released in a stable-ish version which I can use in my day-to-day development and which will mature into a stable release before I deploy the app to production.
  5. As a LoopBack contributor submitting a breaking change, I'd like to get my change landed reasonably soon. I don't want to wait one year before a next major version is scheduled.
  6. As a LoopBack maintainer, I need to ensure that both stable (see 1) and stable-ish (see 2) branches meet our quality standards, so that the number of bugs is as low as feasible and the codebase allows us to quickly build new stuff without being slowed down by sloppy code, with confidence that the changes won't introduce regressions.
  7. Everybody wants predictable release dates and predictable quality.

Let me illustrate the last point on a couple of counter-examples from Node core:


Honestly, I am not a fan of long-living branches, regardless of whether we are talking about feature branches or fast-moving vs. slow-moving branch. In my experience, they usually create a huge debt by making things easier in the short-term at the expense of pushing a lot of work to the future.

Especially the notion of lessening our minimum acceptance requirements for a fast moving branch is very worrying to me. I am concerned that such setup may cause the following consequences:

As a result, we will probably end up in a Node v0.10 and/or Node v0.12 scenario described earlier.

A side note: I feel the idea of two branches moves us towards infrequent big releases. As I see it, big releases are very difficult to get right and that's one of the reasons why many projects seem to move towards smaller and more frequent releases.


Let me describe a process that I personally consider better than stable/fast-moving branch.

Start by defining three version (release, branch) types:

All new work is made on the master branch. Minor and patch versions are released frequently (we are already doing this now).

Experimental and backward-incompatible changes are implemented behind a feature flag that is disabled by default. These changes are marked as experimental in the documentation and may change at any time (a breaking change does not bump up the major version). This can work similarly to V8 flags like "--harmony".
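As a rough illustration of the feature-flag idea, here is a minimal sketch. Everything in it is hypothetical: the flag name `newQueryEngine`, the helpers `enableExperimental` and `isEnabled`, and the `find` function are made up for this example and are not LoopBack APIs. It only shows the shape of the approach, where the experimental code path stays dormant unless the user opts in.

```javascript
'use strict';

// Hypothetical flag registry (not a LoopBack API).
// Every experimental feature starts out disabled.
var experimentalFlags = {
  newQueryEngine: false
};

// Opt in to an experimental feature by name.
function enableExperimental(name) {
  if (!Object.prototype.hasOwnProperty.call(experimentalFlags, name)) {
    throw new Error('Unknown experimental flag: ' + name);
  }
  experimentalFlags[name] = true;
}

// True only for known flags that were explicitly enabled.
function isEnabled(name) {
  return experimentalFlags[name] === true;
}

// Hypothetical operation with a stable and an experimental code path.
function find(filter) {
  if (isEnabled('newQueryEngine')) {
    // Experimental behaviour: may change without a major version bump.
    return { engine: 'experimental', filter: filter };
  }
  // Default behaviour: what users get when they do nothing.
  return { engine: 'stable', filter: filter };
}
```

With this shape, users who never call `enableExperimental` keep the stable behaviour, which is the property the proposal relies on.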

Features we would like to remove are marked as deprecated first.

There is only one set of minimum acceptance criteria. If we feel a pull request is valuable even though it does not meet the criteria, it's up to the maintainer to clean up the patch to meet them. This way we are not introducing technical debt.

We can make one exception from this rule. A new feature marked as experimental and added behind a feature flag may be landed even if it lacks test coverage and does not handle correctly all edge cases, as long as the users not using the feature are not negatively affected.

Major releases

At a short-ish regular interval (e.g. every three months), we take an opportunity to make a major release. The work of preparing a major release should be pretty straightforward and easy to plan:

Once this is done, we will make an RC pre-release (e.g. 3.0.0-rc.1) to allow users to test the upcoming version. Then we will set a deadline (e.g. two weeks after the last RC) and if there are no major bugs reported in that time, we will promote the RC to a new major release.

LTS releases

At a regular interval (e.g. once or twice a year), decide which major release will become a new LTS release.

An example

Let's say we will mark 2.x as the current LTS release. We have a bunch of small cleanup stories in GitHub, thus we can set up 1-3 weeks of time to make the changes and release 3.x from the master. For the sake of this example, say that 3.x will not become a new LTS.

As time goes, we keep adding new features to the 3.x version range (living in GitHub master branch) and backporting all (important) bugfixes to the 2.x version range (living in GitHub branch 2.x).

In August or September, we have another opportunity to release a major version. We start by reviewing the list of feature flags we have and decide whether we have any material for a new major version. Let's say we do have changes that need a new major version and we release 4.x.

At that same time, we have more feedback about the 3.x version and can decide that 3.x is stable enough to become the new LTS. The result: 2.x is no longer supported, 3.x is the new LTS, 4.x is the new "unstable" branch.

We can also decide that 2.x should stay as LTS for some more time. In that case 2.x is still supported as LTS, 3.x is supported too (because there is no LTS providing all the features of 3.x) and 4.x is the new master branch.

In other words, when we decide to backport a bugfix, it will be backported to all major versions back to the latest LTS.
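The backporting rule can be sketched as a small function. This is only an illustration under simplifying assumptions: major versions are consecutive integers, the fix lands on the current major (master) first, and `backportTargets` is a hypothetical helper name, not part of any tooling mentioned in this thread.

```javascript
'use strict';

// Given the major version living on master and the latest LTS major,
// list the release lines that should receive a backport of a fix
// that has already landed on master.
// Assumes consecutive integer majors (a simplification).
function backportTargets(currentMajor, latestLtsMajor) {
  var targets = [];
  for (var major = currentMajor - 1; major >= latestLtsMajor; major--) {
    targets.push(major + '.x');
  }
  return targets;
}
```

In the example above, with 4.x on master and 2.x kept as LTS, `backportTargets(4, 2)` lists both `3.x` and `2.x`; once 3.x is promoted to LTS, `backportTargets(4, 3)` lists only `3.x`.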

Multi-repository changes

Now how to make changes that affect multiple repositories?

If it's a radical change, then it may be worth implementing it in a different GH repository that would contain all affected projects, as was described in this issue. However, this new repository should be short lived; it should be just a bigger version of a feature branch we use for regular development.

Once the change is worked out enough to be merged to the main repositories, the contributor will create regular feature branches and pull requests in the appropriate repositories. When these pull requests are landed, the "feature repository" can be deleted.

How to install a "feature repository" instead of regular npmjs version? One of the many solutions is to use github tags together with module-relative paths:

// package.json
{
  "dependencies": {
    "loopback-next": "strongloop/loopback-next#v0.2"
  }
}

// source code
var loopback = require('loopback-next/loopback');
var juggler = require('loopback-next/loopback-datasource-juggler');
var remoting = require('loopback-next/strong-remoting');
var boot = require('loopback-next/loopback-boot');

Having written all of this, sometimes it's better to pick an imperfect solution quickly rather than hold endless discussions. If the rest of the team believes that loopback-next or some other form of two-branch setup is a better solution than what I proposed here, then I am open to trying it out. At the end of the day, we can always revisit this decision later, when we have more experience with the real consequences and a better understanding of what works and what does not.

sam-github commented 9 years ago

Multiple branches seem superior to multiple repositories to me, particularly mega-merged ones. Having the branches in one repo makes it easier to cherrypick back and forth between them.

Ideally, if changes were considered useful, they could be published to npmjs.org, with a bumped major so that they don't get installed when loopback@2 is installed. And majors are cheap, if the change was later considered a bad idea... it can be reverted/fixed up, and the module can be published again with a new major.
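To make the "bumped major" point concrete, here is a simplified sketch of how a request for major version 2 picks a release. It is not npm's actual resolver (which also handles pre-release tags, ranges and dist-tags); `resolveMajor` is a hypothetical helper and the version numbers are invented for the example.

```javascript
'use strict';

// Parse a plain "major.minor.patch" string (no pre-release tags).
function parse(version) {
  var parts = version.split('.').map(Number);
  return { major: parts[0], minor: parts[1], patch: parts[2] };
}

// Numeric comparison, so that 2.10.1 sorts after 2.8.0.
function compare(a, b) {
  return a.major - b.major || a.minor - b.minor || a.patch - b.patch;
}

// Pick the highest published version within the requested major,
// roughly what installing "some-module@2" is expected to select.
function resolveMajor(published, major) {
  var candidates = published
    .map(parse)
    .filter(function(v) { return v.major === major; })
    .sort(compare);
  if (candidates.length === 0) return null;
  var best = candidates[candidates.length - 1];
  return [best.major, best.minor, best.patch].join('.');
}

// Invented list of published versions, for illustration only.
var published = ['2.8.0', '2.10.1', '3.0.0', '4.0.0'];
```

Here `resolveMajor(published, 2)` stays within the 2.x line no matter how many newer majors are published, which is why bumping the major for an experimental change does not disturb users pinned to the stable major.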

If it's really "playground", and you really do not want to commit to using semver, then it might be better to have a "next" branch that is continuously published to a non-npmjs.org registry, basically what we do with our CI registry. The understanding would be that breaking changes can occur in the next registry with no change in semver, and anybody using it wouldn't complain.

How many breaking changes are actually proposed?

altsang commented 9 years ago

@bajtos did a better job than I could've stating the "goals" in user story form - :clap:

please note that in my original thinking - in no way, shape or form was the fast moving, experimental branch ever going to be hardened into a stable release as the "next" branch. I like @sam-github's idea of publishing to a non-npmjs.org registry. My thoughts were that features or changes that we didn't have a large use case around would be vetted out there first, trialed by fire and usage (or the lack thereof) and integrated back to the stable branch by maintainer reviews

piscisaureus commented 9 years ago

In $0.02 fashion -

Working with multiple repos is really painful. If certain modules are tightly coupled we should just bundle them and keep them in one git repository.

The websocket/mesh stuff I worked on last sprint spanned four modules. It was really painful and I think more than half of my time spent was "getting it to work in the first place" with the different master branches.

bnoordhuis commented 9 years ago

There was an article on HN earlier today that gives a good overview why a big repo > small repos: http://danluu.com/monorepo/

High-level reasons:

bnoordhuis commented 9 years ago

And a case in point: the concurix tracing stuff is spread out over at least 15 repos. Trying to make changes, or even just trying to follow the flow, is a complete headache.

piscisaureus commented 9 years ago

Ben and yours truly are really in the "just put it all in one repo" camp; others seem to really like this "modules" thing. I wonder where the difference comes from - maybe @rmg @sam-github @bajtos have found a workflow that removes all the pain.

But here's my workflow, in pseudo-code - suppose I am asked to fix some windows issue in some project.

  result = look_at_github_repos_to_figure_out_the_bug_might_be();
  if (result.itsReallyObvious) {
    tell_sam_how_to_fix_it();
    return beer();
  }

  var modules_to_edit = result.github_repos_to_look_at.map(function(v) { return v.replace(/.*strongloop\//, ''); });
  modules_to_edit.push('strongloop');  // add strongloop/strongloop

$ mkdir work && chdir work

  /* Git clone all the modules that are identified as potentially causing the
   * issue.
   */
  for ($MODULE in modules_to_edit) {
$   cd ~/work
$   git clone $MODULE

    /* Install the module's dependencies */
$   cd ~/work/node_modules/$MODULE
$   npm install

    /* Since the modules that were checked out from git might have a dependency
     * on one another, npm might have installed additional copies of modules
     * that we are going to edit.
     * Therefore, delete all of the local copies in node_modules so the ones
     * that were checked out with git are used.
     */
$   cd ~/work/node_modules/$MODULE/node_modules
    for ($MODULE2 in modules_to_edit) {
      try {
$       rm -rf $MODULE2
      } catch (err) {}
    }   
  }

$ cd ~/work/node_modules/strongloop

  /* It's possible that the unusual mix of "released versions" from NPM and
   * master branches from github won't work together. Therefore, call in higher
   * powers to avoid greater injustice.
   */
  sacrifice_a_goat()

$ npm test

  var fixed = false;
  do {
    /* Hopefully my initial guess at this point was correct and the bug isn't in
     * a module that I didn't check out with git. If not, make_edits throws and
     * I have to start all over.
     */
    fixed = make_edits();
  } while (!fixed);
rmg commented 9 years ago

@piscisaureus I've found combined repos to be painful for everything except development setup. Release management, CI, repo management, and all the processes, tooling, and automation around those things all become a lot more complicated and in a way that doesn't scale.

There is a stage in a module's lifecycle where it is one big monolithic code base with a bunch of baby modules inside it waiting to grow up and move to their own repos. Moving day is when the pains I described above become greater than 0 (eg. you want/need to publish one of the submodules separately).

sam-github commented 9 years ago

Sorry, I'm with @rmg here. When things are _closely_ coupled, having it in one repo can be helpful. nodefly was such an example. No piece was standalone, the collector, gateway, etc, could ONLY be used together. Putting them in one repo was the right thing to do, and helped us a lot. But at this point, the fact that strong-agent is nestled inside a large code-base that hasn't been touched in coming up to half a year is becoming odd.

When you are implementing one particular feature, and you cross module boundaries, it gives you a false view. It appears like everything would have been simpler bundled together. But if you were trying to make a small change that only affects one repo, say adding a behaviour to slc run, having to wade through a repo containing every line of the nodeops stack would not be great.

strongops, for example, takes a bunch of custom tooling to npm link all its pieces together, one more piece of complexity someone new to the codebase has to deal with. And Tetsuo even got side-tracked into cloning strong-agent, assuming it was actually the strong-agent git repo... when it's not.

strong-pm, FYI, is about to be chopped into more repos, because it's too large and unwieldy. In particular, lib/driver is going to be moved into another repo, because we need it for the executor, and lib/auth.js, because we need it for central.

I understand why it was confusing to figure out where the code was, but when lib/runner.js got chopped out of strong-pm.git, and put into strong-runner.git, all its test code got pulled out, it ran faster, was easier to find (instead of being mixed in with the test code for all the rest of strong-pm), and also made the strong-pm tests run faster (they are still slow, but at least all the runner tests are now run in a different repo). It made working on the runner easier, and it made working on pm easier, even though it made it harder to show Bert where all the code was. I've no regrets, yet. :-)

It's not all roses, it means cloning more repos, it means linking more repos (though you only need to link a repo if you intend to CHANGE it). Whether the code is in one repo, or across several, it's always nice to have a roadmap, but we don't have that. That's too bad.

Also, going back to LoopBack. Using separate modules for distinct components allows them to be independently semvered, and to have their majors get bumped so they can be iterated on quickly without affecting the stable major release.

rmg commented 9 years ago

After reading the article @bnoordhuis referenced I think my opinions require context:

I'm responsible for the tooling and automation for well over 100 modules and repos, and a disproportionate amount of effort went into supporting one of those being a monorepo because it became a snowflake, and to this day that monorepo is a second class citizen with only partial tooling support.

altsang commented 9 years ago

if things were all in one monolithic repo, I presume the structure would be a folder for every repo under root? so if I changed something in /loopback-boot/... how hard or easy would it be to cherry pick that change and apply it to strongloop/loopback-boot if and when that feature was well received and we wanted to apply it to the next version of LB?

sam-github commented 9 years ago

I just read the monorepo blog, too, and I don't disagree with it for codebases composed of an amalgam of python, c++, go, etc. (or even the python+lua+C+java codebase I had at wurldtech, which we merged into a single repo for ease of use from its former multiple repos), but it doesn't account for node, in particular, not for npm.

you need to have some way of specifying and versioning dependencies between them. That sounds like it ought to be straightforward, but in practice, most solutions are cumbersome and involve a lot of overhead.

npm and package.json

This sort of Lego-like development process does not happen as cleanly in the open source world.

node is an exception

Refactoring an API

npmjs.org + semver

My experience of working with npm is that while it can be maddeningly rigid and baroque at times, working against npm is, on the whole, worse than working with its strengths and accepting its weaknesses.

bnoordhuis commented 9 years ago

if things were all in one monolithic repo, I presume the structure would be a folder for every repo under root?

Yes.

so if I changed something in /loopback-boot/... how hard or easy would it be to cherry pick that change and apply it to strongloop/loopback-boot if and when that feature was well received and we wanted to apply it to the next version of LB?

Not hard. git lets you add and strip directories from file paths in patches or you can git subtree it.

Jeff-Lewis commented 8 years ago

JM2C - I think you guys are doing an excellent job managing the multiple repos of StrongLoop and I see it as a great example of how to divide and conquer using node and npm. Don't forget guys, StrongLoop's product offering and feature sets cover a lot, so much that I can only compare it to other large enterprise software companies that have been around a lot longer. You guys have made incredible progress in a short amount of time partly because of the insightful separation of concerns in repo organization.

It's only natural with the size and scope of StrongLoop's software to now find it a little painful to make cross-repo changes, more rapid releases, etc. I don't see how a monolithic repo will help with the design of changes and, more importantly, the overall architecture of the platform. In the end, it's a team of developers needing to work together to make those changes regardless of the folders the code is in. As developers, we like to find solutions with our keyboards, but the solution here might be with the phone/webcam.

Don't let the 'grass is greener' steer you away from the road you guys have been on. We only need to derive some new developer workflows for handling such a large code base across many repos.