nodejs / CTC

Node.js Core Technical Committee & Collaborators

Enforce citgm for major changes #126

Closed MylesBorins closed 6 years ago

MylesBorins commented 7 years ago

We should do this.

At the very least, it will document in the thread that we care about these breakages.

jasnell commented 7 years ago

I'm fine with this. Would need a PR adding the requirement to the collaborator docs. If it's not there already, we should make it a required step for all releases also... perhaps even listing the failures in the release notes as known issues.

addaleax commented 7 years ago

Sounds good. Two thoughts:

jasnell commented 7 years ago

It makes a ton of sense to exclude certain types of semver-majors from this rule:

  1. error message changes
  2. semver-major changes to Node.js' own tooling, build, test framework or benchmarks

Generally, only semver-major changes in src/ and lib/ would be subject to the rule.

sam-github commented 7 years ago

Don't we want to know when an error message change breaks a package in npmjs.org?

jasnell commented 7 years ago

Sure, but those do not necessarily need to be identified before landing the PR... we're also not likely to use such a failure to determine whether or not to land such PRs. If we're changing the error message, we're likely doing so for a very good reason.

cjihrig commented 7 years ago

SGTM. Like @addaleax, I'd like easier-to-understand CITGM output.

What happens when a breaking change actually breaks something in CITGM? Would we just never break anything ever again? Is that something we'd decide on a case by case basis?

jasnell commented 7 years ago

We would need to decide on a case-by-case basis. CITGM is not a gate, it's an early warning system. It helps us understand what potential impact a change may have, but it should never stop us from making any particular change.

MylesBorins commented 7 years ago

FWIW the citgm team is fairly active in keeping the lookup table up to date regarding flakes. I'd be more than happy to teach people how to read the leaves, but in general, failures that are not timeouts are legit failures.

jasnell commented 7 years ago

Oh, I don't doubt that at all @MylesBorins. I just don't believe that failures we happen to see in citgm should block changes that we feel should be made. For example, we will never get out of the cycle of treating error message changes as semver-major unless we continue working towards assigning error codes and making the error messages more consistent -- that work could very well break existing code but the benefits far outweigh the potential disruption to existing modules.

mhdawson commented 7 years ago

I think we need to get CITGM to the point where it normally just runs/passes (at least one variant that we require for semver-majors). One approach may be to fix the module versions.

Provided that the jobs run/pass reliably and take less than 20 minutes, I think requiring it for all semver-majors should not be too much to ask. The issue of whether it blocks landing the PR or not can be handled on a case-by-case basis. Requiring the run would mean that we know where we stand earlier on, as opposed to having to figure that out at release time.

MylesBorins commented 7 years ago

Locking the versions means we don't test the module that people get when they run npm install

I cannot see any way to scalably update the lookup, especially as we add more modules. The majority of failures we see are platform-related flakiness, much of which can be specific to our infra.

Seeing green citgm because we lock modules is not an accurate representation of the ecosystem

mcollina commented 7 years ago

Currently there are a bunch of failures in citgm against master. I guess this is to be expected, as we do land breaking changes in master. How can we hold changes to an all-green run? How do I know my change did not break things further?

I have a very specific case for this, my stream.destroy() PR: https://github.com/nodejs/node/pull/12925.

It would be good to have the "timeouts" reported differently than actual failures.

MylesBorins commented 7 years ago

@mcollina unfortunately there is no way we can do introspection into the reason for a module's failure, so no way to know if failures are caused by timeouts. I would argue that master would not have a bunch of failures if we gated against citgm...

mhdawson commented 7 years ago

It's a bit of the chicken and egg in terms of:

  • we need to gate against citgm to keep citgm green
  • we need citgm to be green to gate on citgm

I think we have to plot a path to getting citgm green consistently, otherwise it's too hard to get the maximum value from it. Once we managed to get the core regression builds passing most of the time, it became much easier to keep them green.

Changes in the modules themselves do add a complication, and if they continuously break citgm independently of changes people are making in core, we'll need a way to decouple from that. We may need both pinned modules for the regression runs for core, and the latest for nightly/releases. If we can keep citgm green on the latest, all the better, and we only need the one. I just think we have to get to the point where anybody can run it and it pretty much passes unless there is a problem with the change they made.

refack commented 7 years ago

my 4¢

  1. I actually think that CITGM should primarily be a gate for semver-minor; they should be breakage-free (unless it's already so, and I just didn't RTFM).

It's a bit of the chicken and egg in terms of:

  • we need to gate against citgm to keep citgm green
  • we need citgm to be green to gate on citgm

  2. If it's possible for the CITGM team to estimate the amount of time to get the CI job green, we could call a moratorium on landing breaking changes for that time window.

addaleax commented 7 years ago

I actually think that CITGM should primarily be a gate for semver-minor; they should be breakage-free (unless it's already so, and I just didn't RTFM).

Everything but semver-major changes should be breakage-free (by definition).

refack commented 7 years ago

Everything but semver-major changes should be breakage-free (by definition).

[asking] Do we test that it's actually breakage-free (i.e. CITGM), or do we deduce it from code reviews and our regression testing?

MylesBorins commented 7 years ago

We test before every release

MylesBorins commented 7 years ago

The TLDR on making CITGM useful is that someone needs to audit the results on a weekly basis and keep the lookup up to date for every release line.

This could potentially be automated... but I'm not sure of a clean / intuitive way to do so
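
To make that concrete: a minimal sketch of one small piece of such automation, assuming the public Jenkins JSON API on ci.nodejs.org and a job named citgm-smoker (the job name, the fields queried, and what to do on failure are assumptions for illustration, not an agreed design):

```js
// Hypothetical audit helper: fetch the last completed run of an assumed
// "citgm-smoker" job via Jenkins' JSON API and report its overall result.
'use strict';
const https = require('https');

const url = 'https://ci.nodejs.org/job/citgm-smoker/lastCompletedBuild/api/json' +
            '?tree=number,result,timestamp';

https.get(url, (res) => {
  let body = '';
  res.on('data', (chunk) => { body += chunk; });
  res.on('end', () => {
    const build = JSON.parse(body);
    const when = new Date(build.timestamp).toISOString();
    if (build.result === 'SUCCESS') {
      console.log(`citgm-smoker #${build.number} is green (${when})`);
    } else {
      // A real job would notify whoever is on the audit rotation here
      // (open an issue, ping IRC, etc.) rather than just logging.
      console.log(`citgm-smoker #${build.number} needs a human look: ` +
                  `${build.result} (${when})`);
    }
  });
}).on('error', (err) => console.error('CI unreachable:', err.message));
```

Run from cron once a day, something like this at least turns "did anyone look at citgm today?" into a yes/no check; it does not solve the harder part of keeping the lookup itself up to date.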

addaleax commented 7 years ago

@nodejs/build We could have a citgm-daily (or citgm-weekly) like node-daily-master, right? That might help.
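
For what a citgm-daily trigger could look like in practice, a sketch of a Jenkins "Build periodically" schedule, assuming standard Jenkins cron syntax (the time of day is illustrative):

```
# citgm-daily: hashed start minute, roughly once a day
H 6 * * *
```

A citgm-weekly variant would just narrow the day-of-week field (e.g. H 6 * * 1 for Mondays only).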

mhdawson commented 7 years ago

@gibfahn or myself can easily setup the required CI jobs. Once we agree on what's needed one of us can volunteer to set them up.

jasnell commented 7 years ago

Setting up the job is the easy part, yes? The real task will be ensuring that there is someone auditing the results. It would be interesting to see if we could have a page similar in nature to benchmarking.nodejs.org but that summarizes the results of the nightly citgm run.

refack commented 7 years ago

I'm volunteering to audit the results, but can't commit to every day. IMHO it should be at least two people... (Hopefully by doing it a couple of times I will figure out how to automate it.)

Trott commented 7 years ago

Setting up the job is the easy part, yes? The real task will be ensuring that there is someone auditing the results. It would be interesting to see if we could have a page similar in nature to benchmarking.nodejs.org but that summarizes the results of the nightly citgm run.

For node-daily-master, I just have it as part of my daily morning routine to check https://ci.nodejs.org/job/node-daily-master/. No special page needed. It might be perfectly usable to check the equivalent page for a daily citgm run.

If @refack is volunteering to stay sufficiently knowledgeable about what a clean citgm run looks like and what a problematic one looks like, then that may be all we need, at minimum, to be effective. (More people would be better of course, but one is all you need.)

gibfahn commented 7 years ago

(More people would be better of course, but one is all you need.)

I think one is most definitely not all you need. You need one person available at any time, which translates to at least five people (in my experience).

refack commented 7 years ago

You need one person available at any time, which translates to at least five people (in my experience).

I'm hoping that will be a temporary sitch, and that we will figure out how to automate it. FWIW we can start with a weekly (or semi-weekly) job.

refack commented 7 years ago

Until we find a better format: https://github.com/nodejs/node/wiki/CITGM-Status

refack commented 7 years ago

And a bit more use of the wiki: last night to today diff

refack commented 7 years ago

Question: should we keep a lookup.json just for the CI, so we could choose a baseline and turn it green? It could be considered like a floating patch, since CITGM's own lookup.json takes a wider scope into consideration and should be more parsimonious.
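
To make the idea concrete, a hypothetical CI-only lookup could look something like this, roughly following the shape of entries in citgm's lookup.json (the module names and flags below are illustrative, not a proposed baseline):

```json
{
  "through2": {
    "npm": true,
    "prefix": "v"
  },
  "some-flaky-module": {
    "npm": true,
    "flaky": true
  },
  "known-broken-module": {
    "npm": true,
    "skip": true
  }
}
```

The CI copy could mark modules flaky or skipped aggressively just to keep the smoker job green, while the upstream lookup.json stays conservative; that is the "floating patch" framing above.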

gibfahn commented 7 years ago

Question: should we keep a lookup.json just for the CI, so we could choose a baseline and turn it green?

Discussed in https://github.com/nodejs/citgm/issues/407

refack commented 7 years ago

Yay my insomnia pays off... We have our first hit https://github.com/nodejs/node/pull/13098#issuecomment-303079737 (well obviously not yay for the breakage, but for this validation process) /cc @aqrln

refack commented 7 years ago

Question: when encountering a regression, should I inform the module owners (i.e. open an issue)? Follow-up: what are the criteria for a verified regression? For example, 2 platforms in 1 run, or 1 platform on 2 runs?

aqrln commented 7 years ago

@refack

when encountering a regression, should I inform the module owners (i.e. open an issue)?

I think that makes sense and is generally nice.

mcollina commented 7 years ago

@refack IMHO before pinging the author we should check whether it was our fault or not. If the original PR was not semver-major and not expected to break that, maybe no; we should try to fix things on our side.

refack commented 7 years ago

If the original PR was not semver-major and not expected to break that, maybe no; we should try to fix things on our side.

It's definitely an open question. When there is a range of suspected commits, the module author is probably the best resource for pinning down the "culprit"... For example, in https://github.com/websockets/ws/issues/1118 there was a range https://github.com/nodejs/node/compare/171a43a98685d5cca6710d2d6bf4d20008de3426...ad4765a32641277a91efb565e2056abaa6b6a70b. Admittedly, in this case it was fairly easy for me to pin it down (modulo insomnia-induced mixup).

MylesBorins commented 7 years ago

Generally I will either ping people on our repo when we add something to flaky, or alternatively send a fix (the latter is preferable).

Trott commented 6 years ago

This issue has been inactive for a while and this repository is now obsolete. I'm going to close this, but feel free to open another issue in a relevant active repository (TSC perhaps?) and include a link back to this issue if this is a subject that should receive continued attention.