facebook / lexical

Lexical is an extensible text editor framework that provides excellent reliability, accessibility and performance.
https://lexical.dev
MIT License
17.61k stars 1.47k forks source link

Only run CI e2e tests on approved PRs #6080

Closed necolas closed 2 weeks ago

necolas commented 3 weeks ago

Avoid running the entire suite of (expensive) e2e tests on every PR that is opened or updated. Only run if PR is approved.

Reference: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#running-a-workflow-when-a-pull-request-is-approved

vercel[bot] commented 3 weeks ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
lexical ✅ Ready (Inspect) Visit Preview 💬 Add feedback May 12, 2024 10:18pm
lexical-playground ✅ Ready (Inspect) Visit Preview 💬 Add feedback May 12, 2024 10:18pm
etrepum commented 3 weeks ago

It might be nicer to move the e2e tests to a separate workflow file, and maybe also do a single run of e2e tests in the test.yml workflow and a more comprehensive matrix on approval

necolas commented 3 weeks ago

Yes, I floated some similar ideas as well. At the moment it's assumed that all the e2e tests are being run every time a PR is updated, and every time a merge to main happens. Lexical is way over its CI budget. As a first step, I'd like to validate that a relatively small and quick change can demonstrate reduction in CI costs. If we can do that, I think further organization and optimization sounds great.

etrepum commented 3 weeks ago

I suggested the separate workflow file because it would be a smaller logical change since the approval stuff could be floated out to one place instead of N different places

etrepum commented 3 weeks ago

but maybe my understanding of the schema is a bit off, I haven't tried to do it. looks good either way!

necolas commented 3 weeks ago

Haven't seen this before; cool feature.

image

It looks like some open PRs haven't run the e2e tests. Hard to tell why some open PR have or haven't run all tests.

I suggested the separate workflow file because it would be a smaller logical change since the approval stuff could be floated out to one place instead of N different places

For sure. Same goes for the github.repository_owner == 'facebook' condition

acywatson commented 3 weeks ago

I agree with this in spirit, but in practice it's gonna slow us down significantly to have to get a stamp before we can see the CI run outcome.

Is there a way to make an exception for codeowners or specific users? (i.e., something like: if it's a maintainer, run on submit, otherwise, run on approve.)

acywatson commented 3 weeks ago

It looks like some open PRs haven't run the e2e tests. Hard to tell why some open PR have or haven't run all tests.

I believe this is because of the budget issue

etrepum commented 3 weeks ago

I think what would be budget and velocity friendly would be to run a very limited matrix of e2e before approval (but not in draft), e.g. mac chromium or whatever is cheapest, and then do the full suite later on approval or some other condition that doesn't happen on every commit for every contributor. If there was some parameters around how much these things cost and what the budget are the scope could be widened a bit to cover more combinations (e.g. mac+chromium and mac+chromium+collab). We could even consider ditching some things like legacy events altogether unless that's still important enough to worry about.

Breaking the build merely by contributing too much isn't something I anticipated 😆

necolas commented 3 weeks ago

Is there a way to make an exception for codeowners or specific users?

There is a way to do that. I looked through recent commits and PRs, and it looks like most of them are created by Meta employees or Lexical collaborators - it would limit the budget impact to exclude maintainers.

I think what @etrepum suggested could be a good balance, e.g., you decide one or two e2e suites to run always, and the rest are deferred until PR approval. Later we could try to exclude more PRs (like this one) from running e2e tests at all; that would help allocate more budget to the PRs that benefit most.

acywatson commented 3 weeks ago

I think what @etrepum suggested could be a good balance

Yea, this is a good idea.

necolas commented 3 weeks ago

Once the existing CI workflow is back up and running, I'm happy to keep iterating on feedback until you're happy it doesn't negatively impact development.

github-actions[bot] commented 3 weeks ago

size-limit report 📦

Path Size Loading time (3g) Running time (snapdragon) Total time
packages/lexical/dist/Lexical.js 23.93 KB (0%) 479 ms (0%) 344 ms (+27.46% 🔺) 823 ms
packages/lexical-rich-text/dist/LexicalRichText.js 34.56 KB (0%) 692 ms (0%) 1.2 s (+18.6% 🔺) 1.9 s
packages/lexical-plain-text/dist/LexicalPlainText.js 34.51 KB (0%) 691 ms (0%) 1.1 s (+5.2% 🔺) 1.8 s
etrepum commented 3 weeks ago

looks like the ci-check failure is prettier, it is picky about the formatting of those yml files

necolas commented 3 weeks ago

This diff successfully ran only the windows e2e matrix.

acywatson commented 2 weeks ago

This diff successfully ran only the windows e2e matrix.

I think this PR is correct, but I'm a little confused that you're saying that the windows e2e matrix is running. It looks to me like the macOS matrix is running via e2e-canary.

My understanding of the intent is that we run this smaller batch of tests againt macOS/chromium commit, then run everything else on approval. Is that right?

acywatson commented 2 weeks ago

I think this PR is correct

I may have spoken too soon - if we want the rest of the test suite to run upon approval, we probably need to change this line to trigger the test.yml workflow on approval:

https://github.com/facebook/lexical/blob/83fd6fa2bb48c5efa3c9c96e8c52dfa8faf4ae51/.github/workflows/test.yml#L10

Probably just need to add "submitted" pull_request_review trigger.

https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#running-a-workflow-when-a-pull-request-is-approved

etrepum commented 2 weeks ago

I may have spoken too soon - if we want the rest of the test suite to run upon approval, we probably need to change this line to trigger the test.yml workflow on approval:

https://github.com/facebook/lexical/blob/83fd6fa2bb48c5efa3c9c96e8c52dfa8faf4ae51/.github/workflows/test.yml#L10

Probably just need to add "submitted"

https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#running-a-workflow-when-a-pull-request-is-approved

I think this also means that the cheaper tests and canary will run again when the button is clicked. IMO a separate workflow file to contain the tests we don't want to run unconditionally really does make sense if we're going to keep iterating on this before merge

acywatson commented 2 weeks ago

I think this also means that the cheaper tests and canary will run again when the button is clicked.

Yea, this also occurred to me. I think it's an improvement either way, but your suggestion does seem optimal here (i.e., just run the stuff we didn't run and don't re-run stuff that already passed).

necolas commented 2 weeks ago

As I mentioned, I switched to only run a single e2e test by default instead of windows to keep costs down. Windows still ran 8 tests which wouldn't be enough to get your spend under control. All of the e2e tests except the canary should run on approval - if you approve the PR we can check that. Can't tell if it works because you requested changes.

I'm happy to iterate on how this is organized and optimized once we get the CI costs down from 10x where they should be. Some of the suggestions from reviewers probably aren't right because commits after approval still need tests to run. Let's try to land a first pass that works in a general sense and go from there

acywatson commented 2 weeks ago

if you approve the PR we can check that. Can't tell if it works because you requested changes.

I did and they didn't (run). I requested changes to dismiss my approval so I can test it again once it's fixed.

Screenshot 2024-05-11 at 4 54 03 PM

I'm happy to iterate on how this is organized and optimized once we get the CI costs down from 10x where they should be.

Sure, if you can fix the fact that all the tests don't run on approval, then I'm happy to merge this and iterate whenever you want. Right now, it's not mergeable because the tests don't run on approval as they should - I provided guidance on how to make that happen.

etrepum commented 2 weeks ago

With regard to my previous comment, I am saying that I believe that once this is fixed, if it's done in the same workflow file, the tests will probably run again after the approval button is clicked because it's a new event for the same commit and the workflow file will catch all of them and run through all of the jobs. My understanding is that it will run everything it did before, plus the things it did not run before. In that scenario we are running the 10% (before approval) + 100% (after approval) = 110% of the tests, instead of the ideal 10% (before approval) + 90% (after approval) = 100% of the tests.

I think it would be fine to get rid of the contributor conditions entirely and have a set of more expensive e2e tests in one workflow file that run only on approval (and probably not bother with that expensive workflow on push to main altogether), and the cheaper canary/unit run on push and PR.

necolas commented 2 weeks ago

I separated the core tests and e2e tests into additional workflow calls. In this iteration, the core tests and canary e2e test will always run on PRs and pushes to main. If you'd prefer a different workflow for main pushes, let me know. The extended e2e test suite is only run if a specific label (pick a name) is added to the PR by someone with triage access.

Push to Main / PR

Add Label

Using only reviewer-approval as a trigger has some issues. It's hard to test - you can't approve your own PRs. Reviewers have to approve a PR before they know if all tests. And it may have issues cancelling concurrent workflows if additional reviews comments occur after approval.

I think using a label as a trigger is a bit more flexible. We could still automate running the extended tests, e.g., workflow to add the label if a collaborator approves the diff.

necolas commented 2 weeks ago

I mocked the e2e test workflow to make iterating / reviewing faster and cheaper.

PR skips extended tests by default:

image

Added approved "extended-tests" label (name TBD) and extended tests run:

image

Removed label and updated diff, extended tests don't run:

image
acywatson commented 2 weeks ago

We're waiting on a commit to unmock before we merge, right?

necolas commented 2 weeks ago

We're waiting on a commit to unmock before we merge, right?

Mock removed.

Thanks for working on this, @necolas

No worries. The Lexical OSS project has some cool features I learnt about too. Very impressed with what you've all done here