fedimint / fedimint

Federated E-Cash Mint
https://fedimint.org/
MIT License
536 stars 209 forks source link

chore: anti-flakiness spring offensive #4278

Closed dpc closed 3 months ago

dpc commented 3 months ago

This has been on my mind for a long while: Run the tests in the merge queue multiple times to make detecting flakiness easier.

Currently there's some issue surfaced with #4250 , so temporarily disable random peer failures.

codecov[bot] commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (204d419) 58.13% compared to head (52f0c96) 58.06%. Report is 18 commits behind head on master.

:exclamation: Current head 52f0c96 differs from pull request most recent head 77590b3. Consider uploading reports for the commit 77590b3 to get more accurate results

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #4278 +/- ## ========================================== - Coverage 58.13% 58.06% -0.07% ========================================== Files 197 197 Lines 43755 43755 ========================================== - Hits 25435 25405 -30 - Misses 18320 18350 +30 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

bradleystachurski commented 3 months ago

Might be worth surfacing: I'd anticipate this will add ~30 minutes in CI for merges

I'd assume you're implicitly okay with that, just want to surface.

justinmoon commented 3 months ago

One thing I've been wondering about -- would there be any way to automatically record when a specific test flakes? Basically automate what we do manually in https://github.com/fedimint/fedimint/issues/2825.

dpc commented 3 months ago

One thing I've been wondering about -- would there be any way to automatically record when a specific test flakes?

The hope is that it won't be necessary because everything will keep failing all the time.

maan2003 commented 3 months ago

lets see if it passed the merge queue

maan2003 commented 3 months ago

is this "merge-base changed" bugged? I see no changes

bradleystachurski commented 3 months ago

This does seem quite buggy now. @dpc think it's worth rebasing so https://github.com/fedimint/fedimint/commit/25acaa101a744509a1660aa3893c51eec8ae2190 is no longer a part of this PR?

dpc commented 3 months ago

Rebased.

As for time it takes - easy to lower.

maan2003 commented 3 months ago

might be interesting to mark some tests as flaky and merge this PR, so we don't introduce any new flakes https://nexte.st/book/retries.html

dpc commented 3 months ago

might be interesting to mark some tests as flaky and merge this PR, so we don't introduce any new flakes https://nexte.st/book/retries.html

Oh. Never used that. Yeah, we can do that. But in practice it will mean giving up on fixing the problem, as no one is paying attention to test outputs in the CI. Better than nothing though.

maan2003 commented 3 months ago

fedimint-workspace-lcov-ci> 2024-02-09T22:55:38.328326Z WARN new:task{name="listen task"}: net::peer: Error while opening incoming connection mint=PeerId(3) err=Randomly failed

didn't we remove random failures

dpc commented 3 months ago

didn't we remove random failures

I think there might be two types of random failures and one is one tests, on the others.

dpc commented 3 months ago

Pushed a fix: we didn't run tests on macos, so we shouldn't run 5 tests either.

dpc commented 3 months ago

After this commit, should we undo running coverage tests 5 times? Seems like this should have it covered.

There might be some differences in the run time performance due to ccov overhead, so maybe if could expose some stuff that doesn't show up here, but sure, we can tweak all we want.

elsirion commented 3 months ago

Aaaaaaand it flaked again :see_no_evil:

dpc commented 3 months ago

Aaaaaaand it flaked again 🙈

I think it was one of the fixes I already had but didn't push with it yet. Let's try again.