github-actions[bot] opened 1 month ago
sending to the experts! https://expensify.slack.com/archives/C035J5C9FAP/p1726004032791809
Investigation in progress!
Working on getting the logs. It's not related to the linked PR, but keeping this open as a daily for that investigation
@dangrous Whoops! This issue is 2 days overdue. Let's get this updated quick!
margelo team is on it I believe, in that same slack thread. @kirillzyusko let me know if you want me to assign you here!
@dangrous yeah, feel free to assign me on this!
@dangrous, @kirillzyusko Eep! 4 days overdue now. Issues have feelings too...
It failed because of a timeout issue: we hit the limit of 5400s (1.5h).
I think we merged a PR https://github.com/Expensify/App/pull/47777 which increases it to 7200s (2h). Do you think we can close the issue?
It looks from the screengrab that it crashed though, right? And that's what caused the timeout since the app never reopened? We should see if we can figure out what that crash was....
@dangrous yeah, you are right, but from my observation it could be a problem in the flashlight tool itself. In fact, in our e2e tests we allow a test to crash 3 times during its 60 runs, and we rely on that. The problem is that when a test crashes, we wait 5 minutes before force-quitting it (we have a 5-minute timeout per test). So if we get 2 random failures in any test, that adds 10 minutes of overhead for 1 test suite. We have 5 test suites, so the retry mechanism can potentially add ~50 minutes to our test run 🤷‍♂️ And I think that's why we hit the limit in this particular run.
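The worst-case overhead described above can be sketched with a quick back-of-envelope calculation (constant names here are illustrative, not the actual e2e harness code):

```typescript
// Back-of-envelope math for the retry overhead described above.
// All names are illustrative assumptions, not real Expensify/App code.
const TEST_TIMEOUT_MINUTES = 5;     // current per-test timeout
const TEST_SUITES = 5;              // number of e2e test suites
const RANDOM_CRASHES_PER_SUITE = 2; // "2 random failures in any test"

// Each crash blocks the runner for a full timeout before force-quitting,
// so 2 crashes cost 10 minutes per suite:
const overheadPerSuite = RANDOM_CRASHES_PER_SUITE * TEST_TIMEOUT_MINUTES;
const totalOverheadMinutes = overheadPerSuite * TEST_SUITES;

console.log(`Worst-case retry overhead: ~${totalOverheadMinutes} minutes`);
// → ~50 minutes, enough to push a run past the old 1.5h limit
```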
One optimization I've been thinking of is reducing the timeout interval (from 5 minutes to 2.5 minutes). But I think we need to ask @hannojg why such a relatively big timeout was chosen for the e2e tests.
oh okay that makes sense - yeah I feel like we could even go shorter than 2.5 mins - I feel like if something is hanging for more than, say, 1 minute, then something is wrong enough that we should look at it. But curious what @hannojg thinks. Or if he's still OOO I think we can close this in the meantime
Agree, we can definitely make this timeout interval shorter!
Great! @kirillzyusko do you want to put up a PR to drop that timeout, maybe start with 2.5 mins and we see how that one goes? Probably could go even shorter but maybe that's a good starting point
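For reference, halving the per-test timeout roughly halves that worst-case retry overhead; a quick sketch using the numbers from this thread (constant names are my own, not real Expensify/App code):

```typescript
// Illustrative sketch only; these are not real Expensify/App constants.
const OLD_TIMEOUT_MINUTES = 5;
const NEW_TIMEOUT_MINUTES = 2.5; // proposed starting point

// Same worst case as discussed earlier: 2 random crashes per suite, 5 suites.
const totalCrashes = 2 * 5;
const savedMinutes = totalCrashes * (OLD_TIMEOUT_MINUTES - NEW_TIMEOUT_MINUTES);

console.log(`Dropping the timeout to 2.5 min saves ~${savedMinutes} minutes worst-case`);
// → ~25 minutes shaved off the worst-case run time
```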
Kiryl is OOO, and will be back next week to pick this one up!
@dangrous, @kirillzyusko Uh oh! This issue is overdue by 2 days. Don't forget to update your issues!
@dangrous, @kirillzyusko 6 days overdue. This is scarier than being forced to listen to Vogon poetry!
@kirillzyusko let us know when you're back and can knock out the timeout adjustment!
@dangrous here is a PR: https://github.com/Expensify/App/pull/50512 👀
🚨 Failure Summary 🚨:
🛠️ A recent merge appears to have caused a failure in the job named e2ePerformanceTests / Run E2E tests in AWS device farm. This issue has been automatically created and labeled with `Workflow Failure` for investigation.
⚠️ Action Required ⚠️:
👀 Please look into the following:
🐛 We appreciate your help in squashing this bug!
Issue Owner
Current Issue Owner: @kirillzyusko