Closed notandyvee closed 1 month ago
1 Warning | |
---|---|
:warning: | PR is not assigned to a milestone. |
Generated by :no_entry_sign: Danger
App Name | WordPress | |
Flavor | Jalapeno | |
Build Type | Debug | |
Version | pr20517-cd5d8d9 | |
Commit | cd5d8d9beaeeda109cd037e3f4acbb34e3beb9f1 | |
Direct Download | wordpress-prototype-build-pr20517-cd5d8d9.apk |
App Name | Jetpack | |
Flavor | Jalapeno | |
Build Type | Debug | |
Version | pr20517-cd5d8d9 | |
Commit | cd5d8d9beaeeda109cd037e3f4acbb34e3beb9f1 | |
Direct Download | jetpack-prototype-build-pr20517-cd5d8d9.apk |
I'm thinking about whether adding a retry mechanism to UI tests is the right approach. 🤔 This might prevent us from detecting real issues in production.
Do you know the root cause of flakiness in the e2eAllDayStatsLoad()
test, @notandyvee?
I'm thinking about whether adding a retry mechanism to UI tests is the right approach
UI tests can be flaky. Some more than others. I can't reproduce the issue. Seems to only happen in UI tests tbh. But, please note this does not mask the issue. If it's a legitimate issue the test will fail and the failure will show. While running all of the UI tests I noticed other failures completely unrelated. The nature of UI tests. If we think it's not worth it, I'm fine closing.
edit:
Do you know the root cause of flakiness in the e2eAllDayStatsLoad() test
I can't reproduce. But see this internal convo for some more context: p1710340006238599/1709544863.226669-slack-C5ALGJYEP
Thoughts @pachlava ?
I'm thinking about whether adding a retry mechanism to UI tests is the right approach.
@irfano 👋 On iOS, the default retry for UI tests is 2 across several apps, so this approach is in use. UI tests indeed might be flaky, especially on CI (Firebase), where the performance is much lower, and in some cases our attempts to reproduce the fails locally end with nothing.
Thoughts @pachlava ?
Having a retry mechanism in place is a good thing, IMO - thank you for this, @notandyvee 🙇
However, with this particular test, it's not about its flakiness. It's about the app not loading certain part of the stats screen, and the test rightfully failing, because it fails to find these parts. This happens only in JP, and I could never reproduce this on a real device. One could think that this is about UI tests environment (different config / using WireMock) causing this, but then this would also occur in WP. This never occurred in WP.
It'll be hard to find a discussion, because it's at least one year old, but the issue about Emulator not always loading the Days stats card in JP is a known issue, but of such a low severity (it affects 0 users to my knowledge), that it was not handled. Taking this all into account, I have no objections to at least trying this test with a rerun.
Seems to only happen in UI tests tbh.
UI tests indeed might be flaky, especially on CI (Firebase), where the performance is much lower, and in some cases our attempts to reproduce the fails locally end with nothing.
If the reason for flakiness is the testing environment or emulator performance, then rerun makes sense.
Also, Buildkite mentioned rerun failing tests in its document but it recommends it after the Identify the cause of failure and fix it
step. That's why I questioned adding rerun ability.
One could think that this is about UI tests environment (different config / using WireMock) causing this, but then this would also occur in WP. This never occurred in WP.
@pachlava, this is expected because stats is disabled on WP, so StatsGranularTabsTest
always passes for WP.
It'll be hard to find a discussion, because it's at least one year old, but the issue about Emulator not always loading the Days stats card in JP is a known issue, but of such a low severity (it affects 0 users to my knowledge), that it was not handled.
I could reproduce https://github.com/wordpress-mobile/WordPress-Android/issues/18065#issuecomment-2021562993 on a real device with a real site. So, I think this is a real issue and should not be masked with a rerun. If you agree, we can close this PR. WDYT, @pachlava, @notandyvee?
@pachlava, this is expected because stats is disabled on WP, so StatsGranularTabsTest always passes for WP.
Sorry for not providing more background @irfano. Because we mock the API, the stats are actually available in WP even now, when used with WireMock. And this test never fails in WP because of not-loading card, as well it never failed when stats were still 'officially' present in WP. That's the whole reason why this test was enabled for WP and disabled for JP - we knew the loading issue occurs in JP only, and even though we knew that running this stats test in WP is fully artificial, we decided that it's better to have the test running there this way, than not running it at all. It was so until recently, when @notandyvee toggled the execution rule (at that time the test was disabled for both, but because of a different reason I believe). Here's a video I made today, with the latest WP from trunk
, running against a mocked API:
This is the expected behaviour I believe, and not a bug, because this happens under a mocked API only.
In any case, since you were able to reproduce the issue on a real device (which is very nice ⭐), it will hopefully get fixed
I could reproduce https://github.com/wordpress-mobile/WordPress-Android/issues/18065#issuecomment-2021562993 on a real device with a real site. So, I think this is a real issue and should not be masked with a rerun. If you agree, we can close this PR. WDYT, @pachlava, @notandyvee?
First, I think we should merge this regardless @irfano. I can remove the @Retry
on the test itself if you want. Super useful to have this retry mechanism. Just my opinion.
But I also started to investigate this problem. And figured out how to make it happen reliably. I am trying to find the root cause. In the meantime I will add a needs revision flag while i finish the investigation.
First, I think we should merge this regardless @irfano. I can remove the @Retry on the test itself if you want. Super useful to have this retry mechanism. Just my opinion.
Good decision, I agree! 👍🏻 We can merge this after removing @Retry
from e2eAllDayStatsLoad()
.
Issues
0 New issues
0 Accepted issues
Measures
0 Security Hotspots
No data about Coverage
No data about Duplication
I went ahead and removed the @Retry
for now and re-added the ignore since I am investigating the root cause. The linked issue is done though.
Fixes #20461
We had a long running flaky tests. The reasoning for the fail is unclear and it was turned off (https://github.com/wordpress-mobile/WordPress-Android/pull/20493). In general retrying the test appeared to make it pass. Instead of ignoring the test we should attempt to auto retry running the failing tests since odds are it would pass a second time.
To Test:
Regression Notes
Potential unintended areas of impact
N/A
What I did to test those areas of impact (or what existing automated tests I relied on)
Ran other tests.
What automated tests I added (or what prevented me from doing so)
Added TestRule to fix ignored test.
PR Submission Checklist:
RELEASE-NOTES.txt
if necessary.