This causes subsequent tests, including the Bitbar UI tests, to not run.
It passes locally for me.
Changing the value for a property with a final value has been deprecated. This will fail with an error in Gradle 6.0.
Maybe this is the reason for the failure.
I also saw this when doing the latest version bump in https://github.com/mozilla-mobile/firefox-tv/pull/2377:
Changing the value for a property with a final value has been deprecated. This will fail with an error in Gradle 6.0.
Maybe this is the reason for the failure.
I think this is unrelated: it's a deprecation notice in the Gradle APIs and, in theory, should not indicate broken functionality.
I added --info to the TC job in https://github.com/mozilla-mobile/firefox-tv/pull/2415 – we should let the PR fail and identify which test it's failing on.
Note that there are two possibilities:
I wonder if this is related to the a-c or dependency upgrade in #2347.
I've retriggered 4+ times and I haven't been able to get #2415 to fail – I wonder if there's a race condition.
I still can't get it to fail. Is it still a problem in other PRs?
As for other options for debugging, I wonder if we can reproduce locally using our Docker image.
I removed the tasks testGeckoDebugUnitTest and testGeckoReleaseUnitTest, and it timed out on testSystemReleaseUnitTest.
My PR in #2415 failed. The failure happened on:
org.mozilla.tv.firefox.pocket.PocketVideoRepoTest > GIVEN pocket key is not valid WHEN getting a new instance THEN feed state starts no api key STANDARD_OUT
[Robolectric] org.mozilla.tv.firefox.pocket.PocketVideoRepoTest.GIVEN pocket key is not valid WHEN getting a new instance THEN feed state starts no api key: sdk=28; resources=binary
[taskcluster:error] Task timeout after 3600 seconds. Force killing container.
[taskcluster 2019-06-12 20:17:05.911Z] === Task Finished ===
We could investigate this test, or see if there is another task that is taking a long time and making this task run so late that it fails.
In this test, specifically:
assertEquals(FeedState.NoAPIKey, repo.feedState.blockingFirst())
Maybe blockingFirst() is blocking forever? That said, I'd expect JUnit to kill the test after some timeout: is that not true?
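For illustration, a minimal sketch of guarding that call, assuming feedState is an RxJava Observable and reusing the names from the assertion above (this is not what the test currently does):

```kotlin
import java.util.concurrent.TimeUnit

// Hypothetical guard: timeout() turns an indefinite hang in the stream into an
// onError, so the assertion fails quickly instead of blocking until TaskCluster
// force-kills the task. The 5-second value is arbitrary.
assertEquals(
    FeedState.NoAPIKey,
    repo.feedState
        .timeout(5, TimeUnit.SECONDS)
        .blockingFirst()
)
```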
@mcomella To my knowledge, you have to specify the timeout rule with either the test annotation or the Timeout class. I don't think unit tests come with a default timeout (I believe UI tests do have a timeout value, though).
This link might help: https://github.com/junit-team/junit4/wiki/timeout-for-tests
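For reference, a minimal Kotlin sketch of the two approaches from that link (the class and test names are illustrative, not from the firefox-tv code base):

```kotlin
import org.junit.Rule
import org.junit.Test
import org.junit.rules.Timeout

class ExampleTimeoutTest {

    // Option 1: a Timeout rule that applies to every test in the class.
    @get:Rule
    val globalTimeout: Timeout = Timeout.seconds(10)

    // Option 2: a per-test timeout on the annotation itself (in milliseconds).
    @Test(timeout = 2000)
    fun exampleThatMustFinishWithinTwoSeconds() {
        // test body
    }
}
```

With either option, a hung test surfaces as a TestTimedOutException instead of blocking the whole task.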
@dnarcese Did you say you tracked down a solution: that by increasing the heap size of our Gradle daemon, this is no longer an issue? If so, should we write and land a patch? In any case, increasing the heap size of the Gradle daemon generally seems like a good thing to do, given the increasing size of our project.
@mcomella Yes, it seems that increasing the Gradle daemon heap size fixes the issue. I have been testing it in this PR: https://github.com/mozilla-mobile/firefox-tv/pull/2431. I will clean up those commits and request review.
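For anyone landing here later: the daemon heap is usually raised via org.gradle.jvmargs in gradle.properties. A sketch of that change (the 4g value is illustrative – see #2431 for what actually landed):

```properties
# gradle.properties – illustrative value only; see #2431 for the real change
org.gradle.jvmargs=-Xmx4g
```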
Note: dnarcese has a lot of notes on what was tried in #2431.
Next steps: @dnarcese will pair with @mcomella on this.
There is a workaround, but the goal here is to find the underlying cause. Sizing this L because it can't be reproduced locally, it's not consistent on TC, and it takes a long time to run/test. We will also need to communicate with other teams (releng, QA) to solve this, and the build process is also difficult.
We've been working with bstack to determine if this is a hardware performance issue. bstack's theory is:
fire-tv builds are more resource intensive and/or more frequent than other tasks we've had before such that enough of them are piling up on single workers at once to cause OOMing
He opened a PR to "use more recent instance types and get you on instances where a single build takes the whole machine (probably means your own instance type)": https://phabricator.services.mozilla.com/D36660
We're waiting for that to land to see if the intermittent failure is still a problem. In parallel, we've been trying a few other approaches to see if we can fix the bug in other ways.
In my test PR https://github.com/mozilla-mobile/firefox-tv/pull/2505, I discovered that some tests actually time out:
org.mozilla.tv.firefox.pocket.PocketVideoRepoTest > GIVEN pocket key is not valid WHEN getting a new instance THEN feed state starts no api key FAILED
org.junit.runners.model.TestTimedOutException at PocketVideoRepoTest.kt:106
org.mozilla.tv.firefox.pocket.PocketVideoRepoTest > GIVEN pocket key is valid WHEN getting a new instance THEN feed state starts inactive FAILED
org.junit.runners.model.TestTimedOutException at PocketVideoRepoTest.kt:100
We should do a quick root cause analysis and, if it would take too long to fix or find the root cause, consider marking the tests @Ignore.
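If we do go the @Ignore route, a minimal sketch of what that would look like (the class name and reason string are illustrative; the test name is one of the failures above):

```kotlin
import org.junit.Ignore
import org.junit.Test

class PocketVideoRepoTestSketch { // illustrative class name

    // Temporarily skipped; the reason string should point at the tracking issue
    // so the disabled test stays discoverable.
    @Ignore("Intermittent timeout on TaskCluster – see this issue")
    @Test
    fun `GIVEN pocket key is not valid WHEN getting a new instance THEN feed state starts no api key`() {
        // test body elided
    }
}
```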
@mcomella Just curious, does it still time out if you increase the heap size too? (I know we have the better AWS worker type now; I'm wondering whether the heap size was changed as well.)
In my test PR #2505, I discovered that some tests actually time out:
I'm temporarily disabling the intermittently failing tests because it's blocking the current release (4.0.0? v3.12?): https://github.com/mozilla-mobile/firefox-tv/pull/2517
@mcomella will investigate newly ignored tests + root cause and file, and land some mitigation with this bug.
This is https://github.com/mozilla-mobile/firefox-tv/issues/2539 now.
Steps to reproduce
Currently, the two latest master branch TC runs have been failing for the same reason: https://tools.taskcluster.net/groups/IZyLi8tlS4She-_BKkg7_w/tasks/IZyLi8tlS4She-_BKkg7_w/runs/0/logs/public%2Flogs%2Flive.log