dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

System.Runtime.Tests.WorkItemExecution "crash" on tvOS #106319

Open CarnaViire opened 1 month ago

CarnaViire commented 1 month ago

The whole Helix WorkItem shows as "crashed", but there's actually just a failed test (System.Tests.DecimalTests+BigIntegerAdd.Test) which is already tracked in https://github.com/dotnet/runtime/issues/106256. It is impossible to see which test is failing in the available CI logs though, I can only see that in Test Results page in AzDO.

The actual Known Issue is properly linked to the PR, but the Build Analysis is red, as the "crash" is considered to be a separate issue. cc @JulieLeeMSFT FYI

Build Information

Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=773271
Build error leg or test failing: System.Runtime.Tests.WorkItemExecution
Pull request: https://github.com/dotnet/runtime/pull/106269

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": ["--target tvos-device", "System.Runtime.Tests", "XHarness exit code: 1 (TESTS_FAILED)"],
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}
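
For readers unfamiliar with the known-issue JSON above, here is a minimal Python sketch of how an `ErrorMessage` array could be matched against a work item log. It assumes each string must appear as a substring of some log line, in the listed order, each on a later line than the previous one. That is an assumption for illustration, not Build Analysis's actual implementation. The function name `matches_known_issue` and the sample log lines are hypothetical.

```python
import json

# The known-issue JSON from this issue, verbatim.
KNOWN_ISSUE = """{
  "ErrorMessage": ["--target tvos-device", "System.Runtime.Tests", "XHarness exit code: 1 (TESTS_FAILED)"],
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}"""

# Hypothetical log lines, NOT copied from the real build log.
SAMPLE_LOG = """command line: --target tvos-device
work item: System.Runtime.Tests
XHarness exit code: 1 (TESTS_FAILED)
"""


def matches_known_issue(log_text: str, issue_json: str) -> bool:
    """Return True if every ErrorMessage string occurs in the log, in order.

    Assumed semantics: substring match per line, and each message must be
    found on a later line than the previous message.
    """
    messages = json.loads(issue_json).get("ErrorMessage", [])
    if isinstance(messages, str):
        # A single string is treated as a one-element list.
        messages = [messages]
    if not messages:
        return False
    # One shared iterator: each search resumes where the previous one stopped.
    remaining = iter(log_text.splitlines())
    return all(any(msg in line for line in remaining) for msg in messages)


print(matches_known_issue(SAMPLE_LOG, KNOWN_ISSUE))  # True
```

Under these assumed semantics, the three strings are generic enough to match any `System.Runtime.Tests` failure on a tvOS device that ends with `TESTS_FAILED`, not only the failure discussed here.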

Known issue validation

Build: :mag_right: https://dev.azure.com/dnceng-public/public/_build/results?buildId=773271
Error message validated: [--target tvos-device System.Runtime.Tests XHarness exit code: 1 (TESTS_FAILED)]
Result validation: :white_check_mark: Known issue matched with the provided build.
Validation performed at: 8/13/2024 12:41:45 PM UTC

Report

| Build | Definition | Test | Pull Request |
| --- | --- | --- | --- |
| 788488 | dotnet/runtime | System.Runtime.Tests.WorkItemExecution | dotnet/runtime#106745 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
| --- | --- | --- |
| 0 | 0 | 1 |

dotnet-policy-service[bot] commented 1 month ago

Tagging subscribers to 'os-tvos': @vitek-karas, @kotlarmilos, @ivanpovazan, @steveisok, @akoeplinger See info in area-owners.md if you want to be subscribed.

dotnet-policy-service[bot] commented 1 month ago

Tagging subscribers to this area: @dotnet/area-system-runtime See info in area-owners.md if you want to be subscribed.

dotnet-policy-service[bot] commented 1 month ago

Tagging subscribers to this area: @lambdageek, @steveisok See info in area-owners.md if you want to be subscribed.

matouskozak commented 1 month ago

The failing test can be found thru the net.dot... log file in the artifacts tab like this @CarnaViire:

image

I don't understand why it required this PR to get the BuildAnalysis to green because the PR got correctly matched to the original tracking issue https://github.com/dotnet/runtime/issues/106256

image

My guess is that it considered System.Tests.DecimalTests+BigIntegerAdd.Test a separate test from System.Runtime.Tests, even though BigInteger is part of the System.Runtime.Tests suite. @missymessa do you know if this is expected behavior of BA?

missymessa commented 1 month ago

@matouskozak when the error is in Helix we just have one log per workitem, not individual logs for test, and we read the whole log, so it can match all the tests failing in the workitem

matouskozak commented 1 month ago

> @matouskozak when the error is in Helix we just have one log per workitem, not individual logs for test, and we read the whole log, so it can match all the tests failing in the workitem

Do you know then why it wasn't sufficient that it matched the failing workitem in https://github.com/dotnet/runtime/issues/106256 and this Known Build Error issue was required?

The Test tab shows: image but the job shows only a single workitem image

Perhaps it's something wrong in Mobile/XHarness reporting?

AlitzelMendez commented 1 month ago

can you expand more on: it wasn't sufficient to match the workitem? were you still having the error on the pull request? I see the build: 773271 in both issues, so I want to know which was the gap

matouskozak commented 4 weeks ago

> can you expand more on: it wasn't sufficient to match the workitem? were you still having the error on the pull request? I see the build: 773271 in both issues, so I want to know which was the gap

I was just going based on the issue description saying:

> The actual Known Issue is properly linked to the PR, but the Build Analysis is red, as the "crash" is considered to be a separate issue.

Based on my understanding, the System.Tests.DecimalTests+BigIntegerAdd.Test failure got matched correctly with https://github.com/dotnet/runtime/issues/106256 but the System.Runtime.Tests remained unmatched even though the System.Tests.DecimalTests+BigIntegerAdd.Test is a part of System.Runtime.Tests. Consequently, they had to create this separate issue to match System.Runtime.Tests failure to make BA green. @CarnaViire do you have more context on this please?

CarnaViire commented 4 weeks ago

> Based on my understanding, the System.Tests.DecimalTests+BigIntegerAdd.Test failure got matched correctly with https://github.com/dotnet/runtime/issues/106256 but the System.Runtime.Tests remained unmatched even though the System.Tests.DecimalTests+BigIntegerAdd.Test is a part of System.Runtime.Tests. Consequently, they had to create this separate issue to match System.Runtime.Tests failure to make BA green.

@matouskozak @AlitzelMendez Yes, that's exactly what had happened.
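
The situation confirmed above can be modeled with a small Python sketch. It is a toy model of the behavior as described in this thread, not Build Analysis's real logic. The `Failure` class and `build_analysis_is_green` function are hypothetical names. The only assumption is that the overall result is green when every reported failure is matched to some known issue.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Failure:
    name: str
    known_issue: Optional[int]  # matched GitHub issue number, or None


def build_analysis_is_green(failures: List[Failure]) -> bool:
    """Assumed rule: green only if every reported failure has a match."""
    return all(f.known_issue is not None for f in failures)


# Before this issue existed: the test was matched, the work item was not.
before = [
    Failure("System.Tests.DecimalTests+BigIntegerAdd.Test", 106256),
    Failure("System.Runtime.Tests.WorkItemExecution", None),
]

# After this issue (#106319) was opened to cover the work item entry.
after = [
    Failure("System.Tests.DecimalTests+BigIntegerAdd.Test", 106256),
    Failure("System.Runtime.Tests.WorkItemExecution", 106319),
]

print(build_analysis_is_green(before))  # False
print(build_analysis_is_green(after))   # True
```

In this model the two entries are independent records, which is why matching the test alone would not turn the result green.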

After the CI executed, BA was red, and I could see that there are:

Because the namespace doesn't match (it is System.Tests.DecimalTests and not System.Runtime.Tests.DecimalTests), it was not even clear to me whether these two are related or not. 🥲

So I went to the logs to try to see what it was about. I can usually navigate to a specific test run log through the pipeline Job logs (esp. when tests are failing, the logs are usually nicely linked from there) -- but I guess that's not the case with mobile platforms. So the only available log there was the WorkItem log, which didn't say much to me, besides that it is not a crash 😅

And as I mentioned, I wasn't able to find the actual test logs (a bit more on that is below under the cut). I actually had to search the source code to verify that System.Tests.DecimalTests+BigIntegerAdd.Test actually IS part of System.Runtime.Tests, so it was really the failing test in question. 😅

So yeah, after that I could either skip the BA validation, since I've found the root cause and it's already tracked -- or still open the issue; and this false-negative looked inconvenient enough so I thought it's better to raise awareness.


Re: test logs location

BTW

> The failing test can be found thru the net.dot... log file in the artifacts tab like this

The tab wasn't (and still isn't) loading for me in this case for some reason. (It did load for other libs on other runs, so it was working in general.)

![image](https://github.com/user-attachments/assets/9e58a18b-ba87-4014-90bb-82a6e6df7177)

UPD: Yay, it actually _started_ loading after reloading the page multiple times and some other very random actions I did 🤦‍♀️ (note that I was so persistent only because now I _knew_ they should be there 😅)

So I don't know what exactly "fixed" it, what the problem was, or why it persisted from before (since Aug 13, through computer and browser restarts, through staying on the page and waiting) until now. 🤷‍♀️