dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Sporadic failures in RegexMatchTests test cases on s390x #88418

Open uweigand opened 1 year ago

uweigand commented 1 year ago

Hi @stephentoub, we're now seeing failures in the `System.Text.RegularExpressions.Tests.RegexMatchTests.Match_TestThatTimeoutHappens` test case on s390x; see e.g. here: https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-fb56ebe47ad240c79f/System.Text.RegularExpressions.Tests/1/console.86e3fbb5.log?helixlogtype=result

Unfortunately, these seem to be nondeterministic: in some CI runs the test passes, in others it fails. I'm not sure whether this is related to this PR or to any of your other recent Regex changes in the first place, but I cannot recall seeing this particular failure before about mid-April. So far I have been completely unable to reproduce the failure when running the test locally on my system, so I'm not sure how to start debugging this problem.

If we see the failure, it seems to be in either the `Compiler` or `SourceGenerator` flavor of the test case. Also, given the total run time shown in the CI logs, the problem doesn't appear to be that the regex test runs for a long time and the timeout just doesn't trigger, but rather that the test completes quickly. (I don't know whether the regex also matches *correctly* or not; the test doesn't seem to verify that.)

Do you have any ideas what could cause this failure? Or any suggestions on how to debug this? Thanks for your help!

Originally posted by @uweigand in https://github.com/dotnet/runtime/issues/84370#issuecomment-1540073531

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details
Author: uweigand
Assignees: -
Labels: `area-System.Text.RegularExpressions`
Milestone: -

uweigand commented 1 year ago

@stephentoub wrote:

> @uweigand, sorry for the delay in responding.
>
> Are you still seeing this?
>
> Right, this test is validating that when the processing of the regex takes long enough, a timeout exception gets thrown. I can't see the failure you cited anymore, but presumably the processing of the regex just happened so fast that it didn't time out and thus the test failed. I don't have an explanation for why that would be, though.

I'm no longer able to reply on the original PR as it has been locked, so I've opened this new issue.

The problem described above still occurs sporadically; I last saw it on June 27: https://dev.azure.com/dnceng-public/public/_build/results?buildId=322772&view=results

This time the failing test is `System.Text.RegularExpressions.Tests.RegexMatchTests.Match_Timeout_Throws`, but this appears to be the same symptom.

(As of about a week ago, all CI tests are failing due to an unrelated issue described here: https://github.com/dotnet/runtime/pull/88417, which might have masked more recent occurrences.)

stephentoub commented 1 year ago

Thanks. Is it always the same arguments failing? That test is a theory, with most of the inputs passing in that run, and two failures:

    System.Text.RegularExpressions.Tests.RegexMatchTests.Match_Timeout_Throws(engine: Compiled, pattern: "((?!(?>[^a]*))a)+", input: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"...) [FAIL]
      Assert.Throws() Failure
      Expected: typeof(System.Text.RegularExpressions.RegexMatchTimeoutException)
      Actual:   (No exception was thrown)
      Stack Trace:
        /_/src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.Match.Tests.cs(1258,0): at System.Text.RegularExpressions.Tests.RegexMatchTests.Match_Timeout_Throws(RegexEngine engine, String pattern, String input)
        --- End of stack trace from previous location ---
    System.Text.RegularExpressions.Tests.RegexMatchTests.Match_Timeout_Throws(engine: Compiled, pattern: "((?<!(?>[^a]*))a)+", input: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"...) [FAIL]
      Assert.Throws() Failure
      Expected: typeof(System.Text.RegularExpressions.RegexMatchTimeoutException)
      Actual:   (No exception was thrown)
      Stack Trace:
        /_/src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.Match.Tests.cs(1258,0): at System.Text.RegularExpressions.Tests.RegexMatchTests.Match_Timeout_Throws(RegexEngine engine, String pattern, String input)
        --- End of stack trace from previous location ---

uweigand commented 1 year ago

In addition to the June 27 failure I also see this failure on June 26, which appears to be the same pattern:

    System.Text.RegularExpressions.Tests.RegexMatchTests.Match_Timeout_Throws(engine: Compiled, pattern: "((?<!(?>[^a]*))a)+", input: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"...) [FAIL]

I'm not seeing any other instance of this failure in the CI history (which seems to be preserved for about a month). I've never been able to reproduce the issue locally; it only happens in CI, if at all.

steveharter commented 1 year ago

Flagging as bug for now

steveharter commented 1 year ago

The test has a short-circuit for non-backtracking check: https://github.com/dotnet/runtime/blob/ecb3b2003c624ae1f116915e9f0f0b11ae264094/src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.Match.Tests.cs#L1222-L1227 and thus a timeout may not occur.

Flagging as test bug and moving to V9.

stephentoub commented 5 months ago

> The test has a short-circuit for non-backtracking check:
>
> https://github.com/dotnet/runtime/blob/ecb3b2003c624ae1f116915e9f0f0b11ae264094/src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.Match.Tests.cs#L1222-L1227
>
> and thus a timeout may not occur. Flagging as test bug and moving to V9.

The cited failures are all for engine: Compiled.