dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.98k stars 4.66k forks source link

[wasm][aot] `System.Text.RegularExpressions.Unit.Tests` failing due to `Maximum call stack size exceeded` #66118

Open radical opened 2 years ago

radical commented 2 years ago

Hitting on rolling builds, and PRs. This is when building the tests with AOT, and happens on linux, and windows.

Build, and log:

info: Discovered:  System.Text.RegularExpressions.Unit.Tests.dll (found 13 of 13 test cases)
info: Using random seed for test cases: 1824947855
info: Using random seed for collections: 1824947855
info: Starting:    System.Text.RegularExpressions.Unit.Tests.dll
fail: {}
fail: RangeError: Maximum call stack size exceeded
fail:     at pthread_getspecific (<anonymous>:wasm-function[74267]:0xf7c81e)
fail:     at get_context (<anonymous>:wasm-function[56607]:0xd70b64)
fail:     at interp_sufficient_stack (<anonymous>:wasm-function[56663]:0xd7ee7c)
fail:     at mini_interp_sufficient_stack (<anonymous>:wasm-function[72917]:0xf53721)
fail:     at ves_icall_System_Runtime_CompilerServices_RuntimeHelpers_SufficientExecutionStack (<anonymous>:wasm-function[60228]:0xdec29e)
fail:     at aot_wrapper_corlib_System_dot_Runtime_dot_CompilerServices_System_dot_Runtime_dot_CompilerServices_dot_RuntimeHelpers__SufficientExecutionStack_pinvoke_bool_bool_ (<anonymous>:wasm-function[27205]:0x7454cf)
fail:     at corlib_System_Runtime_CompilerServices_RuntimeHelpers_TryEnsureSufficientExecutionStack (<anonymous>:wasm-function[26554]:0x72be45)
fail:     at System_Text_RegularExpressions_Unit_Tests_System_Threading_StackHelper_TryEnsureSufficientExecutionStack (<anonymous>:wasm-function[44081]:0xa5dcf9)
fail:     at System_Text_RegularExpressions_Unit_Tests_System_Text_RegularExpressions_RegexNode_FindAndMakeLoopsAtomic (<anonymous>:wasm-function[44280]:0xa79a36)
fail:     at System_Text_RegularExpressions_Unit_Tests_System_Text_RegularExpressions_RegexNode_FindAndMakeLoopsAtomic (<anonymous>:wasm-function[44280]:0xa79a8e)
...

cc @radekdoulik @vargaz

ghost commented 2 years ago

Tagging subscribers to 'arch-wasm': @lewing See info in area-owners.md if you want to be subscribed.

Issue Details
Hitting on rolling builds, and PRs. This is when building the tests with AOT, and happens on linux, and windows. [Build](https://dev.azure.com/dnceng/public/_build/results?buildId=1641292&view=logs&j=55f4943c-f76d-585b-7250-deab324f0a54&t=9d909c2f-e8c7-58b9-4308-45796ed0b70f), and [log](https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-d84710afb40d4a9793/normal-System.Text.RegularExpressions.Unit.Tests/1/console.d76b225b.log?sv=2019-07-07&se=2022-03-22T21%3A55%3A03Z&sr=c&sp=rl&sig=727BzB0WjftXm1dA%2BnDfhPd6xAFlMhNseO4AH4sJ6j0%3D): ``` info: Discovered: System.Text.RegularExpressions.Unit.Tests.dll (found 13 of 13 test cases) info: Using random seed for test cases: 1824947855 info: Using random seed for collections: 1824947855 info: Starting: System.Text.RegularExpressions.Unit.Tests.dll fail: {} fail: RangeError: Maximum call stack size exceeded fail: at pthread_getspecific (:wasm-function[74267]:0xf7c81e) fail: at get_context (:wasm-function[56607]:0xd70b64) fail: at interp_sufficient_stack (:wasm-function[56663]:0xd7ee7c) fail: at mini_interp_sufficient_stack (:wasm-function[72917]:0xf53721) fail: at ves_icall_System_Runtime_CompilerServices_RuntimeHelpers_SufficientExecutionStack (:wasm-function[60228]:0xdec29e) fail: at aot_wrapper_corlib_System_dot_Runtime_dot_CompilerServices_System_dot_Runtime_dot_CompilerServices_dot_RuntimeHelpers__SufficientExecutionStack_pinvoke_bool_bool_ (:wasm-function[27205]:0x7454cf) fail: at corlib_System_Runtime_CompilerServices_RuntimeHelpers_TryEnsureSufficientExecutionStack (:wasm-function[26554]:0x72be45) fail: at System_Text_RegularExpressions_Unit_Tests_System_Threading_StackHelper_TryEnsureSufficientExecutionStack (:wasm-function[44081]:0xa5dcf9) fail: at System_Text_RegularExpressions_Unit_Tests_System_Text_RegularExpressions_RegexNode_FindAndMakeLoopsAtomic (:wasm-function[44280]:0xa79a36) fail: at System_Text_RegularExpressions_Unit_Tests_System_Text_RegularExpressions_RegexNode_FindAndMakeLoopsAtomic (:wasm-function[44280]:0xa79a8e) ... ``` cc @radekdoulik @vargaz
Author: radical
Assignees: -
Labels: `arch-wasm`, `blocking-clean-ci`
Milestone: -
radical commented 2 years ago

This seems to be only relevant commit compared the previous build that passed.

commit 093bdc466693e847dc75b6fd86d895bf603acdee
Author: Stephen Toub <stoub@microsoft.com>
Date:   Wed Mar 2 07:52:37 2022 -0500

    Avoid RegexCode/RegexWriter for all engines other than RegexInterpreter (#65986)

    * Avoid RegexCode/RegexWriter for all engines other than RegexInterpreter

    * Address PR feedback

cc @stephentoub

stephentoub commented 2 years ago

The purpose of TryEnsureSufficientExecutionStack is that code should be able to call it to know when it's too deep on the stack to do what it's about to do, and Regex uses that API to determine whether it can do certain processing. Seems like a mono bug if calling that API is producing some kind of exception about being too deep.

The cited PR also didn't add any code that would trigger this, but it did move some tests around. Was the regex test suite previously being suppressed somewhere in this configuration? If so, its new unit tests project might also need to be suppressed in the same place.

But it also seems like there's a bug here in mono that might have been previously hidden because of such suppression.

vargaz commented 2 years ago

SufficientExecutionStack probably has some problems on wasm. Does this happen on linux+wasm as well ?

radical commented 2 years ago

Yep, linux and windows.

radical commented 2 years ago

AppBundle from my Mac. AppBundle.zip

vargaz commented 2 years ago

So it looks like that on wasm, TryEnsureSufficientExecutionStack cannot really be implemented since the real execution stack is hidden from the program. We can only check that we don't run out of the C stack which is in linear memory. In this case, the 'real' execution stack overflows, and the wasm vm aborts.

vargaz commented 2 years ago

Decreasing the depth in MinMaxLengthIsCorrect_HugeDepth to 5000 works around the problem. The recursion code should have some kind of hardcoded limit imho, FindAndMakeLoopsAtomic () was on the stack 8k times when it crashed.

stephentoub commented 2 years ago

Decreasing the depth in MinMaxLengthIsCorrect_HugeDepth to 5000 works around the problem. The recursion code should have some kind of hardcoded limit imho, FindAndMakeLoopsAtomic () was on the stack 8k times when it crashed.

It's using TryEnsureSufficientExecutionStack to avoid needing a hardcoded limit. This particular test is a stress test validating that behavior is functional.

stephentoub commented 2 years ago

(Multiple places use TryEnsureSufficientExecutionStack for this same purpose, e.g. await for determining whether it should queue long async continuation chains, System.Linq.Expressions, I believe Roslyn in various places, etc.)

lewing commented 2 years ago

(Multiple places use TryEnsureSufficientExecutionStack for this same purpose, e.g. await for determining whether it should queue long async continuation chains, System.Linq.Expressions, I believe Roslyn in various places, etc.)

This is especially tricky for wasm given that different wasm runtimes have different limits and at times use wildly different amounts of execution stack for the same code. For AOT we may need to make a configurable limit based on testing.

stephentoub commented 2 years ago

For AOT we may need to make a configurable limit based on testing.

A configurable limit in TryEnsureSufficientExecutionStack, or a configurable limit everywhere that uses TryEnsureSufficientExecutionStack? I don't think the latter is reasonable in general (even if we were to do so everywhere we use it, we can't expect that to be true for all 3rd-party usage), but if there's really no way to correctly implement TryEnsureSufficientExecutionStack on wasm, imbuing it there with a conservative limit in TryEnsureSufficientExecutionStack seems like an ok fallback.

lewing commented 2 years ago

A limit in TryEnsureSufficientExecutionStack, agreed about the latter. @radekdoulik can you take a look at this.

danmoseley commented 2 years ago

removing 'blocking-clean-ci' as it's not, as zero regex tests are now running on WASM, pending a fix for TryEnsureSufficientExecutionStack.