dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Application crashes in release mode throwing System.ArgumentOutOfRangeException #100758

Closed daiger14 closed 1 month ago

daiger14 commented 7 months ago

Description

Hello, I'm trying to update our application to .NET 8, but the application started to crash when deployed. It crashes only in release mode and only when we are calculating big projects. If I enable "Suppress JIT optimization on module load", it works without an exception. [image] The project has worked on every version since .NET Core 2.0.

We are not able to understand how and where this exception happens. When I check the array index and length, logging them via System.Diagnostics.Trace.WriteLine, everything is fine before the exception. [image] In the screenshot, you can see the array index right before the line of code that throws the exception. The index is 4, and the number of elements in the array is 5. Also, you can see the output of Trace.WriteLine("MATRIX1") in the Debug output before the exception, even though that line comes after the line that throws.

Reproduction Steps

I'm not able to create a standalone project that reproduces this issue. Previous versions of .NET worked fine in both Release and Debug builds. Now (.NET 8) it works only in Debug mode or with JIT optimization suppressed.

Expected behavior

Should not crash the application. There should be a valid stack trace of the exception.

Actual behavior

The application crashes, throwing an "unreal" System.ArgumentOutOfRangeException. The exception should show a real call stack.

Regression?

No response

Known Workarounds

Build in Debug mode, or enable "Suppress JIT optimization on module load".

Configuration

.NET 8.0.203

Other information

No response

AndyAyersMS commented 6 months ago

@daiger14 we will probably need more information from you to figure out what is going wrong.

Which method is getting an error?

If you add [MethodImpl(MethodImplOptions.NoOptimization)] to the method where the problem is happening, does it go away?

If so, can you share the (unmodified) assembly with us?

cc @dotnet/jit-contrib in case this seems familiar to anyone.

AndyAyersMS commented 6 months ago

@daiger14 please help us understand more about this so we can figure out what is going wrong.

daiger14 commented 6 months ago

> @daiger14 please help us understand more about this so we can figure out what is going wrong.

Hi, we are trying to figure out which method it is. Every time we place [MethodImpl(MethodImplOptions.NoOptimization)] on a method, the exception just appears in another file or line of code, so it does not seem to be tied to a single method. About once per 10-20 runs of our big project, the application works fine without throwing the exception. Is there anything more we can do to find what is wrong?

AndyAyersMS commented 6 months ago

If the issue can't be pinned down to a specific method, it could mean there is similar code in some number of methods, or there is perhaps some kind of memory corruption happening.

Since you are able to reproduce the problem, can you try running under windbg, and once the exception happens, use the !VerifyHeap command from SOS to check for corruption?

daiger14 commented 6 months ago

I'm not familiar with windbg, but will try tomorrow. Thanks

daiger14 commented 6 months ago

> If the issue can't be pinned down to a specific method, it could mean there is similar code in some number of methods, or there is perhaps some kind of memory corruption happening.
>
> Since you are able to reproduce the problem, can you try running under windbg, and once the exception happens, use the !VerifyHeap command from SOS to check for corruption?

Hi, here is the result of the !VerifyHeap command (no heap corruption detected): [image]

Adding comments from my colleague, who is a developer of the application core:

"Observations, conclusions:
1) the exception thrown is: index out of range
2) the exception moves when we add logging or try to add checks
3) when we succeed in logging directly before the exception is thrown, all relevant values are OK (index, array size), e.g. 0<=index<size
4) the same code is executed many times before the exception is thrown
5) no exception if runtime optimization is off
6) the problem happens only with huge data
7) IMHO there is some problem when runtime optimization collects the data: something goes wrong (like counter overflow, etc.), which results in bad optimization"

AndyAyersMS commented 6 months ago

@daiger14 thanks for checking with windbg. Since this seems to happen almost every run, is there any chance we could set up some kind of collaborative debugging session?

@dotnet/jit-contrib any other thoughts on how we could figure out what is going wrong?

EgorBo commented 6 months ago

> @dotnet/jit-contrib any other thoughts on how we could figure out what is going wrong?

We've fixed two similar bug reports (https://github.com/dotnet/runtime/issues/96839 and https://github.com/dotnet/runtime/issues/100809) in .NET 9.0 and backported the fixes to 8.0 (not yet available). This might be related; we can wait for the fixes to propagate to 8.0 and check again.

AndyAyersMS commented 6 months ago

Thanks @EgorBo -- I thought this issue sounded familiar but couldn't find those issues. Do you know if both of those fixes will be in 8.0.5?

EgorBo commented 6 months ago

> Thanks @EgorBo -- I thought this issue sounded familiar but couldn't find those issues. Do you know if both of those fixes will be in 8.0.5?

Sadly, only in 8.0.6. I think it should be available in the next (if not in current) preview of 9.0 too if that is an option to try

daiger14 commented 6 months ago

> @daiger14 thanks for checking with windbg. Since this seems to happen almost every run, is there any chance we could set up some kind of collaborative debugging session?
>
> @dotnet/jit-contrib any other thoughts on how we could figure out what is going wrong?

Hi @AndyAyersMS, about setting up a collaborative session: sure, we can, we just need to agree on a time ;) Hi @EgorBo, can I somehow check these commits locally? Thank you guys!

AndyAyersMS commented 6 months ago

@EgorBo do you know for sure if the fixes will be in 9.0 Preview 4? Seems likely.

It should be available later this month. @daiger14 simplest thing might be for you to download this once it's available and try... you don't need to rebuild anything, just run it.

Alternatively, you can build a version of the 8.0.6 JIT yourself, or I can build one and make it available to you, and I can tell you how to patch it into an existing 8.0 installation for testing, but I understand if you'd rather not.

EgorBo commented 6 months ago

> @EgorBo do you know for sure if the fixes will be in 9.0 Preview 4? Seems likely.

I've just checked - yes, it will be in Preview4

daiger14 commented 6 months ago

@AndyAyersMS, @EgorBo Thank you guys, I will wait for Preview 4 and will report the results of the tests.

daiger14 commented 5 months ago

Hi @AndyAyersMS @EgorBo, I checked with version 9.0 preview 4, and the issue is still present and replicable with our project :(

EgorBo commented 5 months ago

> Hi @AndyAyersMS @EgorBo, I checked with version 9.0 preview 4, and the issue is still present and replicable with our project :(

That is sad to hear. Unfortunately, it's unlikely we can diagnose this issue without a repro (and presumably a memory dump won't help much here since it sounds like a silent codegen bug).

Can you check whether disabling TieredCompilation (and/or TieredPGO) helps to reproduce it more reliably? These are the <TieredCompilation>false</TieredCompilation> and <TieredPGO>false</TieredPGO> project properties.
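For readers following along, the two properties named above would typically go into a PropertyGroup in the application's project file. This is a minimal sketch, assuming an SDK-style .csproj; the surrounding project content is hypothetical and not taken from the reporter's application.

```xml
<!-- Hypothetical fragment: merge into the PropertyGroup of an existing SDK-style .csproj -->
<PropertyGroup>
  <!-- Disable tiered compilation so methods are compiled with full optimization on first call -->
  <TieredCompilation>false</TieredCompilation>
  <!-- Disable dynamic profile-guided optimization -->
  <TieredPGO>false</TieredPGO>
</PropertyGroup>
```

If rebuilding is inconvenient, the same switches can usually be set for a single run through the DOTNET_TieredCompilation=0 and DOTNET_TieredPGO=0 environment variables.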

daiger14 commented 5 months ago

@EgorBo Nothing changed after turning off TieredCompilation. @AndyAyersMS I found a function on which setting [MethodImpl(MethodImplOptions.NoOptimization)] avoids the exception. I can share it with you guys; please tell me how I can do this privately. Is it okay to use [MethodImpl(MethodImplOptions.NoOptimization)] in a production environment? Thank you!

AndyAyersMS commented 5 months ago

Yes, you can use [MethodImpl(MethodImplOptions.NoOptimization)] in production if necessary.

To share your code example privately, it is best to open a parallel issue on the .NET Community site: https://developercommunity.microsoft.com/dotnet

Once that is set up you can add private attachments.
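For reference, the [MethodImpl(MethodImplOptions.NoOptimization)] workaround discussed above can be sketched as follows. The class and method names (MatrixOps, SumRow) are hypothetical placeholders, not the reporter's code; the attribute only asks the JIT to skip optimizations for the one method it decorates.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class MatrixOps
{
    // Hypothetical method standing in for the affected code path.
    // NoOptimization tells the JIT not to optimize this method,
    // regardless of build configuration.
    [MethodImpl(MethodImplOptions.NoOptimization)]
    public static double SumRow(double[] row, int count)
    {
        double sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += row[i]; // array access still bounds-checked as usual
        }
        return sum;
    }
}

public static class Program
{
    public static void Main()
    {
        double[] row = { 1.0, 2.0, 3.0, 4.0, 5.0 };
        Console.WriteLine(MatrixOps.SumRow(row, row.Length));
    }
}
```

Because the attribute applies per method, it narrows the cost of the workaround to the affected method instead of disabling optimization for the whole application.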

AndyAyersMS commented 5 months ago

Community issue link: https://developercommunity.microsoft.com/t/Application-crashes-in-release-mode-thro/10673856

AndyAyersMS commented 4 months ago

@daiger14 can we set up an interactive debug session for this?

AndyAyersMS commented 3 months ago

@daiger14 we're still very interested in trying to resolve this.

daiger14 commented 3 months ago

> @daiger14 we're still very interested in trying to resolve this.

Hi @AndyAyersMS, @EgorBo I'm no longer working on this project, but I sent the link and all the information to the principal developer of this application. I appreciate your support.

dotnet-policy-service[bot] commented 2 months ago

This issue has been marked needs-author-action and may be missing some important information.

AndyAyersMS commented 2 months ago

> > @daiger14 we're still very interested in trying to resolve this.
>
> Hi @AndyAyersMS, @EgorBo I'm no longer working on this project, but I sent the link and all the information to the principal developer of this application. I appreciate your support.

Thank you for following up.

AndyAyersMS commented 2 months ago

Still not actionable, so moving to future.

dotnet-policy-service[bot] commented 2 months ago

This issue has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

dotnet-policy-service[bot] commented 1 month ago

This issue will now be closed since it had been marked no-recent-activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the issue, but please note that the issue will be locked if it remains inactive for another 30 days.