dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Deadlock in Blazor WASM + CosmosClient that happens in .NET 7 but not .NET 6 #79704

Open y2kbugger opened 1 year ago

y2kbugger commented 1 year ago

Is there an existing issue for this?

I have searched the existing issues.

Describe the bug

See: https://github.com/Azure/azure-cosmos-dotnet-v3/issues/3599

When newing up a CosmosClient (either as a DI service or in an async callback on a page), the entire browser tab deadlocks.

This does not occur when I retarget the app to .NET 6.

The Cosmos SDK team would like to help, but I lack the knowledge to pinpoint the deadlock, since Blazor WASM doesn't appear to have profiling (https://github.com/dotnet/runtime/issues/76316).

Expected Behavior

No deadlock

Steps To Reproduce

https://github.com/y2kbugger/ReproduceDeadlockIssueBlazorWasmNet7

Go to the counter page, enter the key

TxzSgDHnH3yz8WsQ0bdoElUPgrPz98Bk0OCGTMG70fEz77dAeCLEVjKmqW71JIgEVvqrN0be4RtQACDbiZyfLQ==

and click the button.

Notice that it makes it past the final print statement before deadlocking.

Exceptions (if any)

No response

.NET Version

7.0.100

Anything else?

No response

dotnet-issue-labeler[bot] commented 1 year ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

javiercn commented 1 year ago

@lewing @thaystg any idea on how to best approach this?

thaystg commented 1 year ago

@y2kbugger does it happen also if you are not debugging?

y2kbugger commented 1 year ago

@thaystg if dotnet run is "debugging enabled", is there a way to run a production release locally?

I've tried configuring it as "Production": image

and the terminal shows that the setting took effect: image

But when I look at the developer console, it still seems like debugging is enabled:

Debugging hotkey: Shift+Alt+D (when application has focus)

javiercn commented 1 year ago

@y2kbugger publish your app and run the published output

thaystg commented 1 year ago

Oh, okay, for me the question is answered, you are running outside VS, so debugger is not attached. Thanks.

y2kbugger commented 1 year ago

And just for completeness I served a published app and it still occurs the same way.

y2kbugger commented 1 year ago

I noticed they have some workarounds for previous deadlock issues that occur when there is no debugger attached: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos/src/CosmosClient.cs#L144 Maybe something changed between .NET 6 and .NET 7, either in the runtime or in aspnetcore, that broke the workaround?

ghost commented 1 year ago

Tagging subscribers to 'arch-wasm': @lewing See info in area-owners.md if you want to be subscribed.

Author: y2kbugger
Assignees: -
Labels: `arch-wasm`, `untriaged`, `area-VM-meta-mono`
Milestone: -

pavelsavara commented 1 year ago

https://throwawaycosmostest1234555asdf.documents.azure.com/ is gone by now. Does the problem still apply to the latest .NET 8 preview?

ghost commented 1 year ago

This issue has been marked needs-author-action and may be missing some important information.

ghost commented 1 year ago

This issue has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

y2kbugger commented 1 year ago

I've stood up a new cosmos client. Here is the new key: QL1cYvWmgVoSKMfwGyKl0Ojro1FwZCpcdpT6cK9cDLzFZJgMVKfVtvfyGr2RtJ6oBk8YkKAzD1cSACDbtGOlTA==

I can confirm it still fails with the latest .NET 7.0.11, without updating the other packages.

pavelsavara commented 1 year ago

image

y2kbugger commented 1 year ago

Make sure to pull the latest commit: https://github.com/y2kbugger/ReproduceDeadlockIssueBlazorWasmNet7/commit/f221795f3d51590a2fef646416cc0a779d63f98d

y2kbugger commented 1 year ago

> I've stood up a new cosmos client. Here is the new key: QL1cYvWmgVoSKMfwGyKl0Ojro1FwZCpcdpT6cK9cDLzFZJgMVKfVtvfyGr2RtJ6oBk8YkKAzD1cSACDbtGOlTA==
>
> can confirm it still fails with the latest dotnet 7.0.11, but without updating other packages

Also confirmed that it still locks in:

image

pavelsavara commented 1 year ago

It's spinning inside the interpreter and never finishes _schedule_background_exec

image

pavelsavara commented 1 year ago

That could be user code spinning, but it's hard to tell what.

y2kbugger commented 1 year ago

Even if you comment out the rest of the example, it prints "Client created" and still hangs:

    // Needs: using System.Runtime.InteropServices; using Microsoft.Azure.Cosmos;
    private async Task IncrementCount()
    {
        // Log the runtime and OS the app is actually running on
        Console.WriteLine(RuntimeInformation.FrameworkDescription);
        Console.WriteLine(RuntimeInformation.OSDescription);

        currentCount++;
        Console.WriteLine("Creating client");

        // Create a new client
        var c = new CosmosClient(
            CosmosEndpoint, CosmosKey,
            new CosmosClientOptions()
            {
                ApplicationName = "Test",
                ConnectionMode = ConnectionMode.Gateway,
            });
        Console.WriteLine($"Client created: `{c}`");
    }

y2kbugger commented 1 year ago

I wish I knew how to debug in the browser with symbols, but it's too far from my day job to spend time figuring that out, as we have a workaround in place for this: a serverless REST layer.

But according to Microsoft, direct access from a browser is a supported use case for Cosmos, and they even have it in their JavaScript SDK, so this should still be fixed, since Blazor is their SPA framework.

Once you set the CORS rules for your Azure Cosmos account, a properly authenticated request made to the service from a browser client will be evaluated to determine whether it is allowed according to the rules you have specified.

https://learn.microsoft.com/en-us/javascript/api/overview/azure/cosmos-readme?view=azure-node-latest

maraf commented 10 months ago

I haven't yet been able to figure out why it's getting blocked.

Avenged commented 7 months ago

I'm facing the same issue using .NET 8 and Blazor WASM

pavelsavara commented 7 months ago

Could you please recompile with <WasmNativeStrip>false</WasmNativeStrip> and share the stack trace of the deadlock?

y2kbugger commented 2 months ago

> Could you please re-compile with <WasmNativeStrip>false</WasmNativeStrip> and share the stack-trace of the deadlock ?

Sorry, I didn't see this. I'll see what I can do tomorrow. I have moved on from the .NET stack for new projects, but this would be interesting to know. Because we are building enterprise apps within a private network, monolithic WASM apps would have been great. Instead we had to create and host an API, which added no value but was a large complexity burden.

Avenged commented 2 months ago

Related to this https://github.com/Azure/azure-cosmos-dotnet-v3/issues/4551

pavelsavara commented 2 months ago

Thinking about it more, I realized that a native stack trace would again be just the stack trace of the Mono interpreter.

Perhaps it would be better to build a debug version of the app and break into it in the C# debugger (VS or VS Code), to see what the managed stack is at the time it's deadlocked.

I wonder if that's some abandoned Task which then never resolves, or if something is spinning in a busy loop. Does the browser UI/DOM respond? Does it render?

Do the JS events tick? An easy test:

    setInterval(() => console.log("UI alive " + new Date().toISOString()), 1000);
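The one-line timer test tells you whether the event loop is ticking at all. A slightly longer sketch can also measure how late each tick arrives, which helps when the tab stalls for a while and then recovers. The names `createStallTracker` and `startHeartbeat` and the 250 ms tolerance are hypothetical choices for illustration, not part of Blazor or the Cosmos SDK.

```javascript
// Pure bookkeeping: given the time of each tick, record how late it was.
// intervalMs is the expected spacing, toleranceMs the allowed jitter.
function createStallTracker(intervalMs, toleranceMs, startTimeMs) {
  let last = startTimeMs;
  const stalls = []; // lateness (ms) of every tick that exceeded the tolerance
  return {
    stalls,
    // Call once per timer tick; returns true if this tick arrived too late.
    tick(nowMs) {
      const late = nowMs - last - intervalMs;
      last = nowMs;
      const stalled = late > toleranceMs;
      if (stalled) stalls.push(late);
      return stalled;
    },
  };
}

// Browser-console helper (hypothetical): logs a heartbeat, and warns when a
// tick was delayed, i.e. the main thread was blocked by synchronous work.
function startHeartbeat(intervalMs = 1000, toleranceMs = 250) {
  const tracker = createStallTracker(intervalMs, toleranceMs, Date.now());
  const timer = setInterval(() => {
    const now = Date.now();
    if (tracker.tick(now)) {
      const late = tracker.stalls[tracker.stalls.length - 1];
      console.warn("UI thread was blocked for about " + late + " ms");
    } else {
      console.log("UI alive " + new Date(now).toISOString());
    }
  }, intervalMs);
  return { stalls: tracker.stalls, stop: () => clearInterval(timer) };
}
```

To use it, run `const hb = startHeartbeat();` in the developer console before clicking the button, and keep the tab in the foreground, since browsers throttle timers in background tabs. If the messages stop completely, something is spinning synchronously on the main thread. If they keep arriving while the app makes no progress, a Task or promise that never completes is the more likely cause.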