dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Circular reference results in deadlock #36458

Open davidfowl opened 5 years ago

davidfowl commented 5 years ago

In 3.0 we rewrote the code generation logic in the DI container to make it resilient to stack overflows for deep object graphs. As a result of this change, what would previously result in a StackOverflowException for lazily resolved circular references now results in infinite recursion with no maximum number of stacks to stop it: a deadlock.

using System;
using Microsoft.Extensions.DependencyInjection;

namespace WebApplication315
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var services = new ServiceCollection();
            services.AddSingleton<Foo>();
            var serviceProvider = services.BuildServiceProvider();
            var foo = serviceProvider.GetService<Foo>();
        }
    }

    public class Foo
    {
        public Foo(IServiceProvider serviceProvider)
        {
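            // Resolving Foo from inside Foo's own constructor is a lazily
            // created circular reference: with the 3.0 stack-guard rewrite this
            // hangs (deadlock) instead of throwing a StackOverflowException.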
            serviceProvider.GetService<Foo>();
        }
    }
}
davidfowl commented 5 years ago

[image attachment]

davidfowl commented 5 years ago

I think we should remove the stack guard.

MrSmoke commented 4 years ago

Is there a current workaround for this (aside from removing the circular references)?

davidfowl commented 4 years ago

No, there’s no workaround.

ghost commented 4 years ago

As part of the migration of components from dotnet/extensions to dotnet/runtime (https://github.com/aspnet/Announcements/issues/411) we will be bulk closing some of the older issues. If you are still interested in having this issue addressed, just comment and the issue will be automatically reactivated (even if you aren't the author). When you do that, I'll page the team to come take a look. If you've moved on or worked around the issue and no longer need this change, just ignore this and the issue will be closed in 7 days.

If you know that the issue affects a package that has moved to a different repo, please consider re-opening the issue in that repo. If you're unsure, that's OK, someone from the team can help!

tim-fernico commented 4 years ago

I have just come across this issue so it still needs looking at.

ghost commented 4 years ago

Paging @dotnet/extensions-migration ! This issue has been revived from staleness. Please take a look and route to the appropriate repository.

ericstj commented 4 years ago

Could be similar to https://github.com/dotnet/runtime/issues/35986

eerhardt commented 3 years ago

@maryamariyan - was this fixed with your deadlock work in 6.0? Moving to 7.0, in case it wasn't. But please close if this is fixed in the latest.

davidfowl commented 3 years ago

No this isn't fixed.

springy76 commented 3 years ago

I'm on net5 using M.E.DI 6.0.0-preview7 and got exactly that deadlock on StackGuard > WaitOne yesterday :-|

EnCey commented 2 years ago

I believe I also just fell into this hole, on net6.0. It took quite some digging to figure out that a circular reference was the culprit; I was certain we'd get an exception from the DI container if that happened, so I didn't even look at it when I started my investigation 🙈

davidfowl commented 1 year ago

The fix here is easy but adds overhead, so we should prototype something. If anybody wants to look at this: we do a cycle check when building up the call site, but that only detects cycles in constructor-visible dependencies at startup. It doesn't cover runtime-resolved dependencies, because the CallSiteRuntimeResolver doesn't detect cycles.
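
To make that concrete, here is a minimal standalone sketch of the runtime check being described: track which services are currently being resolved and fail fast on re-entry instead of recursing. ResolutionTracker and ServiceKey are hypothetical names used only for illustration, not the container's real internals.

using System;
using System.Collections.Generic;

// Sketch only: ResolutionTracker and ServiceKey are made-up names, not types
// from Microsoft.Extensions.DependencyInjection.
public readonly record struct ServiceKey(Type ServiceType, object? Key);

public sealed class ResolutionTracker
{
    // Services currently being resolved on this scope / logical call chain.
    private readonly HashSet<ServiceKey> _inFlight = new();

    public IDisposable Enter(ServiceKey key)
    {
        if (!_inFlight.Add(key))
        {
            // Re-entering a service that is already being resolved means the
            // graph has a cycle that is only visible at runtime; throwing here
            // replaces the infinite recursion/deadlock with a clear error.
            throw new InvalidOperationException(
                $"Circular dependency detected while resolving '{key.ServiceType}'.");
        }
        return new Exit(this, key);
    }

    private sealed class Exit : IDisposable
    {
        private readonly ResolutionTracker _tracker;
        private readonly ServiceKey _key;

        public Exit(ResolutionTracker tracker, ServiceKey key)
        {
            _tracker = tracker;
            _key = key;
        }

        public void Dispose() => _tracker._inFlight.Remove(_key);
    }
}

Presumably a tracker like this would hang off the scope (as in the commits discussed later in this thread) so that concurrent scopes don't interfere with each other.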

lawrence-laz commented 1 year ago

Is there a workaround to detect this at run time? We are working on a big code base that is slowly moving to dependency containers, and I can already see this coming back to bite us many times...
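
For constructor-injected cycles, a narrower case than the lazy GetService pattern this issue is about, the existing ValidateOnBuild option does surface them when the provider is built. A small illustration, with hypothetical types A and B:

using System;
using Microsoft.Extensions.DependencyInjection;

class A { public A(B b) { } }
class B { public B(A a) { } }

class Program
{
    static void Main()
    {
        var services = new ServiceCollection();
        services.AddSingleton<A>();
        services.AddSingleton<B>();

        // ValidateOnBuild builds the call site for every registration up front,
        // so the constructor-visible cycle A <-> B is reported at build time
        // instead of at first resolution.
        using var provider = services.BuildServiceProvider(new ServiceProviderOptions
        {
            ValidateOnBuild = true
        });
    }
}

It does not help with the repro at the top of this issue, where the constructor only takes IServiceProvider and the cycle is invisible until runtime.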

wvpm commented 3 months ago

Moved from #105900

Possible solution

I think the loop is caused by the lock inside the CallSiteFactory. I might be wrong, but it's a good place to look for the bug.

Edit: seeing that this issue has been open for years, I doubt the solution is simple. This interests me. :) I'll clone the repo and see what I can find and fix.

Edit 2: I tried that fix, but it doesn't work. By the way, it took me three hours just to set up the project and get the tests to pass.

wvpm commented 3 months ago

@davidfowl I fixed it in https://github.com/wvpm/runtime/commit/f47317a6bd67ca9f7dedda944bf9f0fad1dd821d + https://github.com/wvpm/runtime/commit/8081e734abb6e5b20e8401b8a514171511340c30. All tests are green on my machine, including the scenario I added in #105900.

Edit: there was a false cycle detection when resolving the same type with different keys. The second commit fixes that by using ServiceIdentifier instead of Type.
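
To illustrate that false positive: with keyed services, the same service type can legitimately appear twice on one resolution path under different keys, so tracking in-flight resolutions by Type alone would report a cycle where none exists. A small example, assuming the .NET 8 keyed-services APIs; the types here are made up for illustration and are not from the PR:

using System;
using Microsoft.Extensions.DependencyInjection;

public interface IGreeting { string Text { get; } }

public class PlainGreeting : IGreeting
{
    public string Text => "hello";
}

// Depends on a different keyed registration of the *same* service type.
public class LoudGreeting : IGreeting
{
    public LoudGreeting([FromKeyedServices("plain")] IGreeting inner)
    {
        Text = inner.Text.ToUpperInvariant();
    }

    public string Text { get; }
}

class Program
{
    static void Main()
    {
        var services = new ServiceCollection();
        services.AddKeyedSingleton<IGreeting, PlainGreeting>("plain");
        services.AddKeyedSingleton<IGreeting, LoudGreeting>("loud");

        using var provider = services.BuildServiceProvider();

        // Resolving ("loud") pulls in ("plain"): same service type, different keys.
        // Cycle tracking keyed on Type alone would flag this as circular;
        // keying on ServiceIdentifier (type + key) does not.
        Console.WriteLine(provider.GetRequiredKeyedService<IGreeting>("loud").Text);
    }
}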

davidfowl commented 3 months ago

Is there any reason you didn't do the check in the CallSiteRuntimeResolver?

wvpm commented 2 months ago

Is there any reason you didn't do the check in the CallSiteRuntimeResolver?

I tried that initially, but it doesn't work. You could either store the state in the CallSiteRuntimeResolver or pass it through the functions. Storing it in the CallSiteRuntimeResolver caused deadlocks anyway; I think that's due to the static Instance. Passing it through the functions doesn't work either, because the Resolve method gets called halfway through, creating new state.

Storing the state in the ServiceProviderEngineScope works perfectly. I should note that I deleted the remote execution tests, as I couldn't get them to run, and thus haven't tested them.

davidfowl commented 2 months ago

Wouldn't it be state on the RuntimeResolverContext?

wvpm commented 2 months ago

Wouldn't it be state on the RuntimeResolverContext?

@davidfowl No, because a new context is created for every call to CallSiteRuntimeResolver.Resolve(ServiceCallSite callSite, ServiceProviderEngineScope scope). Only the ServiceProviderEngineScope is passed throughout the process. That's why I added the state there.

wvpm commented 2 months ago

How should we proceed with implementing the fix?