Open davidfowl opened 5 years ago
I think we should remove the stack guard.
Is there a current workaround for this (aside from removing the circular references)?
No there’s no workaround.
As part of the migration of components from dotnet/extensions to dotnet/runtime (https://github.com/aspnet/Announcements/issues/411) we will be bulk closing some of the older issues. If you are still interested in having this issue addressed, just comment and the issue will be automatically reactivated (even if you aren't the author). When you do that, I'll page the team to come take a look. If you've moved on or worked around the issue and no longer need this change, just ignore this and the issue will be closed in 7 days.
If you know that the issue affects a package that has moved to a different repo, please consider re-opening the issue in that repo. If you're unsure, that's OK, someone from the team can help!
I have just come across this issue so it still needs looking at.
Paging @dotnet/extensions-migration ! This issue has been revived from staleness. Please take a look and route to the appropriate repository.
Could be similar to https://github.com/dotnet/runtime/issues/35986
@maryamariyan - was this fixed with your deadlock work in 6.0? Moving to 7.0, in case it wasn't. But please close if this is fixed in the latest.
No this isn't fixed.
I'm on net5 using M.E.DI 6.0.0-preview7 and exactly got that deadlock on StackGuard>WaitOne yesterday :-|
I believe I also just fell in this hole, on net6.0. Took quite some digging around to figure out that a circular reference was the culprit, I was certain we'd get an exception from the DI container if that happened and didn't even look at that when I started my investigation 🙈
The fix here is easy but adds overhead. We should prototype something. If anybody wants to look at this: we do a cycle check when building up the callsite, but this only detects cycles in constructor-visible dependencies at startup. It doesn't work for runtime-resolved dependencies, as the CallSiteRuntimeResolver doesn't detect cycles.
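To illustrate why a build-time check misses lazily resolved cycles, here is a minimal sketch in C#. It is not the container's actual code: `DeclaredGraph`, `HasCycle` and the string service names are hypothetical. It models only the dependencies visible in constructor signatures as a graph and runs a depth-first cycle check over it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: service name -> dependencies visible in its constructor signature.
// This is an illustration only, not how CallSiteFactory is implemented.
public static class DeclaredGraph
{
    public static bool HasCycle(IReadOnlyDictionary<string, string[]> deps)
    {
        var done = new HashSet<string>();       // services whose subgraph is fully explored
        var inProgress = new HashSet<string>(); // services on the current depth-first path

        bool Visit(string service)
        {
            if (done.Contains(service)) return false;
            if (!inProgress.Add(service)) return true; // already on the path: cycle

            if (deps.TryGetValue(service, out var children))
            {
                foreach (var child in children)
                {
                    if (Visit(child)) return true;
                }
            }

            inProgress.Remove(service);
            done.Add(service);
            return false;
        }

        foreach (var service in deps.Keys)
        {
            if (Visit(service)) return true;
        }
        return false;
    }
}
```

With `A -> B` and `B -> A` declared as constructor parameters, `HasCycle` returns true. If `A` and `B` each declare only `IServiceProvider` and resolve each other inside their constructor bodies, the declared graph has no cycle and the check passes, even though resolution at runtime is circular.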
Is there a workaround to detect this at runtime? We are working on a big code base that is slowly moving to dependency containers and I can already see this coming back to bite us many times...
Moved from #105900
I think the loop is caused by the lock inside the CallSiteFactory. I might be wrong, but it's a good place to look for the bug.
Edit: seeing this issue has been open for years, I doubt the solution is simple. This interests me. :) I'll download the repo and see what I can find and fix.
Edit2: tried that fix but it doesn't work. Btw it took me 3 hours just to set up the project and pass the tests.
@davidfowl I fixed it in https://github.com/wvpm/runtime/commit/f47317a6bd67ca9f7dedda944bf9f0fad1dd821d + https://github.com/wvpm/runtime/commit/8081e734abb6e5b20e8401b8a514171511340c30. All tests read green on my machine, including the scenario I added in #105900 .
Edit: there was a false cycle detection when resolving the same type with different keys. The 2nd commit fixes that by using ServiceIdentifier instead of Type.
Is there any reason you didn't do the check in the CallSiteRuntimeResolver?
I tried that initially but it doesn't work.
You could either store the state in the CallSiteRuntimeResolver or pass it through the functions.
Storing it in the CallSiteRuntimeResolver caused deadlocks anyway; I think it's due to the static Instance.
Passing it through the functions doesn't work either, because the Resolve method gets called halfway through, creating new state.
Storing the state in the ServiceProviderEngineScope works perfectly.
I should note that I deleted the remote execution tests as I couldn't get them to run and thus haven't tested them.
Wouldn't it be state on the RuntimeResolverContext?
@davidfowl No, because a new context is created for every call to CallSiteRuntimeResolver.Resolve(ServiceCallSite callSite, ServiceProviderEngineScope scope). Only the ServiceProviderEngineScope is passed throughout the process. That's why I added the state there.
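The idea of keeping the cycle-detection state on the object that is passed through every nested resolve can be sketched as follows. This is a single-threaded toy under stated assumptions, not the linked commits and not the real ServiceProviderEngineScope: `ToyScope`, `ToyServiceId` and `Register` are hypothetical names, and the string key stands in for the container's service keys. Identifying a service by type plus key, rather than by type alone, is what avoids reporting a cycle when the same type is resolved under two different keys.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a service identifier: service type plus a key.
public readonly record struct ToyServiceId(Type Type, string Key = "");

// Toy scope. The set of in-progress identifiers lives here because this object
// is the one thing every nested Resolve call shares.
public sealed class ToyScope
{
    private readonly Dictionary<ToyServiceId, Func<ToyScope, object>> _factories = new();
    private readonly HashSet<ToyServiceId> _resolving = new();

    public void Register(ToyServiceId id, Func<ToyScope, object> factory) => _factories[id] = factory;

    public object Resolve(ToyServiceId id)
    {
        // Add returns false when the identifier is already being resolved further up the call chain.
        if (!_resolving.Add(id))
        {
            throw new InvalidOperationException($"Circular dependency detected while resolving {id}.");
        }

        try
        {
            // The factory may call Resolve again on this scope (lazy resolution).
            return _factories[id](this);
        }
        finally
        {
            _resolving.Remove(id);
        }
    }
}
```

In this sketch, two registrations whose factories resolve each other fail fast with an exception instead of recursing forever. The sketch ignores locking and multi-threaded resolution, which a real container would have to handle.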
How should we proceed with implementing the fix?
In 3.0 we rewrote the code generation logic in the DI container to make it resilient to stack overflows for deep object graphs. As a result of this change, what would previously result in a stack overflow exception for lazily resolved circular references now results in a deadlock.