Open lambdageek opened 3 years ago
Tagging subscribers to this area: @CoffeeFlux See info in area-owners.md if you want to be subscribed.
Author: | lambdageek |
---|---|
Assignees: | lambdageek |
Labels: | `area-VM-meta-mono` |
Milestone: | 6.0.0 |
Repro https://github.com/lambdageek/repro-gh-21032
(to run with mono on dotnet/runtime, need to cd X && dotnet publish --self-contained -r osx-x64
and then replace libcoreclr and System.Private.CoreLib by ones from a build of mono from dotnet/runtime)
This is hard https://github.com/mono/mono/issues/21032#issuecomment-827274328
It's essentially equivalent to having class loader levels - in particular being able to talk, outside of the loader lock, about a class without its parent being set.
One other, more targeted approach that is worth considering is to have some way for class creating and class initialization to return a "I need to load an assembly, please load it and retry" value. That means that instead of always just calling the loader and failing the class (setting MonoClass:has_failure
) if it doesn't succeed, we have a flag on a class that says "I depend on something", and checking for that in many places in class loading.
Another stack trace with the same issue https://github.com/dotnet/runtime/issues/62419 - sporadic hangs in the System.Text.RegularExpressions.Tests
testsuite in Release mode with interpreter.
Another idea (particularly for mono_class_create_from_typedef
) is to pre-trigger assembly loading. We know we will need to look at the parent type and the interfaces of the class we want to instantiate. So we can go look at the corresponding metadata and figure out if we need to access other assemblies and trigger the assembly loading before we take the loader lock in mono_class_create_from_typedef
before we create the initial MonoClass*
object.
I'm not 100% sure if this is workable for generics (the parent might be a generic instance which mentions the type itself) but it's not too different, in principle from how we form the MonoImageSet*
in order to properly parent a generic instance.
I'm also not sure if there's a way to do this efficiently without slowing down app loading.
Another option (particularly for #93686) is to postpone firing managed post-load callbacks until we are out of the loader lock. (Although this also might not be workable: it's a recursive lock, so we're not really sure how long we're postponing)
Another idea (particularly for
mono_class_create_from_typedef
) is to pre-trigger assembly loading. We know we will need to look at the parent type and the interfaces of the class we want to instantiate. So we can go look at the corresponding metadata and figure out if we need to access other assemblies and trigger the assembly loading before we take the loader lock inmono_class_create_from_typedef
before we create the initialMonoClass*
object.
So I started prototyping this, and I think I see the place where I will run into problems. The issue is that when we're parsing a generic instance type from metadata, we're already creating MonoType
s for all the type arguments (and the MonoType
for a reference type already contains a MonoClass*
- which means that the if the preloader wants to parse a generic inst it will call right back into class loading). (And that might be fine - except that we need to avoid cycles, so we need to pass the preloader's "visited typedefs" set through the metadata parser - and I was hoping to avoid adding a thread-local set).
So one way to work around this is to generalize the metadata parser to do some kind of callback/visitor mechanism (or just copypasta the metadata parser).
I'm going to try to prototype this a little bit more just to see if I can get something working, but I'm not really happy with this design.
The right solution still feels like having some kind of a "this MonoClassDef
has been barebones reserved, but all we know is its parent MonoImage
and type definition token" class loader level. That way we can use the normal parser (which would put classes into this state initially and then we would take the global lock when we go to resolve the parents at which point we wouldn't need to call back into managed at all)
The original issue is https://github.com/mono/mono/issues/21032 Adding a copy here to make sure we do this work for .NET 6, too. The assembly loading algorithm is different, but the same situation is possible
Sample stack trace from VS Mac https://gist.github.com/Therzok/df71c056b0dabc8a2f3ffb6f1c264488
Thread 8 calls
mono_class_create_from_typedef
which then needs to get the parent type and callsmono_class_get_checked
... etc... until we need to load an assembly, and when we can't find it, we call the managed assembly resolver event frommono_domain_assembly_postload_search
.The managed code tries to lock some managed data structure, but the monitor is already locked, so it waits.
Meanwhile other threads that want to do any type creation are deadlocked because they can't take the loader lock.