Open kg opened 8 months ago
Tagging subscribers to 'arch-wasm': @lewing See info in area-owners.md if you want to be subscribed.
Author: | kg |
---|---|
Assignees: | - |
Labels: | `arch-wasm`, `area-VM-threading-mono` |
Milestone: | - |
cc @steveisok
@pavelsavara @kg Ilets figure out what the right behavior here is and make sure it happens
I believe that mono_thread_manage_internal
is waiting for all threads to shutdown. And it has infinite wait time.
https://github.com/dotnet/runtime/blob/330b70cfacc7751ab5ac546ed7138cb8a09c3097/src/mono/mono/metadata/threads.c#L2907-L2911
Or do I read it wrong ?
I would prefer if there was "give up" timeout. Similar to MONO_SLEEP_ABORT_LIMIT
of the GC's stop-the-world.
I believe that
mono_thread_manage_internal
is waiting for all threads to shutdown. And it has infinite wait time.Or do I read it wrong ?
I would prefer if there was "give up" timeout. Similar to
MONO_SLEEP_ABORT_LIMIT
of the GC's stop-the-world.
An abort timeout doesn't make a ton of sense to me, we're already tearing down and a timeout means we failed to tear down correctly. I suppose it would mean that this would potentially "work", but I think it would be papering over real issues with shutdown.
I mean abort (fast with non-zero exit code) is better than deadlock/timeout with no exit code at all and no message.
and we should fix the managed Exit from thread too. I didn't mean to discard that.
As of somewhat recently,
mono_wasm_exit
now callsmono_jit_cleanup
. I believe this is correct.However,
mono_jit_cleanup
callsmono_thread_manage_internal
which (despite its name) appears to be the 'shutdown this thread' API.mono_thread_manage_internal
callsmono_runtime_try_shutdown
and critically, ifmono_runtime_try_shutdown
fails, the current thread then blocks waiting for runtime shutdown to finish. The assumption here is that ifmono_runtime_try_shutdown
failed, that means another thread is currently shutting down the runtime.The problem is that the current thread could be the one shutting down the runtime. If you call
System.Environment.Exit
, it also callsmono_runtime_try_shutdown
in order to begin shutting down the runtime. This also seems correct. Then it attempts to callexit
, which seems to flow through tomono_wasm_exit
, triggering shutdown again and hanging. I don't know of any easy way to fix this.My proposed fix is that the 'are we shutting down the runtime' flag needs to become a 'which thread is shutting down the runtime' variable, so that if a thread attempts to start shutdown a second time, it won't hang. The implementations of S.E.E and
mono_wasm_exit
both seem right so I think the internals just need to be fixed.As a result of this S.E.E. is broken for wasm apps on main right now, at least in my local testing. It was previously broken and then fixed in 8.0 by me and zoltan, so we probably should add a WBT test for it this time when we fix it again. It feels likely that it is at risk of future regressions.