Closed filipnavara closed 5 months ago
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in area-owners.md if you want to be subscribed.
FWIW, I expect that there is going to be a long tail of issues related to Win x86 exception handling because of https://github.com/dotnet/runtime/pull/99191#issuecomment-1977373355 . I think we may need to get both CoreCLR and NAOT onto same plan to make Win x86 support in NAOT productized (or rethink the test strategy for Win x86 NAOT).
FWIW, I expect that there is going to be a long tail of issues related to Win x86 exception handling because of #99191 (comment) .
Very likely. I was hoping that there was some exposure of the code during linux-x86 CoreCLR bring up but apparently there are still gaps (possibly specific to the NativeAOT integration of the code).
I think we may need to get both CoreCLR and NAOT onto same plan to make Win x86 support in NAOT productized (or rethink the test strategy for Win x86 NAOT).
I'm focusing on getting the runtime tests passing in this configuration which is attacking the "test strategy" angle. I am open to exploring the CoreCLR+funclets option too. I just expect to run into the same issues regardless of whether we hit them first in CoreCLR or NativeAOT, so it makes sense to investigate them.
exposure of the code during linux-x86 CoreCLR bring up
FWIW, linux-x86 on CoreCLR has not been brought up to our regular quality bar, and it is completely broken at the moment - https://github.com/dotnet/runtime/pull/96150#discussion_r1441676813 .
I am open to exploring the CoreCLR+funclets option too. I just expect to run into the same issues regardless of whether we hit them first in CoreCLR or NativeAOT, so it makes sense to investigate them.
Once you deal with the easy to repro functional issues, there are going to be stress related issues after that (GC reporting, etc.). We do not have the infrastructure like GC stress to find them on native AOT proactively. We depend on regular CoreCLR for that.
I suspect the cause will be switched argument order on the stack:
We do not have the infrastructure like GC stress to find them on native AOT proactively.
(I'd definitely prefer not to revive the GC stress code in NativeAOT.)
So far I tracked it down to the "tryEnd"/"tryLength" offset of EH_CLAUSE_FAULT
being incorrect in the decoded EH info. This causes it to incorrectly match by code offset during the second pass processing.
win-x86: jitdump.x86.txt win-x64: jitdump.x64.txt
X86:
3 EH table entries, 0 duplicate clauses, 3 total EH entries reported to VM
setEHcount(cEH=3)
EH#0: try [0018..002A) handled by [002E..0053) (finally)
EH#1: try [000E..002A) handled by [0060..0071) filter at [0053..0060)
EH#2: try [0004..002B) handled by [007E..008F) filter at [0071..007E)
X64:
3 EH table entries, 0 duplicate clauses, 0 cloned finallys, 3 total EH entries reported to VM
setEHcount(cEH=3)
EH#0: try [0025..0032) handled by [0043..0075) (finally)
EH#1: try [0018..003B) handled by [0090..00B0) filter at [0075..0090)
EH#2: try [000B..003C) handled by [00CB..00EB) filter at [00B0..00CB)
The end offset of try
in EH#0
on x86 is wrong.
I suspect this may be some artefact of FEATURE_EH_CALLFINALLY_THUNKS = 0
.
Confirmed my suspicion. This fixes it:
--- a/src/coreclr/jit/targetx86.h
+++ b/src/coreclr/jit/targetx86.h
@@ -54,8 +54,13 @@
#define FEATURE_EH 1 // To aid platform bring-up, eliminate exceptional EH clauses (catch, filter,
// filter-handler, fault) and directly execute 'finally' clauses.
+#ifdef FEATURE_EH_FUNCLETS
+ #define FEATURE_EH_CALLFINALLY_THUNKS 1 // Generate call-to-finally code in "thunks" in the enclosing EH region,
+ // protected by "cloned finally" clauses.
+#else
#define FEATURE_EH_CALLFINALLY_THUNKS 0 // Generate call-to-finally code in "thunks" in the enclosing EH region,
// protected by "cloned finally" clauses.
+#endif
#define ETW_EBP_FRAMED 1 // if 1 we cannot use EBP as a scratch register and must create EBP based
// frames for most methods
#define CSE_CONSTS 1 // Enable if we want to CSE constants
Several of the runtime tests fail with a similar symptom:
Logs of all failed tests that hit this symptom: testlogs.zip