Closed dmik closed 3 years ago
Some preliminary thoughts...
This is not quite true. Your code can build suitable versions of the required structures, which is relatively easy. For example, CONTEXTRECORD.ContextFlags controls what data is supplied in CONTEXTRECORD. If there is no useful data, you can skip filling in the CONTEXT_FLOATING_POINT or CONTEXT_INTEGER data. You are going to need to supply CONTEXT_CONTROL data or exceptq will not be able to generate stack traces an such.
The EXCEPTIONREGISTRATIONRECORD pointers can be NULL.
Your code needs to fill in the EXCEPTIONREPORTRECORD. You can use ExceptionInfo[] to pass additional info if that is helpful.
At this point, I am not convinced that adding an additional entry point to exceptq to implement this code is the best solution. The LIBC panic handler is also going to have better knowledge of what needs to be provided. To much existing code depends on the behavior of the existing MYHANDLER, so any new features would need to be via a MYHANDLER_EX entry point.
FWIW, you can pass a pointer to your special messages in one of the registers. This assumes that the register content is not otherwise needed for analysis.
Note that I didn't say I need a new export. I only need supporting NULL in the first three parameters and a string in the fourth parameter in the existing MYHANDLER export. I don't see how this could break backward compatibility provided that no existing code should ever call MYHANDLER as such. And it seems easier to me than building up EXCEPTIONREPORTRECORD and CONTEXTRECORD in each app that needs to call MYHANDLER directly and not from the exception handler. But OK, I may also do it myself from LIBC for now since I basically need it only in once place.
Regarding the message. Yes, when EXCEPTIONREPORTRECORD is constructed by the caller (and I actually have to do that for DosRaiseException
already), passing the custom message string in ExceptionInfo
sounds like the best option. But does MYHANDLER already support treating it as a message string in response to EXCEPTQ_DEBUG_EXCEPTION? If so, that would be perfect. If not yet, that would be perfect to have :)
I will recall what I need the string for. Right now, with abusing a 80 chars report info string, I have this in .TRP:
LIBC PANIC!! fmutex deadlock: Recursive mutex!
0x00020050: Owner=0xea280001 Se
i.e. the LIBC panic message is truncated. And I want something like this:
LIBCn version 0.1.6 / LIBCx version 0.6.9
LIBC PANIC!!
fmutex deadlock: Recursive mutex!
0x00020050: Owner=0xea280001 Self=0xea280001 fs=0x3 flags=0x0 hev=0x0001003b
Desc="test.c"
Note that the first line is that report info string set by the D
option and the rest is what I suggest to pass either in ExceptionInfo
or as a fourth argument.
The idea of abusing registers for the string I really don't like.
Passing the string pointer in ExceptionInfo is definitely a better idea that abusing some poor register. Abusing a registers was just the best idea I had a the moment.
I'm not going to modify exceptq to abuse the existing MYHANDLER parameters. The existing code assumes that the parameters are passed as defined by kernel. Also, MYHANDLER is far from re-entrant. If you were to call MYHANDLER directly, there is always the possibility that the kernel will call it concurrently.
Since I don't know who might already be using the EXCEPTQ_DEBUG_EXCEPTION and how they might be using the ExceptionInfo array, I will define an EXCEPTQ_DEBUG_EXCEPTION_EX that will allow up to 4 optional pointers to be passed in the array.
ExceptionInfo[0] if not NULL, will point to a string that will print in addition to the existing ReportInfo string.
If you think it will be useful for exceptions you raise, the other ExceptqInfo array entries, if provided can point to memory areas that will be printed in a format similar to the other memory dumps. For example:
Memory addressed by ExceptionInfo[1] (00062368) for 184 bytes
This is a feature that I've often wished I had at times. The registers do not always point to all the relevant memory areas of interest. I typically fall back to a memory dump when this happens, but a .trp file is generally easier to use. It's not possible to implement this easily for exceptions generated by the kernel, but it is relatively easy to implement for EXCEPTQ_DEBUG_EXCEPTION_EX, if you think it will be useful. Let me know if you want this implemented.
Your code will fill the EXCEPTIONREPORTRECORD and call DosRaiseException.
Ok, with some assembly I managed the direct MYHANDLER call approach work and now I get a .TRP report in all cases w/o even needing to install EXCEPTQ handler per-thread! Good. When I call MYHANDLER from a LIBC exception handler (where we have all three exception handler arguments non-NULL), all looks exactly like it was with per-thread EXCEPTQ handler installation.
However, when I construct EXCEPTIONREPORTRECORD by hand, provide a partial CONTEXTRECORD with CONTEXT_CONTROL | CONTEXT_SEGMENTS (and a NULL EXCEPTIONREGISTRATIONRECORD), there are some glitches. This is how it looks now:
______________________________________________________________________
Exception Report - created 2021/02/23 01:13:34
______________________________________________________________________
LIBC: Killed by SIGABRT
Hostname: novator
OS2/eCS Version: 2.45
# of Processors: 4
Physical Memory: 3260 mb
Virt Addr Limit: 3072 mb
Exceptq Version: 7.11.5-shl BETA8 (Jun 1 2020 18:37:02)
______________________________________________________________________
Exception 71785158 - Exceptq Debug Request
______________________________________________________________________
Process: D:\CODING\TESTS\LIBC\FMUTEX\TEST.EXE (02/23/2021 01:10:29 5,756)
PID: D8BE (55486)
TID: 01 (1)
Priority: 200
Filename: N/A
Cause: Invalid execution address 00000000
______________________________________________________________________
Registers
______________________________________________________________________
ESP : 0012FA40 EBP : 0012FC08 EIP : 1D2FA080 EFLG : 00000246
CS : 005B CSLIM: FFFFFFFF SS : 0053 SSLIM: FFFFFFFF
EAX : not a valid address
EBX : not a valid address
ECX : not a valid address
EDX : not a valid address
ESI : not a valid address
EDI : not a valid address
______________________________________________________________________
Stack Info for Thread 01
______________________________________________________________________
Size Base ESP Max Top
00100000 00130000 -> 0012FA40 -> 0012D000 -> 00030000
______________________________________________________________________
Call Stack
______________________________________________________________________
EBP Address Module Obj:Offset Nearest Public Symbol
-------- --------- -------- ------------- -----------------------
Trap -> 1D2FA080 LIBCN0 0001:000FA080 b_panic.c#547 ___libc_Back_panicV + 15B4 0001:000F8ACC (b_panic.obj)
...
First, There is no Filename and Cause is weird. This is because I pass NULL as XcptRepRec.ExceptionAddress
. I can put the current EIP there and then it starts looking like this:
______________________________________________________________________
Exception 71785158 - Exceptq Debug Request
______________________________________________________________________
Process: D:\CODING\TESTS\LIBC\FMUTEX\TEST.EXE (02/22/2021 22:46:22 4,733)
PID: CCDE (52446)
TID: 01 (1)
Priority: 200
Filename: D:\CODING\LIBC\MASTER-BUILD\OPT\OMF\LIBCN0.DLL (02/22/2021 23:01:04 3,620,291)
Address: 005B:1D2F9113 (0001:000F9113)
Cause: Program requested an Exceptq debug report
______________________________________________________________________
Failing Instruction
______________________________________________________________________
1D2F9107 MOV WORD [EBP-0x54], FS (8c65 ac)
1D2F910A MOV WORD [EBP-0x50], ES (8c45 b0)
1D2F910D MOV WORD [EBP-0x4c], DS (8c5d b4)
1D2F9110 MOV [EBP-0x30], EBP (896d d0)
1D2F9113 >MOV DWORD [EBP-0x2c], 0x1d2f9113 (c745 d4 13912f1d)
1D2F911A MOV WORD [EBP-0x28], CS (8c4d d8)
1D2F911D PUSHF (9c)
1D2F911E POP DWORD [EBP-0x24] (8f45 dc)
______________________________________________________________________
I.e. it refers to LIBCN0.DLL now and provides disassembly. However, this disassembly is dumb — it is always my code composing CONTEXTRECORD from the current register values, pretty useless for debugging. I prefer to not have disassembly in such a case. I guess EXCEPTQ should just accept ExceptionAddress
being NULL when ExceptionNum
is EXCEPTQ_DEBUG_EXCEPTION or EXCEPTQ_DEBUG_EXCEPTION_EX and don't bitch about a missing Filename and an invalid execution address (i.e. just omit the block containing the Filename and Cause lines).
Second, there is a block of integer registers (carrying not a valid address
since they are NULL in CONTEXTRECORD) even though CONTEXT_INTEGER is not passed in ContextFlags
. I guess EXCEPTQ should skip this block if no CONTEXT_INTEGER is present.
@StevenLevine regarding your questions. Yes, the EXCEPTQ_DEBUG_EXCEPTION_EX solution looks totally fine for me.
ExceptionInfo[0] if not NULL, will point to a string that will print in addition to the existing ReportInfo string.
More than enough! Just don't put a length limit on it please, so that LIBC panic messages would fit.
If you think it will be useful for exceptions you raise, the other ExceptqInfo array entries, if provided can point to memory areas that will be printed in a format similar to the other memory dumps.
Yes, this really makes sense. I don't have a current use case for it but I surely might. LIBC/LIBCx has a lot of info one could have a sneak peek at in case of various fatalities. So please do that.
What I would also like to have (I already mentioned this somewhere) is a way to generate a very short trap report consisting basically of the call stack trace and nothing more (useful for tracing function entry points etc). Something like that:
______________________________________________________________________
Exception Report - created 2021/02/23 01:15:51
______________________________________________________________________
LIBC PANIC!! fmutex deadlock: Recursive mutex!
0x00020050: Owner=0xd8db0001 Se
Hostname: novator
OS2/eCS Version: 2.45
# of Processors: 4
Physical Memory: 3260 mb
Virt Addr Limit: 3072 mb
Exceptq Version: 7.11.5-shl BETA8 (Jun 1 2020 18:37:02)
______________________________________________________________________
Exception 71785158 - Exceptq Debug Request
______________________________________________________________________
Process: D:\CODING\TESTS\LIBC\FMUTEX\TEST.EXE (02/23/2021 01:15:24 7,073)
PID: D8DB (55515)
TID: 01 (1)
Priority: 200
______________________________________________________________________
Call Stack
______________________________________________________________________
EBP Address Module Obj:Offset Nearest Public Symbol
-------- --------- -------- ------------- -----------------------
Trap -> 1D2FA080 LIBCN0 0001:000FA080 b_panic.c#547 ___libc_Back_panicV + 15B4 0001:000F8ACC (b_panic.obj)
0012FDD8 1D2FA60E LIBCN0 0001:000FA60E b_panic.c#151 ___libc_Back_panic + 26 0001:000FA5E8 (b_panic.obj)
0012FDF8 1D20A92F LIBCN0 0001:0000A92F fmutex.c#381 ___fmutex_request_internal + 3DF 0001:0000A550 (D:\Coding\libc\master\src\emx\src\lib\process\fmutex.c)
0012FE68 000100F6 TEST 0001:000000F6 fmutex.h#78 _main - 9E 0001:00000194 (test.c)
0012FE98 0001025A TEST 0001:0000025A test.c#63 main + C6 0001:00000194 (test.c)
0012FF60 00010027 TEST 0001:00000027 crt0.s#90 __text + 27 0001:00000000 (D:\Temp\cc4GANW3.s)
0012FF84 1D1EF751 LIBCX0 0001:0000F751 ___init_app + 11 0001:0000F740 (main.obj)
0012FFE0 1D2339CB LIBCN0 0001:000339CB appinit.s#16 ___init_app + B 0001:000339C0 (appinit.obj)
______________________________________________________________________
End of Exception Report - report took 155 ms to generate
______________________________________________________________________
Given that EXCEPTQ is already perfectly capable of appending to existing .TRP files, this could create a nice tracing report even for very frequent entry points.
This may be configured using a new option letter for SetExceptqOptions
(and EXCEPTQ env. var). For instance:
TTT
— suppress function parameter parsing while walking the call stackTTTT
— suppress all sections except header, Call Stack
and footer (i.e. just like in the example above).And also something I already requested several times: a single path for all .TRP files (set via e.g. set EXCEPTQ_PATH=C:\var\log\exceptq
). This is not always useful for the developer but it is extremely useful for end user's systems. And for the developer system as well — for apps which the developer currently is not developing (and for the app being developed, the developer can always reset EXCEPTQ_PATH to have .TRP . in the current directory, as it is now).
Note that this will require changes to the .TRP file name. Currently it is PPPP_TT.TRP where PPPP is the hex PID and TT is the TID. While it should be something like EXENAME-TIMESTAMP-PPPP_TT.TRP where EXENAME is the name of the executable and TIMESTAMP is the value of QSV_TIME_LOW. This is to make sure that .TRP reports are not overwritten because of PID clashes after reboots or such (note that we don't need SV_TIME_HIGH there because it's very unlikely that logs will be stored for THAT long :)
This is how LIBCx log file names in C:/var/log/libcx
look as an example (all threads go to a single file so there is no TID):
QTWEBENGINEPROCESS-602c4fb5-a60a.log
TEE-602c4fb6-a604.log
Also note that if you do that, EXCEPTQ should get the timestamp once for a given process (upon the first exception) and then use it for all further exceptions in that process on all of its threads. It is important both for proper grouping and for when multiple recoverable exceptions on the same thread occur (e.g. via EXCEPTQ_DEBUG_EXCEPTION) since they should end up in the same .TRP file.
This is a spin off of https://github.com/bitwiseworks/libc/issues/96#issuecomment-778613043.
Currently, trap report generation requires an EXCEPTQ exception handler to be installed on all app's threads (which is normally done by linking to LIBCx which overrides
main
and_beginthread
to do so). However, installing an exception handler for just that is not necessary per se and in fact is an abuse of the exception handler mechanism given that LIBC has its own exception handler already and that we also need those reports outside the context of exception handling at all (i.e. in a default handler of user-generated POSIX signals like SIGABRT or in own LIBC assertions where__libc_Back_panic
is called with NULL aspCtx
).The EXCEPTQ DLL already exports MYHANDLER, a trap report generation workhorse. But that one needs real EXCEPTIONREPORTRECORD, EXCEPTIONREGISTRATIONRECORD and CONTEXTRECORD pointers which we don't have outside the exception handler context.
I've already made a feature request to @StevenLevine to support NULL for those parameters. So it's currently a showstopper for this ticket.
I also asked Steven to support passing an additional zero-terminated C string of an arbitrary length via a fourth parameter of MYHANDLER and put that string into the generated .TRP file for more informativeness. The idea is to put
__libc_Back_panic
's user message into this string as it usually contains a very useful information about the trap/assertion reason.For now, .TRP are already generated in all mentioned situations (done within #96) but they require the EXCEPTQ handler to be installed and the user message length is limited to 80 chars because we abuse the EXCEPTQ "report info" string for that which is limited like that as it is intended for the app/library version information. Once Steven implements that, we will indeed put the LIBC version information in that string instead and the panic message will be passed as a fourth parameter of MYHANDLER.