mstange / pdb-addr2line

A rust crate to symbolicate addresses from PDBs, like addr2line. Uses the `pdb` crate.
https://docs.rs/pdb-addr2line
Apache License 2.0

Strange symbols in ntkrnlmp.pdb #46

Closed mstange closed 3 years ago

mstange commented 3 years ago
% curl -o ntkrnlmp.pdb -L "https://msdl.microsoft.com/download/symbols/ntkrnlmp.pdb/1B4A6F5E0766C552C90710C8ACC0295C1/ntkrnlmp.pdb"
% pdb-addr2line -fC -e ntkrnlmp.pdb 0x46653b
??_C@_1DA@HOOFFHMM@?$AAK?$AAe?$AAr?$AAn?$AAe?$AAl?$AA?9?$AAM?$AAU?$AAI?$AA?9?$AAL?$AAa?$AAn?$AAg@FNODOBFM@
??:?

It's spitting out an unrelated string symbol, which happens to be in the .text section, for some reason.

mstange commented 3 years ago

Sections:

index, rwx, virtual address range, name
  0x1, r--, 0x00001000-0x000c9000 .rdata
  0x2, r--, 0x000c9000-0x00130e00 .pdata
  0x3, r--, 0x00131000-0x00133200 .idata
  0x4, r--, 0x00134000-0x0014ce00 .edata
  0x5, r--, 0x0014d000-0x0014d200 PROTDATA
  0x6, r--, 0x0014e000-0x00156c00 GFIDS
  0x7, r--, 0x00157000-0x00157000 Pad1
  0x8, r-x, 0x00200000-0x005c6a00 .text
  0x9, r-x, 0x005c7000-0x0098be00 PAGE
  0xa, r-x, 0x0098c000-0x009b1200 PAGELK
  0xb, r-x, 0x009b2000-0x009b2600 POOLCODE
  0xc, r-x, 0x009b3000-0x009b8c00 PAGEKD
  0xd, r-x, 0x009b9000-0x009eb200 PAGEVRFY
  0xe, r-x, 0x009ec000-0x009ee600 PAGEHDLS
  0xf, r-x, 0x009ef000-0x009f5a00 PAGEBGFX
 0x10, r-x, 0x009f6000-0x00a0f600 INITKDBG
 0x11, r-x, 0x00a10000-0x00a11800 TRACESUP
 0x12, r-x, 0x00a12000-0x00a14400 KVASCODE
 0x13, r-x, 0x00a15000-0x00a15800 RETPOL
 0x14, r-x, 0x00a16000-0x00a18600 MINIEX
 0x15, r-x, 0x00a19000-0x00aa3c00 INIT
 0x16, r-x, 0x00aa4000-0x00aa4000 Pad2
 0x17, rw-, 0x00c00000-0x00c13000 .data
 0x18, rw-, 0x00cfa000-0x00cfb400 ALMOSTRO
 0x19, rw-, 0x00d22000-0x00d22200 CACHEALI
 0x1a, rw-, 0x00d2c000-0x00d2d800 PAGEDATA
 0x1b, rw-, 0x00d3f000-0x00d47000 PAGEVRFD
 0x1c, rw-, 0x00d55000-0x00d55800 INITDATA
 0x1d, rw-, 0x00d6d000-0x00d6d000 Pad3
 0x1e, rw-, 0x00e00000-0x00e01e00 CFGRO
 0x1f, rw-, 0x00e02000-0x00e02000 Pad4
 0x20, r--, 0x01000000-0x01033200 .rsrc
 0x21, r--, 0x01034000-0x0103da00 .reloc

The symbol it found is at 0x40f5e0, which is a long way before 0x46653b.

8   20f5e0  public  ??_C@_1DA@HOOFFHMM@?$AAK?$AAe?$AAr?$AAn?$AAe?$AAl?$AA?9?$AAM?$AAU?$AAI?$AA?9?$AAL?$AAa?$AAn?$AAg@FNODOBFM@
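The record above gives the symbol's location as section index 8 plus an offset within that section, so the 0x40f5e0 figure comes from adding the offset to the start of .text in the section listing. Here is a minimal sketch of that arithmetic. The `Section` struct and both helper functions are hypothetical names made up for illustration, not the `pdb` crate's API, and only three rows of the listing are copied in.

```rust
/// Hypothetical, simplified section record: 1-based index plus address range.
struct Section {
    index: u16,
    start: u32,
    end: u32,
    name: &'static str,
}

/// A few rows copied from the section listing in this issue.
const SECTIONS: &[Section] = &[
    Section { index: 0x1, start: 0x0000_1000, end: 0x000c_9000, name: ".rdata" },
    Section { index: 0x8, start: 0x0020_0000, end: 0x005c_6a00, name: ".text" },
    Section { index: 0x9, start: 0x005c_7000, end: 0x0098_be00, name: "PAGE" },
];

/// Convert a (section index, offset) pair to a relative virtual address.
/// Returns None for an unknown section or on arithmetic overflow.
fn section_offset_to_rva(sections: &[Section], index: u16, offset: u32) -> Option<u32> {
    let section = sections.iter().find(|s| s.index == index)?;
    section.start.checked_add(offset)
}

/// Find the name of the section whose address range contains `rva`.
fn section_containing(sections: &[Section], rva: u32) -> Option<&'static str> {
    sections
        .iter()
        .find(|s| s.start <= rva && rva < s.end)
        .map(|s| s.name)
}
```

With these rows, `section_offset_to_rva(SECTIONS, 0x8, 0x20f5e0)` yields 0x40f5e0, and `section_containing` places both that address and 0x46653b in .text.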

There are many procedure symbols between 0x40f5e0 and 0x46653b, for example ExpTimeZoneDpcRoutine$filt$3 at 0x413351. But then there's a long symbol-free stretch, and 0x46653b falls into that stretch. There is actual program code here; the functions are just not named. For example, the function that contains 0x46653b starts at 0x46649b. This can be seen by running Hopper on the corresponding executable:

% curl -o ntoskrnl.exe -L "https://msdl.microsoft.com/download/symbols/ntoskrnl.exe/7A04BFB21046000/ntoskrnl.exe"

mstange commented 3 years ago

There is a SeparatedCode symbol at 0x46649c!

mstange commented 3 years ago

And its parent_offset points to 0x1a6ad0 which is EtwpLogContextSwapEvent.
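A rough sketch of how such a record could be used to name the address: find the separated-code block that contains it, then follow `parent_offset` to the parent procedure. Everything here is an assumption for illustration. The `SeparatedCode` struct is a simplified stand-in rather than the `pdb` crate's type, the block length of 0x100 is made up, and `parent_offset` is treated as an opaque lookup key with no conversion between section offsets and RVAs.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of a separated-code record: a block of code
/// that lives away from its parent function's main body.
struct SeparatedCode {
    start: u32,         // address where the block begins
    len: u32,           // length in bytes (assumed known for this sketch)
    parent_offset: u32, // offset of the parent procedure, used as a lookup key
}

/// Resolve `addr` to the name of the parent procedure of the separated-code
/// block that contains it, if any.
fn resolve_via_separated_code<'a>(
    blocks: &[SeparatedCode],
    procedures: &HashMap<u32, &'a str>,
    addr: u32,
) -> Option<&'a str> {
    // The subtraction cannot underflow because `b.start <= addr` is checked first.
    let block = blocks
        .iter()
        .find(|b| b.start <= addr && addr - b.start < b.len)?;
    procedures.get(&block.parent_offset).copied()
}

/// Builds sample data from the values in this thread.
/// The block length 0x100 is invented for illustration.
fn example() -> Option<&'static str> {
    let blocks = [SeparatedCode {
        start: 0x46649c,
        len: 0x100,
        parent_offset: 0x1a6ad0,
    }];
    let mut procedures = HashMap::new();
    procedures.insert(0x1a6ad0, "EtwpLogContextSwapEvent");
    resolve_via_separated_code(&blocks, &procedures, 0x46653b)
}
```

Under those assumptions, `example()` resolves 0x46653b to EtwpLogContextSwapEvent.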

mstange commented 3 years ago

#49 fixes the first part of this: we will now return no function at all, rather than the wrong function ??_C@_1DA@HOO....
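One way to get "no function rather than the wrong function" is to accept the nearest preceding symbol only when the address lies inside that symbol's extent. The sketch below illustrates that idea only. It is not taken from the actual change, the `Symbol` struct is hypothetical, and the end address and placeholder name in the sample data are assumptions.

```rust
/// Hypothetical symbol record with an explicit, exclusive end address.
struct Symbol {
    start: u32,
    end: u32,
    name: &'static str,
}

/// Look up `addr` in `symbols`, which must be sorted by `start`.
/// Returns None when `addr` falls in a gap after the nearest preceding symbol.
fn lookup(symbols: &[Symbol], addr: u32) -> Option<&'static str> {
    // Number of symbols that start at or before `addr`.
    let idx = symbols.partition_point(|s| s.start <= addr);
    // The candidate is the last of those, if there is one.
    let candidate = symbols.get(idx.checked_sub(1)?)?;
    if addr < candidate.end {
        Some(candidate.name)
    } else {
        None
    }
}

/// Sample data: the string symbol from this issue under a placeholder name.
/// The end address is an assumption for illustration.
fn sample_symbols() -> Vec<Symbol> {
    vec![Symbol {
        start: 0x40f5e0,
        end: 0x40f610,
        name: "string symbol at 0x40f5e0",
    }]
}
```

With this data, `lookup(&sample_symbols(), 0x46653b)` returns `None`, whereas a plain nearest-preceding search would have reported the string symbol.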

mstange commented 3 years ago

And #50 makes us find the correct function name in the example.