Open 0xBEEEF opened 4 years ago
Thanks. This type of example is really helpful for us.
For completeness, here are all the classes contained in the program, and how Visual C++ laid them out in the compiled output.
class Base size(4):
+---
0 | data_
+---
class Der1 size(12):
+---
0 | {vfptr}
4 | {vbptr}
+---
+--- (virtual base Base)
8 | data_
+---
Der1::$vftable@:
| &Der1_meta
| 0
0 | &Der1::TestFunctionA
1 | &Der1::TestFunctionB
Der1::$vbtable@:
0 | -4
1 | 4 (Der1d(Der1+4)Base)
Der1::TestFunctionA this adjustor: 0
Der1::TestFunctionB this adjustor: 0
vbi: class offset o.vbptr o.vbte fVtorDisp
Base 8 4 4 0
class Der2 size(12):
+---
0 | {vfptr}
4 | {vbptr}
+---
+--- (virtual base Base)
8 | data_
+---
Der2::$vftable@:
| &Der2_meta
| 0
0 | &Der2::TestFunctionD
Der2::$vbtable@:
0 | -4
1 | 4 (Der2d(Der2+4)Base)
Der2::TestFunctionD this adjustor: 0
vbi: class offset o.vbptr o.vbte fVtorDisp
Base 8 4 4 0
class Join size(20):
+---
0 | +--- (base class Der1)
0 | | {vfptr}
4 | | {vbptr}
| +---
8 | +--- (base class Der2)
8 | | {vfptr}
12 | | {vbptr}
| +---
+---
+--- (virtual base Base)
16 | data_
+---
Join::$vftable@Der1@:
| &Join_meta
| 0
0 | &Join::TestFunctionA
1 | &Join::TestFunctionB
Join::$vftable@Der2@:
| -8
0 | &Join::TestFunctionD
Join::$vbtable@Der1@:
0 | -4
1 | 12 (Joind(Der1+4)Base)
Join::$vbtable@Der2@:
0 | -4
1 | 4 (Joind(Der2+4)Base)
Join::TestFunctionA this adjustor: 0
Join::TestFunctionB this adjustor: 0
Join::TestFunctionD this adjustor: 8
vbi: class offset o.vbptr o.vbte fVtorDisp
Base 16 4 4 0
class Base2 size(4):
+---
0 | data_
+---
class The1 size(36):
+---
0 | {vfptr}
4 | {vbptr}
8 | ?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@ data2_
+---
+--- (virtual base Base2)
32 | data_
+---
The1::$vftable@:
| &The1_meta
| 0
0 | &The1::NewTestFunctionA
1 | &The1::NewTestFunctionB
The1::$vbtable@:
0 | -4
1 | 28 (The1d(The1+4)Base2)
The1::NewTestFunctionA this adjustor: 0
The1::NewTestFunctionB this adjustor: 0
vbi: class offset o.vbptr o.vbte fVtorDisp
Base2 32 4 4 0
class The2 size(36):
+---
0 | {vfptr}
8 | {vbptr}
16 | data2_
| <alignment member> (size=4)
24 | data3_
+---
+--- (virtual base Base2)
32 | data_
+---
The2::$vftable@:
| &The2_meta
| 0
0 | &The2::NewTestFunctionD
The2::$vbtable@:
0 | -8
1 | 24 (The2d(The2+8)Base2)
The2::NewTestFunctionD this adjustor: 0
vbi: class offset o.vbptr o.vbte fVtorDisp
Base2 32 8 4 0
class Join2 size(100):
+---
0 | +--- (base class The1)
0 | | {vfptr}
4 | | {vbptr}
8 | | ?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@ data2_
| +---
32 | +--- (base class The2)
32 | | {vfptr}
40 | | {vbptr}
48 | | data2_
| | <alignment member> (size=4)
56 | | data3_
| +---
64 | myData
68 | ?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@ string
92 | floatingPoint
+---
+--- (virtual base Base2)
96 | data_
+---
Join2::$vftable@The1@:
| &Join2_meta
| 0
0 | &Join2::NewTestFunctionA
1 | &Join2::NewTestFunctionB
Join2::$vftable@The2@:
| -32
0 | &Join2::NewTestFunctionD
Join2::$vbtable@The1@:
0 | -4
1 | 92 (Join2d(The1+4)Base2)
Join2::$vbtable@The2@:
0 | -8
1 | 56 (Join2d(The2+8)Base2)
Join2::NewTestFunctionA this adjustor: 0
Join2::NewTestFunctionB this adjustor: 0
Join2::NewTestFunctionD this adjustor: 32
vbi: class offset o.vbptr o.vbte fVtorDisp
Base2 96 4 4 0
class SuperJoin size(124):
+---
0 | +--- (base class Join2)
0 | | +--- (base class The1)
0 | | | {vfptr}
4 | | | {vbptr}
8 | | | ?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@ data2_
| | +---
32 | | +--- (base class The2)
32 | | | {vfptr}
40 | | | {vbptr}
48 | | | data2_
| | | <alignment member> (size=4)
56 | | | data3_
| | +---
64 | | myData
68 | | ?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@ string
92 | | floatingPoint
| +---
+---
+--- (virtual base Base)
96 | data_
+---
100 | (vtordisp for vbase Join)
+--- (virtual base Join)
104 | +--- (base class Der1)
104 | | {vfptr}
108 | | {vbptr}
| +---
112 | +--- (base class Der2)
112 | | {vfptr}
116 | | {vbptr}
| +---
+---
+--- (virtual base Base2)
120 | data_
+---
SuperJoin::$vftable@The1@:
| &SuperJoin_meta
| 0
0 | &SuperJoin::NewTestFunctionA
1 | &SuperJoin::NewTestFunctionB
SuperJoin::$vftable@The2@:
| -32
0 | &SuperJoin::NewTestFunctionD
SuperJoin::$vbtable@The1@:
0 | -4
1 | 116 (SuperJoind(The1+4)Base2)
2 | 92 (SuperJoind(SuperJoin+4)Base)
3 | 100 (SuperJoind(SuperJoin+4)Join)
SuperJoin::$vbtable@The2@:
0 | -8
1 | 80 (SuperJoind(The2+8)Base2)
SuperJoin::$vftable@Der1@:
| -104
0 | &(vtordisp) SuperJoin::TestFunctionA
1 | &(vtordisp) SuperJoin::TestFunctionB
SuperJoin::$vftable@Der2@:
| -112
0 | &(vtordisp) SuperJoin::TestFunctionD
SuperJoin::$vbtable@Der1@:
0 | -4
1 | -12 (SuperJoind(Der1+4)Base)
SuperJoin::$vbtable@Der2@:
0 | -4
1 | -20 (SuperJoind(Der2+4)Base)
SuperJoin::NewTestFunctionA this adjustor: 0
SuperJoin::NewTestFunctionB this adjustor: 0
SuperJoin::NewTestFunctionD this adjustor: 32
SuperJoin::TestFunctionA this adjustor: 104
SuperJoin::TestFunctionB this adjustor: 104
SuperJoin::TestFunctionD this adjustor: 112
vbi: class offset o.vbptr o.vbte fVtorDisp
Base 96 4 8 0
Join 104 4 12 1
Base2 120 4 4 0
Thanks, your report has uncovered a number of bugs, and we've corrected some of them (public commit hopefully later today). You can find the cause of the "invalidity" with a command like this:
ooanalyzer --prolog-facts=/tmp/c.facts --prolog-results=/tmp/c.results --verbose=2 ConsoleApplication1.exe
which in this case reports (near the end of the execution):
RTTI Information is invalid because CompleteObjectLocator Offset2 = 0xc
The ooprolog.pl command should have emitted a warning too, but did not because of an unrelated bug. This is a case of us being intentionally over-constrained: I wanted to see examples of previously unseen flags in the CompleteObjectLocator RTTI data structures so that I could determine what the flags meant. The meaning of many of the flag fields is unknown as far as I can tell. :-( Here's the rule that triggered the "invalid" message:
https://github.com/cmu-sei/pharos/blob/master/share/prolog/oorules/rtti.pl#L215
Which obviously needs a clause that reads:
Offset2 \= 0xc,
(and probably one for 0x8 as well). So we know that 0xc is a valid value now, but we still don't know the meaning. I'll try to look at that some more soon, and I'll post here if I figure it out, but if you're able to understand the significance of 0xc in this field, that would be helpful. In general virtual inheritance and multiple inheritance did not receive as much testing as ordinary inheritance and other features. Our goal was to get the tool working for as many "common" cases as possible, and it's only recently that we've become much more serious about testing all of these unusual cases like you are doing. Thanks for your help with that!
Many thanks for the detailed answer! One thing I have already noticed about multiple inheritance is that sometimes only the very first VFTable is recognized as such. The other VFTables are also installed in the constructor, but they are always displayed as normal members (mbr_xyz). Ghidra shows them in the listing as pointers to the affected VFTables. I'm still considering whether I should open a separate issue for this, or whether it is enough to add it to one of the existing open issues.
So it turns out that the Offset2 field is (surprise!) an OFFSET. I'm not sure how I ended up with code that attempts to validate this field against an expected value. Probably because the rule was written a long time ago. The most current analysis of this field seems to be from here:
https://github.com/cmu-sei/pharos/blob/master/share/prolog/oorules/facts.pl#L71
Where I've found that the field is apparently called the "constructor displacement offset", but I still haven't documented what that really means. SuperJoin is the only class that has a non-zero value in the CompleteObjectLocator fact:
rTTICompleteObjectLocator(0x403240, 0x403524, 0x405064, 0x403490, 0x70, 0xc).
rTTICompleteObjectLocator(0x4031d4, 0x403834, 0x405064, 0x403490, 0x68, 0x4).
Also regarding the BaseTable detection, these facts represent the tables (and the corresponding data from the compiler):
possibleVBTableWrite(0x4011e6, 0x4011a0, 0x4, 0x40321c).
initialMemory(0x40321c, -0x4).
initialMemory(0x403220, 0x74).
initialMemory(0x403224, 0x5c).
initialMemory(0x403228, 0x64).
SuperJoin::$vbtable@The1@:
0 | -4
1 | 116 (SuperJoind(The1+4)Base2)
2 | 92 (SuperJoind(SuperJoin+4)Base)
3 | 100 (SuperJoind(SuperJoin+4)Join)
possibleVBTableWrite(0x4011ed, 0x4011a0, 0x28, 0x403208).
initialMemory(0x403208, -0x8).
initialMemory(0x40320c, 0x50).
SuperJoin::$vbtable@The2@:
0 | -8
1 | 80 (SuperJoind(The2+8)Base2)
possibleVBTableWrite(0x4011f4, 0x4011a0, 0x6c, 0x4031e0).
initialMemory(0x4031e0, -0x4).
initialMemory(0x4031e4, -0xc).
SuperJoin::$vbtable@Der1@:
0 | -4
1 | -12 (SuperJoind(Der1+4)Base)
possibleVBTableWrite(0x4011fb, 0x4011a0, 0x74, 0x403270).
initialMemory(0x403270, -0x4).
initialMemory(0x403274, -0x14).
SuperJoin::$vbtable@Der2@:
0 | -4
1 | -20 (SuperJoind(Der2+4)Base)
So we've generated the facts required to detect these virtual base tables, but for some reason they weren't accepted in the final results. I'll investigate why now.
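The match between the facts and the compiler's tables can also be checked arithmetically. The following is a minimal Python sketch, not part of the tool: it assumes that each vbtable entry after entry 0 is the distance from the subobject's vbptr to the virtual base, and it uses only offsets copied from the SuperJoin layout and the initialMemory facts above. The dictionary and function names are made up for illustration.

```python
# Offsets copied from the SuperJoin layout dump in this thread.
# Where each subobject's vbptr sits inside SuperJoin:
VBPTR_OFFSET = {"The1": 4, "The2": 40, "Der1": 108, "Der2": 116}
# Where each virtual base sits inside SuperJoin:
VBASE_OFFSET = {"Base": 96, "Join": 104, "Base2": 120}

# initialMemory values that follow entry 0 of each table (from the facts above),
# paired with the virtual base named in the compiler's table.
VBTABLE_ENTRIES = {
    "The1": [("Base2", 0x74), ("Base", 0x5C), ("Join", 0x64)],
    "The2": [("Base2", 0x50)],
    "Der1": [("Base", -0xC)],
    "Der2": [("Base", -0x14)],
}


def expected_displacement(subobject, vbase):
    """Distance from the subobject's vbptr to the virtual base (assumed meaning)."""
    return VBASE_OFFSET[vbase] - VBPTR_OFFSET[subobject]


def check_all():
    """Return every (subobject, vbase, value) whose value disagrees with the layout."""
    mismatches = []
    for subobject, entries in VBTABLE_ENTRIES.items():
        for vbase, value in entries:
            if expected_displacement(subobject, vbase) != value:
                mismatches.append((subobject, vbase, value))
    return mismatches


if __name__ == "__main__":
    print(check_all())
```

For this example every entry agrees with the layout, so the list of mismatches is empty.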
Perhaps the 0x4 and 0xc have some relation to the matching values in the SuperJoin::$vbtable@Der1@ at 0x4031e0?
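One relation that can at least be checked with the numbers in this thread: the sketch below (Python, names hypothetical) assumes the last two arguments of the two rTTICompleteObjectLocator facts are the offset field and the Offset2 field, and subtracts one from the other. It is only an observation about these two facts and the SuperJoin layout, not a confirmed definition of the field.

```python
# (offset, Offset2) taken from the last two arguments of the two facts above.
# Reading them as "offset of the vfptr" and "some displacement" is an assumption.
COL_FIELDS = [(0x70, 0xC), (0x68, 0x4)]

# Offset of "(vtordisp for vbase Join)" in the SuperJoin layout dump.
VTORDISP_OFFSET = 100


def offset_minus_offset2(fields):
    """Subtract Offset2 from the offset field for each locator."""
    return [offset - offset2 for offset, offset2 in fields]


if __name__ == "__main__":
    for value in offset_minus_offset2(COL_FIELDS):
        print(value, value == VTORDISP_OFFSET)
```

Both locators give the same result, and it coincides with where the layout places the vtordisp field for Join. Whether that is what the field really encodes would need to be confirmed on more binaries.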
So I've determined that the rules here are too restrictive for your example:
https://github.com/cmu-sei/pharos/blob/master/share/prolog/oorules/initial.pl#L99
Specifically, the first rule that reasons about additional entries in the table (after offset zero) failed for your test case. The problem seems to be that the FuncOffset clause on line 132 was not true. Perhaps this is because your constructors were inlined? This seems likely given the large number of VFTable and VBTable installations in the constructor at 0x4011a0, which I expect is SuperJoin.
possibleVBTableWrite(0x4011e6, 0x4011a0, 0x4, 0x40321c).
possibleVBTableWrite(0x4011ed, 0x4011a0, 0x28, 0x403208).
possibleVBTableWrite(0x4011f4, 0x4011a0, 0x6c, 0x4031e0).
possibleVBTableWrite(0x4011fb, 0x4011a0, 0x74, 0x403270).
possibleVFTableWrite(0x401202, 0x4011a0, 0x68, 0x4031f8).
possibleVFTableWrite(0x401209, 0x4011a0, 0x70, 0x403204).
possibleVFTableWrite(0x40122f, 0x4011a0, 0, 0x403230).
possibleVFTableWrite(0x4012ab, 0x4011a0, 0x20, 0x40323c).
possibleVFTableWrite(0x4012b7, 0x4011a0, 0, 0x403214).
possibleVFTableWrite(0x4012bd, 0x4011a0, 0x20, 0x403264).
possibleVFTableWrite(0x40133b, 0x4011a0, 0, 0x4031ec).
possibleVFTableWrite(0x401341, 0x4011a0, 0x20, 0x40325c).
possibleVFTableWrite(0x401352, 0x4011a0, 0x4, 0x4031d8).
possibleVFTableWrite(0x401360, 0x4011a0, 0xc, 0x403244).
This is also an interesting case for perhaps considering the significance of the ordering of these writes...
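To make the ordering easier to see, here is a small Python sketch that groups the possibleVFTableWrite facts by object offset and lists the table addresses in instruction order. It assumes the four arguments are (instruction address, function address, object offset, table address); the function name and regular expression are only illustrative and are not part of the tool.

```python
import re
from collections import defaultdict

# possibleVFTableWrite facts copied from this thread.
FACTS = """
possibleVFTableWrite(0x401202, 0x4011a0, 0x68, 0x4031f8).
possibleVFTableWrite(0x401209, 0x4011a0, 0x70, 0x403204).
possibleVFTableWrite(0x40122f, 0x4011a0, 0, 0x403230).
possibleVFTableWrite(0x4012ab, 0x4011a0, 0x20, 0x40323c).
possibleVFTableWrite(0x4012b7, 0x4011a0, 0, 0x403214).
possibleVFTableWrite(0x4012bd, 0x4011a0, 0x20, 0x403264).
possibleVFTableWrite(0x40133b, 0x4011a0, 0, 0x4031ec).
possibleVFTableWrite(0x401341, 0x4011a0, 0x20, 0x40325c).
possibleVFTableWrite(0x401352, 0x4011a0, 0x4, 0x4031d8).
possibleVFTableWrite(0x401360, 0x4011a0, 0xc, 0x403244).
"""

FACT_RE = re.compile(r"possibleVFTableWrite\(([^)]*)\)")


def writes_by_offset(text):
    """Map object offset -> table addresses, ordered by instruction address."""
    writes = []
    for match in FACT_RE.finditer(text):
        # int(x, 0) accepts both "0" and "0x..." spellings.
        insn, _func, offset, table = (int(x.strip(), 0) for x in match.group(1).split(","))
        writes.append((insn, offset, table))
    grouped = defaultdict(list)
    for _insn, offset, table in sorted(writes):
        grouped[offset].append(table)
    return dict(grouped)


if __name__ == "__main__":
    for offset, tables in writes_by_offset(FACTS).items():
        print(hex(offset), [hex(t) for t in tables])
```

Offsets 0 and 0x20 are each written three times in this one function. If the constructors really were inlined, a later write at the same offset would simply overwrite the earlier one, so the last table per offset would be the one that remains installed; that reading is an assumption and not something the facts alone prove.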
It's also worth noting that even when we detect the VBTables correctly, it's not obvious how else it would improve the results. We don't really use the VBTables for much. I suppose we can prove certain inheritance relationships, but we've mostly obtained those directly from the RTTI information. We might also detect some object sizes better, but simply knowing that there were writes into the object did most of that work (it doesn't really matter that they were VBTables). So it's unclear if failing to detect the VBTables is really related to whatever the real problems are...
Wow you've been really busy! These are really impressive facts about the program I use.
I think it's great that this information matches the information generated by the compiler.
Regarding your point about what additional value the VBTables would bring: I think they do matter. In the other virtual inheritance example, in issue #130, there are accesses to abstract variables within the function H::access. There the VBTables are used correctly and the values of the objects are set correctly, so this information would be very important for understanding the overall context.
For this reason I still think that you should pass these VBTable structures along as far as possible. In my opinion this would help a great deal in further analysis. Maybe you could also pass along the this-pointer offsets you mentioned above, e.g. as a comment within the structure.
Was this RTTI problem ever fixed?
So the originally reported problem with the unrecognized RTTI data is fixed, but to be honest I haven't followed up on the VBTables issue yet, so I can't give an exact status on that.
Here is an example that uses a lot of inheritance (perhaps not something from practice, but theoretically possible). This small example nevertheless leads to various problems during analysis and further processing.
The problems start with the analysis itself: the RTTI data is reported as invalid, although I can import the same example into Ghidra without problems or errors.
Beyond that, only a fraction of the analysis works. For example, the VBTables are not recognized, and the subordinate VFTables are either not recognized at all or only the very first one is.
I admit that the example is a bit far-fetched, but in principle it should be possible to import the whole thing without serious errors.
The new() and delete() methods were not specially selected for this analysis, so the errors relating to them can be ignored.
To reproduce the case, simply compile the following example in release mode and then analyze it.
One addition regarding the result: the number of classes is correct, but nothing beyond that really is. The best way to check this is to compile the example yourself in Visual Studio 2019 and run the whole process. You should get exactly the same result.