Stan1989 / volatility

Automatically exported from code.google.com/p/volatility
GNU General Public License v2.0

linux dwarfdump parser to vtypes subs unsigned for signed #414

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
This is a reminder to fix the issue discussed in "[Vol-Users] Incorrect 
addresses in linux_proc_maps" by edwin smulders. In particular, the latest 
Ubuntu kernel 3.5.x profiles are created incorrectly: the dwarfparser code 
marks many unsigned numbers as signed, so they often end up negative. 
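
To illustrate the symptom, here is a minimal sketch (not Volatility code) that decodes the same four bytes once as an unsigned and once as a signed 32-bit value. The address 0xBFFFF000 is an arbitrary example of a high user-space address on a 32-bit kernel; it is an assumption, not a value taken from the report.

```python
import struct

# Hypothetical 32-bit little-endian field holding a high user-space address.
raw = struct.pack("<I", 0xBFFFF000)

# The same bytes decoded with an unsigned and a signed 32-bit format.
as_unsigned = struct.unpack("<I", raw)[0]
as_signed = struct.unpack("<i", raw)[0]

print(hex(as_unsigned))
print(as_signed)  # negative, because the top bit is read as a sign bit
```

Any address at or above 0x80000000 shows this behaviour when the profile marks the member as signed.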

Original issue reported on code.google.com by michael.hale@gmail.com on 22 Apr 2013 at 5:24

GoogleCodeExporter commented 9 years ago

Original comment by michael.hale@gmail.com on 22 Apr 2013 at 5:25

GoogleCodeExporter commented 9 years ago
Half the examples on the mailing list are pastebins that have expired.  Please 
can we get copies of any pertinent information, so we can figure out how the 
dwarfparser's failing?  Specifically, a profile that exhibits the problem, and 
the name of a value that's incorrect, would be best to track down what's going 
wrong and why...

Original comment by mike.auty@gmail.com on 27 Apr 2013 at 2:12

GoogleCodeExporter commented 9 years ago
I did not realise this issue would drag on a bit and the pastebins would expire 
so quickly. I will give you some names right now, and some more permanent data 
first thing Monday morning. I'll attach it to this issue if possible.

Profile: Any 32 bit Ubuntu 12.04 LTS
Incorrect value: mm_struct.start_stack

Original comment by edwin.sm...@gmail.com on 27 Apr 2013 at 2:23

GoogleCodeExporter commented 9 years ago
Thanks very much, Edwin. It was more a comment for MHL, who opened the bug.  
Ideally bug reports always contain all the information needed to help fix the 
problem.  I'm afraid I don't have access to an Ubuntu profile, so it was more 
a request to MHL, who may be able to supply one since he suggested he'd been 
able to replicate the situation.

Original comment by mike.auty@gmail.com on 27 Apr 2013 at 2:30

GoogleCodeExporter commented 9 years ago
Ah sorry, I really intended Andrew to fix this bug and he has all the 
details...so it was basically a reminder for him. However, it may be better if 
you lead this one, Ikelos. 

Here's Edwin's profile attached. I grepped for mm_struct.mmap_base and found 
this:

<2><0x1ee2><DW_TAG_member> DW_AT_name<"mmap_base"> DW_AT_decl_file<0x0000000c 
include/linux/mm_types.h> DW_AT_decl_line<0x00000135> DW_AT_type<<0x0000010b>> 
DW_AT_data_member_location<DW_OP_plus_uconst 20>

As you can see, the DW_AT_type is 0x0000010b or 0x10b in short. Before the 
dwarfparser parses those individual structure member lines, it builds a 
dictionary where 0x10b should be one of the keys and its corresponding C type 
is the value. This dictionary is built using lines from the same module.dwarf:

<1><0x10b><DW_TAG_typedef> DW_AT_name<"__kernel_ssize_t"> 
DW_AT_decl_file<0x00000002 include/asm-generic/posix_types.h> 
DW_AT_decl_line<0x00000044> DW_AT_type<<0x00000057>>

So there the 0x10b is extracted and associated with C type __kernel_ssize_t 
(which is signed size_t). However, we know mm_struct.mmap_base should be 
unsigned. 

The problem (I believe) is that there's also the following line in Edwin's 
profile:

<1><0x10b><DW_TAG_base_type> DW_AT_byte_size<0x00000004> 
DW_AT_encoding<DW_ATE_unsigned> DW_AT_name<"long unsigned int">

This entry also has key 0x10b, but its C type is long unsigned int, which would 
be accurate. I believe that when two entries share the same key 0x10b, our 
parser is choosing the wrong one. 
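
The suspected collision can be sketched as follows. This is a simplified illustration, not the dwarf.py code: the record tuples and the `flat` dictionary are hypothetical, and the record order is an assumption.

```python
# Simplified (offset, kind, name) records in file order (order is assumed).
records = [
    (0x10b, "base_type", "long unsigned int"),
    (0x10b, "typedef", "__kernel_ssize_t"),
]

# A single flat table keyed by offset alone.
flat = {}
for offset, kind, name in records:
    flat[offset] = name  # a later record silently replaces an earlier one

print(flat[0x10b])
```

With a single table keyed only by offset, one of the two entries is lost, and every member whose DW_AT_type is 0x10b resolves to whichever entry was stored last.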

Andrew, would you agree, or do you think there's another explanation? 

Original comment by michael.hale@gmail.com on 27 Apr 2013 at 3:40

Attachments:

GoogleCodeExporter commented 9 years ago
Ok, so at first look, this appears to be dwarfdump taking apart two files in 
some way.  Both are listed in the .debug_info section, but they belong to 
different source files. Here are the first level DWARF lines for both:

<0><0x0+0xb><DW_TAG_compile_unit> DW_AT_producer<"GNU C 4.6.3"> 
DW_AT_language<DW_LANG_C89> 
DW_AT_name<"/home/dutchy/volatility-svn/tools/linux/module.c"> 
DW_AT_comp_dir<"/usr/src/linux-headers-3.5.0-25-generic"> 
DW_AT_low_pc<0x00000000> DW_AT_high_pc<0x00000000> DW_AT_stmt_list<0x00000000>

<0><0x19244+0xb><DW_TAG_compile_unit> DW_AT_producer<"GNU C 4.6.3"> 
DW_AT_language<DW_LANG_C89> 
DW_AT_name<"/home/dutchy/volatility-svn/tools/linux/module.mod.c"> 
DW_AT_comp_dir<"/usr/src/linux-headers-3.5.0-25-generic"> 
DW_AT_low_pc<0x00000000> DW_AT_high_pc<0x00000000> DW_AT_stmt_list<0x00000c8a>

So the first one is for module.c, and the second is for module.mod.c (which 
contains the wrong 0x10b).  The kernel build system creates module.mod.c to 
hold information such as versioning.  Both are compiled to .o files and then 
linked to form the final .ko.

It seems we had a slight flaw in the regular expression statements we used for 
parsing the hex-format that newer versions of dwarfdump produce, so they were 
simply ignoring the root nodes (those starting <0>) because they were followed 
by <0x####+0x###>.
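
As a rough illustration of this kind of flaw, the sketch below compares two hypothetical patterns. Neither is taken from dwarf.py; the real expressions may look quite different. The first accepts only a plain `<0x...>` offset, while the second also tolerates the `+0x...` suffix seen on the root nodes.

```python
import re

# Hypothetical patterns; the real expressions in dwarf.py may differ.
old_pattern = re.compile(
    r"<(?P<level>\d+)><(?P<offset>0x[0-9a-fA-F]+)><(?P<tag>\w+)>"
)
new_pattern = re.compile(
    r"<(?P<level>\d+)><(?P<offset>0x[0-9a-fA-F]+)(?:\+0x[0-9a-fA-F]+)?>"
    r"<(?P<tag>\w+)>"
)

root_line = '<0><0x19244+0xb><DW_TAG_compile_unit> DW_AT_producer<"GNU C 4.6.3">'
child_line = '<1><0x10b><DW_TAG_typedef> DW_AT_name<"__kernel_ssize_t">'

# The stricter pattern misses the root node but still matches child nodes.
old_root = old_pattern.match(root_line)
old_child = old_pattern.match(child_line)

# The relaxed pattern matches both kinds of line.
new_root = new_pattern.match(root_line)
new_child = new_pattern.match(child_line)

print(old_root, new_root.group("level"), new_root.group("offset"))
```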

The following patch should fix that and produce trees that each have a single 
specific root.  That should mean that the symbols which follow take their type 
from the correct tree (as opposed to keeping everything in one big tree).  I 
haven't followed through exactly what the dwarf.py code is doing, so someone 
more familiar with it should check this over.  When diffing the two outputs, 
the patched version produces a significantly larger number of unsigned types 
than before.  The patch also makes the dwarf.py file stand on its own if 
necessary, which makes debugging much easier.
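
The idea of one tree per root can be sketched like this. It is an illustration under stated assumptions, not the attached patch: `build_type_tables` and the `(level, offset, name)` record shape are hypothetical, and the assignment of each 0x10b entry to a compile unit follows the statement above that module.mod.c holds the wrong one.

```python
def build_type_tables(records):
    """Group (level, offset, name) records into one type table per compile unit."""
    tables = []
    for level, offset, name in records:
        if level == 0:
            # A root node starts a fresh table, so offsets cannot collide
            # with those of an earlier compile unit.
            tables.append({"unit": name, "types": {}})
        elif tables:
            tables[-1]["types"][offset] = name
    return tables


records = [
    (0, 0x0, "module.c"),
    (1, 0x10b, "long unsigned int"),
    (0, 0x19244, "module.mod.c"),
    (1, 0x10b, "__kernel_ssize_t"),
]
tables = build_type_tables(records)

for table in tables:
    print(table["unit"], table["types"][0x10b])
```

A member found under module.c would then look up 0x10b in the first table only and resolve to long unsigned int.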

Original comment by mike.auty@gmail.com on 27 Apr 2013 at 4:46

Attachments:

GoogleCodeExporter commented 9 years ago
Seems to work nicely, Mike. Nice job. Here's the output of volshell's dt() 
command on mm_struct after the patch:

(FYI before the patch all these unsigned longs are "int")

>>> 'mm_struct' (440 bytes)
0x0   : mmap                           ['pointer', ['vm_area_struct']]
0x4   : mm_rb                          ['rb_root']
0x8   : mmap_cache                     ['pointer', ['vm_area_struct']]
0xc   : get_unmapped_area              ['pointer', ['void']]
0x10  : unmap_area                     ['pointer', ['void']]
0x14  : mmap_base                      ['unsigned long']
0x18  : task_size                      ['unsigned long']
0x1c  : cached_hole_size               ['unsigned long']
0x20  : free_area_cache                ['unsigned long']
0x24  : pgd                            ['pointer', ['__unnamed_0xa4f']]
0x28  : mm_users                       ['__unnamed_0x343']
0x2c  : mm_count                       ['__unnamed_0x343']
0x30  : map_count                      ['int']
0x34  : page_table_lock                ['spinlock']
0x38  : mmap_sem                       ['rw_semaphore']
0x48  : mmlist                         ['list_head']
0x50  : hiwater_rss                    ['unsigned long']
0x54  : hiwater_vm                     ['unsigned long']
0x58  : total_vm                       ['unsigned long']
0x5c  : locked_vm                      ['unsigned long']
0x60  : pinned_vm                      ['unsigned long']
0x64  : shared_vm                      ['unsigned long']
0x68  : exec_vm                        ['unsigned long']
0x6c  : stack_vm                       ['unsigned long']
0x70  : reserved_vm                    ['unsigned long']
.......

Edwin, can you apply and test the patch as well? If you get similar results 
(i.e. successful ones) then I'd say it's safe to commit, at which time we can 
remove the two manual overlays [1] for vm_start and vm_end that we put in place 
a while ago. 

[1]. 
https://code.google.com/p/volatility/source/browse/trunk/volatility/plugins/overlays/linux/linux.py#98

Original comment by michael.hale@gmail.com on 27 Apr 2013 at 9:40

GoogleCodeExporter commented 9 years ago
The patch works for me, thank you Mike, Michael.

Original comment by edwin.sm...@gmail.com on 29 Apr 2013 at 9:10

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r3399.

Original comment by michael.hale@gmail.com on 29 Apr 2013 at 8:04