Closed GoogleCodeExporter closed 9 years ago
Original comment by michael.hale@gmail.com
on 22 Apr 2013 at 5:25
Half the examples on the mailing list are pastebins that have expired. Please
can we get copies of any pertinent information, so we can figure out how the
dwarfparser's failing? Specifically, a profile that exhibits the problem, and
the name of a value that's incorrect, would be best to track down what's going
wrong and why...
Original comment by mike.auty@gmail.com
on 27 Apr 2013 at 2:12
I did not realise this issue would drag on a bit and the pastebins would expire
so quickly. I will give you some names right now and some more permanent data
first thing Monday morning; I'll attach it to this issue if possible.
Profile: Any 32-bit Ubuntu 12.04 LTS
Incorrect value: mm_struct.start_stack
Original comment by edwin.sm...@gmail.com
on 27 Apr 2013 at 2:23
Thanks very much, Edwin. It was more a comment for MHL, who opened the bug.
Ideally bug reports always contain all the information needed to help fix the
problem. I'm afraid I don't have access to an Ubuntu profile, so it was a
request to MHL, who may be able to supply one since he suggested he'd been
able to replicate the situation.
Original comment by mike.auty@gmail.com
on 27 Apr 2013 at 2:30
Ah sorry, I really intended Andrew to fix this bug and he has all the
details...so it was basically a reminder for him. However, it may be better if
you lead this one, Ikelos.
Here's Edwin's profile attached. I grepped for mm_struct.mmap_base and found
this:
<2><0x1ee2><DW_TAG_member> DW_AT_name<"mmap_base"> DW_AT_decl_file<0x0000000c
include/linux/mm_types.h> DW_AT_decl_line<0x00000135> DW_AT_type<<0x0000010b>>
DW_AT_data_member_location<DW_OP_plus_uconst 20>
As you can see, the DW_AT_type is 0x0000010b or 0x10b in short. Before the
dwarfparser parses those individual structure member lines, it builds a
dictionary where 0x10b should be one of the keys and its corresponding C type
is the value. This dictionary is built using lines from the same module.dwarf:
<1><0x10b><DW_TAG_typedef> DW_AT_name<"__kernel_ssize_t">
DW_AT_decl_file<0x00000002 include/asm-generic/posix_types.h>
DW_AT_decl_line<0x00000044> DW_AT_type<<0x00000057>>
So there the 0x10b is extracted and associated with C type __kernel_ssize_t
(which is signed size_t). However, we know mm_struct.mmap_base should be
unsigned.
The problem (I believe) is that there's also the following line in Edwin's
profile:
<1><0x10b><DW_TAG_base_type> DW_AT_byte_size<0x00000004>
DW_AT_encoding<DW_ATE_unsigned> DW_AT_name<"long unsigned int">
This also has key 0x10b but its C type is long unsigned int, which would be
accurate. I believe when there are two possible entries for the same 0x10b key,
our parser is choosing the wrong one.
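A minimal sketch of the suspected collision, not the actual dwarf.py code: the
entries list, its order, and the variable names are assumptions made up for
illustration. It only shows that a single dictionary keyed by offset can hold
one type per key, so whichever 0x10b line is stored last is the one a member
lookup sees.

```python
# Hypothetical miniature of a flat "offset -> C type name" table.
# The two entries mirror the two 0x10b lines quoted above; their order
# here is an assumption for the sake of the example.
entries = [
    ("0x10b", "long unsigned int"),   # from the DW_TAG_base_type line
    ("0x10b", "__kernel_ssize_t"),    # from the DW_TAG_typedef line
]

flat = {}
for offset, type_name in entries:
    # A plain dict keeps only one value per key, so the second 0x10b
    # entry silently replaces the first.
    flat[offset] = type_name

# What a member with DW_AT_type<<0x0000010b>> would resolve to.
member_type = flat["0x10b"]
print(member_type)
```

With this ordering the lookup yields the signed typedef rather than the
unsigned base type, which matches the symptom described for mmap_base.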
Andrew, would you agree or do you think there's another explanation?
Original comment by michael.hale@gmail.com
on 27 Apr 2013 at 3:40
Attachments:
OK, so at first look, this appears to be dwarfdump taking apart two files in
some way. They're both listed in the .debug_info section, but they're for
different files. Here are the first-level DWARF lines for both:
<0><0x0+0xb><DW_TAG_compile_unit> DW_AT_producer<"GNU C 4.6.3">
DW_AT_language<DW_LANG_C89>
DW_AT_name<"/home/dutchy/volatility-svn/tools/linux/module.c">
DW_AT_comp_dir<"/usr/src/linux-headers-3.5.0-25-generic">
DW_AT_low_pc<0x00000000> DW_AT_high_pc<0x00000000> DW_AT_stmt_list<0x00000000>
<0><0x19244+0xb><DW_TAG_compile_unit> DW_AT_producer<"GNU C 4.6.3">
DW_AT_language<DW_LANG_C89>
DW_AT_name<"/home/dutchy/volatility-svn/tools/linux/module.mod.c">
DW_AT_comp_dir<"/usr/src/linux-headers-3.5.0-25-generic">
DW_AT_low_pc<0x00000000> DW_AT_high_pc<0x00000000> DW_AT_stmt_list<0x00000c8a>
So the first one is for module.c, and the second is for module.mod.c (which
contains the wrong 0x10b). The latter is created by the kernel build system to
contain information such as versioning, etc. They're both compiled to .o
files, and then linked to form the final .ko.
It seems we had a slight flaw in the regular expression statements we used for
parsing the hex-format that newer versions of dwarfdump produce, so they were
simply ignoring the root nodes (those starting <0>) because they were followed
by <0x####+0x###>.
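A rough illustration of that kind of mismatch, under the assumption that the
old pattern only allowed a plain hex offset. Both patterns below are
hypothetical and are not the expressions used in dwarf.py.

```python
import re

# Hypothetical "strict" pattern: level, then a plain hex offset, then the tag.
strict = re.compile(
    r"<(?P<level>\d+)><(?P<offset>0x[0-9a-fA-F]+)><(?P<tag>DW_TAG_\w+)>"
)

# Hypothetical "tolerant" pattern: also accepts an optional "+0x..." suffix
# after the offset, as seen on the <0> lines quoted above.
tolerant = re.compile(
    r"<(?P<level>\d+)><(?P<offset>0x[0-9a-fA-F]+)(?:\+0x[0-9a-fA-F]+)?>"
    r"<(?P<tag>DW_TAG_\w+)>"
)

root = '<0><0x19244+0xb><DW_TAG_compile_unit> DW_AT_producer<"GNU C 4.6.3">'
child = '<1><0x10b><DW_TAG_typedef> DW_AT_name<"__kernel_ssize_t">'

# The strict pattern skips the root line but still matches child lines,
# so children end up with no compile unit of their own.
print(strict.match(root))                     # None
print(strict.match(child).group("tag"))       # DW_TAG_typedef
print(tolerant.match(root).group("offset"))   # 0x19244
```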
The following patch should fix that, and produce trees with a single specific
root. That should mean that the symbols which follow take their type from the
correct tree (as opposed to keeping everything in one big tree). I haven't
followed through exactly what the dwarf.py code is doing, so someone more
familiar with it should check this over, but the output now produces a
significantly larger number of unsigned types than before when diffing the two
outputs (the patch also makes the dwarf.py file stand on its own if necessary,
which makes debugging much easier).
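The idea of one tree per root can be sketched as follows. This is only an
illustration under stated assumptions, not the patch itself: build_tables,
HEADER and NAME are hypothetical names, the sample lines are abbreviated from
the ones quoted in this thread, and placing the base type in module.c follows
the remark above that module.mod.c holds the wrong 0x10b.

```python
import re

# Hypothetical patterns for the line header and the DW_AT_name attribute.
HEADER = re.compile(r"<(\d+)><(0x[0-9a-fA-F]+)(?:\+0x[0-9a-fA-F]+)?><(DW_TAG_\w+)>")
NAME = re.compile(r'DW_AT_name<"([^"]*)">')


def build_tables(lines):
    """Return one {offset: name} table per compile unit, in input order."""
    tables = []
    for line in lines:
        header = HEADER.match(line)
        if header is None:
            continue
        level, offset, _tag = header.groups()
        if level == "0":
            # A new root (<0>) starts a fresh table for that compile unit.
            tables.append({})
            continue
        name = NAME.search(line)
        if tables and name:
            tables[-1][offset] = name.group(1)
    return tables


# Abbreviated sample input; file names are shortened for readability.
lines = [
    '<0><0x0+0xb><DW_TAG_compile_unit> DW_AT_name<"module.c">',
    '<1><0x10b><DW_TAG_base_type> DW_AT_encoding<DW_ATE_unsigned> '
    'DW_AT_name<"long unsigned int">',
    '<0><0x19244+0xb><DW_TAG_compile_unit> DW_AT_name<"module.mod.c">',
    '<1><0x10b><DW_TAG_typedef> DW_AT_name<"__kernel_ssize_t">',
]

tables = build_tables(lines)
print(tables[0]["0x10b"])   # looked up within the first unit
print(tables[1]["0x10b"])   # looked up within the second unit
```

Because each unit keeps its own table, the two 0x10b entries no longer
overwrite each other, and a member defined in the first unit resolves against
that unit's types only.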
Original comment by mike.auty@gmail.com
on 27 Apr 2013 at 4:46
Attachments:
Seems to work nicely, Mike. Nice job. Here's the output of volshell's dt()
command on mm_struct after the patch:
(FYI before the patch all these unsigned longs are "int")
>>> 'mm_struct' (440 bytes)
0x0 : mmap ['pointer', ['vm_area_struct']]
0x4 : mm_rb ['rb_root']
0x8 : mmap_cache ['pointer', ['vm_area_struct']]
0xc : get_unmapped_area ['pointer', ['void']]
0x10 : unmap_area ['pointer', ['void']]
0x14 : mmap_base ['unsigned long']
0x18 : task_size ['unsigned long']
0x1c : cached_hole_size ['unsigned long']
0x20 : free_area_cache ['unsigned long']
0x24 : pgd ['pointer', ['__unnamed_0xa4f']]
0x28 : mm_users ['__unnamed_0x343']
0x2c : mm_count ['__unnamed_0x343']
0x30 : map_count ['int']
0x34 : page_table_lock ['spinlock']
0x38 : mmap_sem ['rw_semaphore']
0x48 : mmlist ['list_head']
0x50 : hiwater_rss ['unsigned long']
0x54 : hiwater_vm ['unsigned long']
0x58 : total_vm ['unsigned long']
0x5c : locked_vm ['unsigned long']
0x60 : pinned_vm ['unsigned long']
0x64 : shared_vm ['unsigned long']
0x68 : exec_vm ['unsigned long']
0x6c : stack_vm ['unsigned long']
0x70 : reserved_vm ['unsigned long']
.......
Edwin can you apply and test the patch as well? If you get similar results
(i.e. successful ones) then I'd say it's safe to commit...at which time we can
remove the two manual overlays [1] for vm_start and vm_end we put in place a
while ago.
[1] https://code.google.com/p/volatility/source/browse/trunk/volatility/plugins/overlays/linux/linux.py#98
Original comment by michael.hale@gmail.com
on 27 Apr 2013 at 9:40
The patch works for me, thank you Mike, Michael.
Original comment by edwin.sm...@gmail.com
on 29 Apr 2013 at 9:10
This issue was closed by revision r3399.
Original comment by michael.hale@gmail.com
on 29 Apr 2013 at 8:04
Original issue reported on code.google.com by
michael.hale@gmail.com
on 22 Apr 2013 at 5:24