trolldbois / python-haystack

Process heap analysis framework - Windows/Linux - record type inference and forensics
http://trolldbois.blogspot.com/search?q=python-haystack
GNU General Public License v3.0

Minidump reader: File seek out of range #30

Closed rchateauneu closed 7 years ago

rchateauneu commented 8 years ago

CTypes structure is:

class struct_time_t(ctypes.Structure):
    _pack_ = True # source:False
    _fields_ = [
    ('tm_sec', ctypes.c_int32), # 0..61
    ('tm_min', ctypes.c_int32), # 0..59
    ('tm_hour', ctypes.c_int32), # 0..23
    ('tm_mday', ctypes.c_int32), # 1..31
    ('tm_mon', ctypes.c_int32), # 0..11
    ('tm_year', ctypes.c_int32), # since 1900
    ('tm_wday', ctypes.c_int32), # 0..6
    ('tm_yday', ctypes.c_int32), # 0..365
    ('tm_isdst', ctypes.c_int32),
     ]

Constraint is:

[struct_time_t]
tm_sec:     RangeValue(0,61)
tm_min:     RangeValue(0,59)
tm_hour:    RangeValue(0,23)
tm_mday:    RangeValue(1,31)
tm_mon:     RangeValue(0,11)
tm_year:    RangeValue(0,1000)
tm_wday:     RangeValue(0,6)
tm_yday:     RangeValue(0,365)
tm_isdst:   IgnoreMember

This crashes:

resultsWithConstraints = haystack.search_record(memory_handler, py_class, my_constraints)

Traceback (most recent call last):
  File "haystack_essai.py", line 63, in <module>
    resultsWithConstraints = haystack.search_record(memory_handler, py_class, my_constraints)
  File "C:\Python27\lib\site-packages\haystack-0.34-py2.7.egg\haystack\search\api.py", line 39, in search_record
    return my_searcher.search(record_type)
  File "C:\Python27\lib\site-packages\haystack-0.34-py2.7.egg\haystack\search\searcher.py", line 61, in search
    outputs.extend(self._search_in(m, struct_type, nb=max_res-len(outputs), depth=max_depth))
  File "C:\Python27\lib\site-packages\haystack-0.34-py2.7.egg\haystack\search\searcher.py", line 114, in _search_in
    instance, validated = self._load_at(mem_map, offset, struct_type, depth)
  File "C:\Python27\lib\site-packages\haystack-0.34-py2.7.egg\haystack\search\searcher.py", line 134, in _load_at
    instance = mem_map.read_struct(address, struct_type)
  File "C:\Python27\lib\site-packages\haystack-0.34-py2.7.egg\haystack\mappings\file.py", line 503, in read_struct
    self._backend.seek(self.offset + addr - self.start, 0)
ValueError: seek out of range

Changing the constraint as follows makes it work:

tm_mday: RangeValue(0,31)

Any non-zero value for the first parameter of RangeValue yields the same problem.
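To make the effect of the lower bound concrete, here is a minimal sketch of range-constrained scanning written with plain ctypes. It is not haystack's implementation: `RANGES`, `matches` and `scan` are hypothetical names used only for illustration, and the tuples simply mirror the RangeValue bounds above.

```python
import ctypes


class struct_time_t(ctypes.Structure):
    # Same nine 32-bit fields as the structure in this report.
    _fields_ = [(name, ctypes.c_int32) for name in (
        "tm_sec", "tm_min", "tm_hour", "tm_mday", "tm_mon",
        "tm_year", "tm_wday", "tm_yday", "tm_isdst")]


# Hypothetical stand-in for the constraints file: field -> (low, high).
# tm_isdst is left out, which plays the role of IgnoreMember.
RANGES = {
    "tm_sec": (0, 61), "tm_min": (0, 59), "tm_hour": (0, 23),
    "tm_mday": (1, 31), "tm_mon": (0, 11), "tm_year": (0, 1000),
    "tm_wday": (0, 6), "tm_yday": (0, 365),
}


def matches(record, ranges):
    """Return True when every constrained field lies inside its range."""
    return all(low <= getattr(record, name) <= high
               for name, (low, high) in ranges.items())


def scan(buf, record_type, ranges, step=4):
    """Return the offsets in buf where a candidate record satisfies ranges."""
    size = ctypes.sizeof(record_type)
    hits = []
    for offset in range(0, len(buf) - size + 1, step):
        record = record_type.from_buffer_copy(buf, offset)
        if matches(record, ranges):
            hits.append(offset)
    return hits
```

With a lower bound of 0 on every field, zero-filled memory validates immediately. With `tm_mday` starting at 1 it never does, so the scan has to keep going and visits more of the dump.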

trolldbois commented 8 years ago

The backtrace seems to indicate an error in the back-end memory/file handler. Could you say whether you are using a haystack memory dump or a live process, and whether it is on Windows or Linux?

trolldbois commented 8 years ago

In short, the bug doesn't seem related to the RangeValue constraint. You trigger it because the constraint lets the search skip the first cases and reach a problematic piece of the memory dump. I would need a test case to reproduce the bug, if possible. And the debug logs... Logging.setLevel(logging.DEBUG)
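The failing line in the traceback computes a file position as `self.offset + addr - self.start` and seeks to it. The sketch below reproduces only that arithmetic on an anonymous `mmap`, assuming the backend behaves like CPython's `mmap` object. `read_at` and the addresses are hypothetical, not haystack code.

```python
import mmap


def read_at(backend, file_offset, map_start, address, size):
    """Seek to file_offset + address - map_start, then read size bytes."""
    position = file_offset + address - map_start
    backend.seek(position, 0)
    return backend.read(size)


# Anonymous 4 KiB mapping standing in for the bytes actually stored in the dump.
backend = mmap.mmap(-1, 4096)

# An address inside the stored range is readable.
first_bytes = read_at(backend, 0, 0x1000, 0x1000, 4)

# An address whose computed position lies past the stored bytes is not:
# mmap raises ValueError instead of returning data.
try:
    read_at(backend, 0, 0x1000, 0x3000, 4)
except ValueError as exc:
    failure = exc
```

If a mapping's declared range is larger than the bytes really present in the file, any search that reaches the missing part would fail this way, whatever constraint led it there.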

rchateauneu commented 8 years ago

Windows 7, memory dump made with Task Manager.

Thanks

On Sun, Mar 13, 2016 at 12:22 AM, Loic Jaquemet notifications@github.com wrote:

The backtrace seems to indicate an error in the back-end memory/file handler. Could you say whether you are using a haystack memory dump or a live process, and whether it is on Windows or Linux?

— Reply to this email directly or view it on GitHub https://github.com/trolldbois/python-haystack/issues/30#issuecomment-195838515 .

rchateauneu commented 8 years ago

Logging ..

where is logging ? import ???

Command is: python haystack_essai.py

It crashes with any DMP file I could create with Task Manager.


find_library("clang-3.7") libclang-3.7.so.1

find_library("clang") None

# -*- coding: utf-8 -*-
#
# TARGET arch is: []
# WORD_SIZE is: 8
# POINTER_SIZE is: 8
# LONGDOUBLE_SIZE is: 16
#
import ctypes

# if local wordsize is same as target, keep ctypes pointer function.
if ctypes.sizeof(ctypes.c_void_p) == 8:
    POINTER_T = ctypes.POINTER
else:
    # required to access _ctypes
    import _ctypes
    # Emulate a pointer class using the approriate c_int32/c_int64 type
    # The new class should have :
    # ['__module__', 'from_param', '_type_', '__dict__', '__weakref__', '__doc__']
    # but the class should be submitted to a unique instance for each base type
    # to that if A == B, POINTER_T(A) == POINTER_T(B)
    ctypes._pointer_t_type_cache = {}
    def POINTER_T(pointee):
        # a pointer should have the same length as LONG
        fake_ptr_base_type = ctypes.c_uint64
        # specific case for c_void_p
        if pointee is None: # VOID pointer type. c_void_p.
            pointee = type(None) # ctypes.c_void_p # ctypes.c_ulong
            clsname = 'c_void'
        else:
            clsname = pointee.__name__
        if clsname in ctypes._pointer_t_type_cache:
            return ctypes._pointer_t_type_cache[clsname]
        # make template
        class _T(_ctypes._SimpleCData,):
            _type_ = 'L'
            _subtype_ = pointee
            def _sub_addr_(self):
                return self.value
            def __repr__(self):
                return '%s(%d)'%(clsname, self.value)
            def contents(self):
                raise TypeError('This is not a ctypes pointer.')
            def __init__(self, **args):
                raise TypeError('This is not a ctypes pointer. It is not instanciable.')
        _class = type('LP_%d_%s'%(8, clsname), (_T,),{})
        ctypes._pointer_t_type_cache[clsname] = _class
        return _class

c_int128 = ctypes.c_ubyte*16
c_uint128 = c_int128
void = None
if ctypes.sizeof(ctypes.c_longdouble) == 16:
    c_long_double_t = ctypes.c_longdouble
else:
    c_long_double_t = ctypes.c_ubyte*16

#class struct_test3(ctypes.Structure):
#    _pack_ = True # source:False
#    _fields_ = [
#    ('val1', ctypes.c_uint32),
#    ('val2', ctypes.c_uint32),
#    ('me', POINTER_T(ctypes.c_uint32)),
#    ('val2b', ctypes.c_uint32),
#    ('val1b', ctypes.c_uint32),
#     ]
#
#class struct_Node(ctypes.Structure):
#    _pack_ = True # source:False
#    _fields_ = [
#    ('val1', ctypes.c_uint32),
#    ('PADDING_0', ctypes.c_ubyte * 4),
#    ('ptr1', POINTER_T(None)),
#    ('ptr2', POINTER_T(None)),
#     ]

class struct_time_t(ctypes.Structure):
    _pack_ = True # source:False
    _fields_ = [
    ('tm_sec', ctypes.c_int32), # 0..61
    ('tm_min', ctypes.c_int32), # 0..59
    ('tm_hour', ctypes.c_int32), # 0..23
    ('tm_mday', ctypes.c_int32), # 1..31
    ('tm_mon', ctypes.c_int32), # 0..11
    ('tm_year', ctypes.c_int32), # since 1900
    ('tm_wday', ctypes.c_int32), # 0..6
    ('tm_yday', ctypes.c_int32), # 0..365
    ('tm_isdst', ctypes.c_int32),
     ]

class struct_iobuf(ctypes.Structure):
    _pack_ = True # source:False
    _fields_ = [
    ('_ptr', POINTER_T(ctypes.c_char)),
    ('_cnt', ctypes.c_int32),
    ('_base', POINTER_T(ctypes.c_char)),
    ('_flag', ctypes.c_int32),
    ('_file', ctypes.c_int32),
    ('_charbuf', ctypes.c_int32),
    ('_bufsiz', ctypes.c_int32),
    ('_tmpfname', POINTER_T(ctypes.c_char)),
     ]

class struct_stat(ctypes.Structure):
    _pack_ = True # source:False
    _fields_ = [
    ('path', POINTER_T(ctypes.c_char)),
    ('buffer', POINTER_T(None)),
     ]

class struct_addrinfo(ctypes.Structure):
    _pack_ = True # source:False
    _fields_ = [
    ('ai_flags', ctypes.c_int32),
    ('ai_family', ctypes.c_int32),
    ('ai_socktype', ctypes.c_int32),
    ('ai_protocol', ctypes.c_int32),
    ('ai_addrlen', ctypes.c_int64), # In fact, size_t
    ('ai_canonname', POINTER_T(ctypes.c_char)),
    ('ai_addr', POINTER_T(None)),
    ('ai_next', POINTER_T(None)),
     ]

__all__ = \
    [ 'struct_time_t', 'struct_iobuf', 'struct_stat', 'struct_addrinfo' ]

import sys

sys.path.append('python-haystack')

import haystack
from haystack import memory_dumper
from haystack import dump_loader
from haystack import constraints

import pefile as pe

sys.path.append('../test/src/')
py_modulename = 'ctypes3_gen64'

# Already done, and also takes ages.
if False:
    memory_dumper.dump(3332,memdumpname)

# from haystack import logging
# logging.setLevel(logging.DEBUG)

# memdumpname = '../test/src/test-ctypes3.64.dump'
# machine = pe.FILE_HEADER.Machine
# AttributeError: 'NoneType' object has no attribute 'FILE_HEADER'
# memdumpname = r"C:\Users\rchateau\Developpement\ReverseEngineeringApps\Haystack_Tests\toto.dmp"

# Conclusion: we need to find a way to dump the memory in a single block (which is fast),
# or else to load memory in the haystack format.

memdmpnames = [
    r"C:\Users\rchateau\Developpement\ReverseEngineeringApps\Haystack_Tests\bash.DMP",
    r"C:\Users\rchateau\Developpement\ReverseEngineeringApps\Haystack_Tests\cimserver.DMP",
    r"C:\Users\rchateau\Developpement\ReverseEngineeringApps\Haystack_Tests\MobaXterm.DMP",
    r"C:\Users\rchateau\Developpement\ReverseEngineeringApps\Haystack_Tests\FlashPlayerPlugin_20_0_0_306.DMP"
    ]

# https://github.com/trolldbois/python-haystack/blob/master/docs/Haystack%20basic%20usage.ipynb

handlerConstraints = constraints.ConstraintsConfigHandler()
my_constraints = handlerConstraints.read('StructDict64.constraints')

for memdumpname in memdmpnames:
    # we need a memory dump loader
    # This prints tons of stuff. We do not need to print them.
    memory_handler = dump_loader.load(memdumpname)
    print("Loaded")
    print memory_handler

    # load this module with haystack
    my_model = memory_handler.get_model()
    StructDict = my_model.import_module("StructDict64_ctypes")
    print StructDict.__dict__.keys()
    print StructDict.__all__
    print type(StructDict)

    # from StructDict import *

    # for py_class in [ StructDict.struct_time_t, StructDict.struct_iobuf, StructDict.struct_stat, StructDict.struct_addrinfo ]:
    for py_classnam in StructDict.__all__:
        py_class = getattr( StructDict, py_classnam )
        print("")
        # print results
        resultsNoConstraints = haystack.search_record(memory_handler, py_class)
        outNoConstraints = haystack.output_to_string(memory_handler, resultsNoConstraints)

        print("Without constraints %d" % len(resultsNoConstraints))
        for cnstrNo in resultsNoConstraints:
            print(cnstrNo)
        # print outNoConstraints

        resultsWithConstraints = haystack.search_record(memory_handler, py_class, my_constraints)
        print("With constraints %d" % len(resultsWithConstraints))
        print haystack.output_to_string(memory_handler, resultsWithConstraints)

rchateauneu commented 8 years ago

"or a live process"

By the way is it possible to scan a process memory without writing it down to a file ??? It would be great ...

Thanks


trolldbois commented 8 years ago

So, my best guess is that the Task Manager's memory dump is not a full memory dump with the actual memory sections. That would explain the traceback.

Can you run

python haystack/mappings/minidump.py <yourdumpfile>
print len([ _range for _range in [d for d in x.MINIDUMP_DIRECTORY if d.StreamType == 'Memory64ListStream'][0].DirectoryData.MINIDUMP_MEMORY_DESCRIPTOR64 ])

so that we can confirm whether the file really contains no memory mappings? The quickest (and only) fix is to reacquire the memory with another tool, such as Sysinternals procexp.exe or procdump.exe, with the proper full memory dump options.
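The same check can be approximated without haystack by reading the minidump header directly. The sketch below assumes the documented on-disk layout: a 32-byte MINIDUMP_HEADER starting with the `MDMP` signature, 12-byte MINIDUMP_DIRECTORY entries, and stream type 9 for Memory64ListStream. `count_memory64_ranges` is a hypothetical helper, not part of haystack.

```python
import struct

MINIDUMP_SIGNATURE = b"MDMP"
MEMORY64_LIST_STREAM = 9  # MINIDUMP_STREAM_TYPE value for Memory64ListStream


def count_memory64_ranges(data):
    """Return the number of memory ranges in the Memory64ListStream.

    data is any bytes-like object holding the dump (bytes, mmap, ...).
    Returns None when the dump has no Memory64ListStream at all.
    """
    # MINIDUMP_HEADER starts with: Signature, Version, NumberOfStreams,
    # StreamDirectoryRva.
    signature, _version, n_streams, directory_rva = struct.unpack_from(
        "<4sIII", data, 0)
    if signature != MINIDUMP_SIGNATURE:
        raise ValueError("not a minidump file")
    for index in range(n_streams):
        # MINIDUMP_DIRECTORY: StreamType, then DataSize and Rva of the stream.
        stream_type, _size, rva = struct.unpack_from(
            "<III", data, directory_rva + 12 * index)
        if stream_type == MEMORY64_LIST_STREAM:
            # MINIDUMP_MEMORY64_LIST: NumberOfMemoryRanges, BaseRva.
            n_ranges, _base_rva = struct.unpack_from("<QQ", data, rva)
            return n_ranges
    return None
```

A result of `None` or `0` would support the guess that the dump holds no actual memory sections. Passing an `mmap` of the file avoids loading a large dump into memory.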

trolldbois commented 8 years ago

To search in a live memory, try using

$ haystack-live-search --help
usage: haystack-live-search [-h] [--debug | --quiet] [--interactive]
                            [--nommap] [--constraints_file CONSTRAINTS_FILE]
                            [--extended] [--hint HINT]
                            [--string | --json | --python | --pickled]
                            pid record_type_name

Search for instance of a record_type in the allocated memory of a process. The
PID must be a running process.

positional arguments:
  pid                   Target PID on the local system
  record_type_name      Python record type name. Module must be in Python path

optional arguments:
  -h, --help            show this help message and exit
  --debug               Set verbosity to DEBUG
  --quiet               Set verbosity to ERROR only
  --interactive         drop to python command line after action
  --nommap              disable mmap()-ing
  --constraints_file CONSTRAINTS_FILE
                        Filename that contains Constraints for the record
                        types in the module
  --extended            Do not restrict the search to allocated chunks
  --hint HINT           Restrict the search to the memory page containing this
                        hint address
  --string              Print results as human readable string
  --json                Print results as json readable string
  --python              Print results as python code
  --pickled             Print results as pickled string

trolldbois commented 8 years ago

rchateauneu commented 8 years ago

This prints:

119

with a full dump created with procexp.exe with a process running cmd.exe (Command box).


trolldbois commented 8 years ago

Can you add

import logging
logging.basicConfig(level=logging.DEBUG)

at the beginning of your script? This will produce a lot of debug messages. Put them in a file and give me a link to that file, so I can debug.
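One way to get those messages straight into a file, rather than redirecting the console output, is to attach a file handler to the root logger. This is a minimal sketch using only the standard `logging` module. `enable_debug_log` and the log path are hypothetical.

```python
import logging


def enable_debug_log(path):
    """Send every DEBUG-and-above message to the file at path."""
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s"))
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.addHandler(handler)
    return handler
```

Calling `enable_debug_log("haystack_debug.log")` at the top of the script, before the search runs, should capture the same output as `logging.basicConfig(level=logging.DEBUG)`, but in a file that can be compressed and shared.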

rchateauneu commented 8 years ago

53 megs, 453 kbytes when 7zipped.


trolldbois commented 8 years ago

Can you upload the zip to a file sharing platform and post the link here ?

rchateauneu commented 8 years ago

I've shared an item with you:

Haystack https://drive.google.com/folderview?id=0B7gUBxXIZBX7WmNsNnpWTjFNRDQ&usp=sharing&invite=COvD9IgB&ts=56e5e496


rchateauneu commented 8 years ago

I shared it on google drive, invitation sent to your email. Thanks.


trolldbois commented 8 years ago

Hi, could you share cimserver.DMP.d in your google drive ? It seems like a corner case, but I don't find an obvious bug in the source code.

rchateauneu commented 8 years ago

Done. Please have a look.


trolldbois commented 8 years ago

Probably a duplicate of #31

trolldbois commented 8 years ago

@rchateauneu can you confirm the bug is gone ?

rchateauneu commented 8 years ago

Yes !! Thanks !!!

trolldbois commented 7 years ago

issue was fixed in #31