rmusser01 / pefile

Automatically exported from code.google.com/p/pefile

Running out of memory while parsing a file #44

Closed. GoogleCodeExporter closed this issue 9 years ago.

GoogleCodeExporter commented 9 years ago
While parsing the attached file, pefile uses over 4 GB of memory and runs
out of memory on my system.
Code example:

import pefile
fname = "huxley.dll"
pe = pefile.PE(fname)

Running it on Windows.

The file is 12 MB compressed, so here is a place to download it:
https://dl.dropboxusercontent.com/u/44015435/huxley.7z

Original issue reported on code.google.com by gera...@gmail.com on 23 Sep 2013 at 1:51

GoogleCodeExporter commented 9 years ago
Hi,

The file is a rather massive PE. I was able to process it, but it took over 7 GB
of memory to load and produced 47 MB of text output. While I would love to make
pefile a bit more nimble, I don't think it will be trivial to reduce the memory
usage much in this case: the file contains a very large number of relocation
entries.

The text dump (produced by dump_info()) is 2,196,000 lines, of which 2,149,378
are relocation entries. The relocations can be skipped by loading the file with
fast_load=True and then parsing only the remaining directories:

pe = pefile.PE("huxley.dll", fast_load=True)
pe.parse_data_directories(directories=[
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_IMPORT'],
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_EXPORT'],
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_RESOURCE'],
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_DEBUG'],
    # pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BASERELOC'],  # <- skip this
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_TLS'],
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT'],
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT'],
])

That takes "only" a bit over 2 GB of memory and produces 17 KB of text output
in a few seconds. I hope that is a reasonable workaround for your application.
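
For completeness, here is a minimal sketch of how the skip could be made
conditional rather than unconditional, by reading the relocation directory's
declared size from the headers before deciding what to parse. The
RELOC_SIZE_LIMIT name and the 1 MB cutoff are assumptions for illustration,
not pefile constants:

import pefile

# Assumed threshold: skip parsing relocations when the headers declare
# more than 1 MB of raw relocation data.
RELOC_SIZE_LIMIT = 1024 * 1024

pe = pefile.PE("huxley.dll", fast_load=True)  # parse headers only

reloc_index = pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_BASERELOC']
reloc_size = pe.OPTIONAL_HEADER.DATA_DIRECTORY[reloc_index].Size

directories = [
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_IMPORT'],
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_EXPORT'],
    pefile.DIRECTORY_ENTRY['IMAGE_DIRECTORY_ENTRY_RESOURCE'],
]
if reloc_size <= RELOC_SIZE_LIMIT:
    directories.append(reloc_index)  # small enough to parse safely

pe.parse_data_directories(directories=directories)

With fast_load=True only the headers are parsed, so the DATA_DIRECTORY sizes
are available without paying the cost of parsing the directory contents.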

Original comment by ero.carr...@gmail.com on 9 Oct 2013 at 10:01

GoogleCodeExporter commented 9 years ago
Thanks, that's the workaround we went with as well.

Original comment by gera...@gmail.com on 9 Oct 2013 at 10:25