tcalmant / python-javaobj

Extended fork of python-javaobj from http://code.google.com/p/python-javaobj/
Apache License 2.0

Parsing a sample using a LinkedHashMap still fails #29

Closed huettenhain closed 4 years ago

huettenhain commented 4 years ago

As mentioned in #23, I am having a similar problem with a piece of serialized data. The data comes from a Java malware sample that I analyzed a while ago, so I did not write it myself. However, I was able to deserialize it with jdeserialize, which yielded OssePatterned.txt.zip. Note that this file only contains a list of path names and key phrases that the malware will decrypt and load next; there is no malicious code in the file itself.

Using the following code:

import javaobj

with open("OssePatterned.jser", "rb") as fd:
    jobj = fd.read()

pobj = javaobj.loads(jobj)
print(pobj)

I get the following error:

Traceback (most recent call last):
  File "C:\Python38\lib\site-packages\javaobj\core.py", line 599, in _read_and_exec_opcode
    handler = self.opmap[opid]
KeyError: 8

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "M:\jds\decode.py", line 6, in <module>
    pobj = javaobj.loads(jobj)
  File "C:\Python38\lib\site-packages\javaobj\core.py", line 143, in loads
    return load(
  File "C:\Python38\lib\site-packages\javaobj\core.py", line 125, in load
    return marshaller.readObject(ignore_remaining_data=ignore_remaining_data)
  File "C:\Python38\lib\site-packages\javaobj\core.py", line 530, in readObject
    _, res = self._read_and_exec_opcode(ident=0)
  File "C:\Python38\lib\site-packages\javaobj\core.py", line 607, in _read_and_exec_opcode
    return opid, handler(ident=ident)
  File "C:\Python38\lib\site-packages\javaobj\core.py", line 920, in do_object
    opcode, obj = self._read_and_exec_opcode(ident=ident + 1)
  File "C:\Python38\lib\site-packages\javaobj\core.py", line 601, in _read_and_exec_opcode
    raise RuntimeError(
RuntimeError: Unknown OpCode in the stream: 0x8 (at offset 0x7C)
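
The two chained exceptions in the traceback above come from a single table lookup: a failed dictionary access raises `KeyError`, and the handler for it raises `RuntimeError`. The sketch below is a simplified illustration of that pattern, not the library's actual code; `TC_NAMES` and `read_opcode` are hypothetical names. It relies on one fact from the Java Object Serialization specification: type codes (`TC_*`) occupy the range 0x70 to 0x7E, so 0x08 is not a type code at all. That is consistent with the parser reading a byte from the wrong position, although the thread does not confirm the root cause.

```python
# Simplified illustration only: TC_NAMES and read_opcode are hypothetical
# names and do not mirror javaobj's internals.

# Type codes defined by the Java Object Serialization Stream Protocol.
TC_NAMES = {
    0x70: "TC_NULL",
    0x71: "TC_REFERENCE",
    0x72: "TC_CLASSDESC",
    0x73: "TC_OBJECT",
    0x74: "TC_STRING",
    0x75: "TC_ARRAY",
    0x76: "TC_CLASS",
    0x77: "TC_BLOCKDATA",
    0x78: "TC_ENDBLOCKDATA",
    0x79: "TC_RESET",
    0x7A: "TC_BLOCKDATALONG",
    0x7B: "TC_EXCEPTION",
    0x7C: "TC_LONGSTRING",
    0x7D: "TC_PROXYCLASSDESC",
    0x7E: "TC_ENUM",
}


def read_opcode(stream: bytes, offset: int) -> str:
    """Return the type-code name at `offset`, or raise RuntimeError."""
    opid = stream[offset]
    try:
        return TC_NAMES[opid]
    except KeyError:
        # Raising inside the except block chains the two exceptions, which
        # produces the "During handling of the above exception" message.
        raise RuntimeError(
            "Unknown OpCode in the stream: 0x{:X} (at offset 0x{:X})".format(
                opid, offset
            )
        )
```

Calling `read_opcode` on a byte such as 0x73 returns a name, while a byte outside the table raises a `RuntimeError` whose `__context__` is the original `KeyError`.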

tcalmant commented 4 years ago

Small update on the issue:

I didn't know about jdeserialize, but seeing its code makes me think that the initial developer of javaobj was heavily inspired by it. I think I might do a full rewrite of some parts of javaobj to bring them closer to the jdeserialize implementation. Some parts of the code would be easier to comment, and javaobj would be closer to the specification.

huettenhain commented 4 years ago

Hey @tcalmant, thanks a lot for looking into this!

tcalmant commented 4 years ago

You can take a look at the deserialize branch: it provides a new parsing mechanism ported from the jdeserialize project.

It's still a work in progress, and maps are not converted as expected... but the file is parsed.

tcalmant commented 4 years ago

Hi,

I released version 0.4.0 of python-javaobj (both on GitHub and PyPI). It should solve your issue if you use the v2 implementation: just replace import javaobj with import javaobj.v2 as javaobj where necessary. More information is available in the README file.

I'll let you close the issue if the new implementation solves it.
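
For readers applying the import swap described above, here is a minimal sketch. It assumes python-javaobj 0.4.0 or later is installed; the helper names `get_javaobj` and `load_file` are hypothetical and not part of the library. The fallback to the original module is an optional convenience for environments with an older release, not something the maintainer recommends in this thread.

```python
import importlib


def get_javaobj():
    """Return the v2 parser when available, else the original module.

    Hypothetical helper: javaobj.v2 ships with python-javaobj 0.4.0+.
    """
    try:
        return importlib.import_module("javaobj.v2")
    except ImportError:
        # Older releases have no v2 package; fall back to the v1 parser.
        # This raises ImportError if python-javaobj is not installed at all.
        return importlib.import_module("javaobj")


def load_file(path):
    """Read a serialized Java object from `path` and parse it."""
    parser = get_javaobj()
    with open(path, "rb") as fd:
        return parser.loads(fd.read())
```

With this in place, the script from the original report reduces to `print(load_file("OssePatterned.jser"))`.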

huettenhain commented 4 years ago

Works perfectly, thanks a lot!