cclgroupltd / ccl-ssns

Automatically exported from code.google.com/p/ccl-ssns

Another error from an unclean Chrome exit #2

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
1. Parse another unclean Chrome exit file from another PC (file fragment 
follows)
2. A struct.error occurs.

The error I'm getting is:

Processing begins...
Error reading record begining at data offset 2315120.
Error caused by: unpack requires a bytes object of length 4.
Tracebck follows for debugging:

---------------EXCEPTION BEGINS---------------
Traceback (most recent call last):
  File "ccl_ssns.py", line 435, in load_iter
    command = read_command(f)
  File "ccl_ssns.py", line 380, in read_command
    return read_tab_restore_command(command_buffer, command_id)
  File "ccl_ssns.py", line 414, in read_tab_restore_command
    state = WebHistoryItem.from_bytes(state_blob[4:]) # first 32bits is the internal pickle size. We dont' need it.
  File "ccl_ssns.py", line 150, in from_bytes
    return cls.from_stream(f)
  File "ccl_ssns.py", line 156, in from_stream
    version, = struct.unpack("<i", f.read(4))
struct.error: unpack requires a bytes object of length 4
----------------EXCEPTION ENDS----------------

NB: No further records will be read.
Processing finished.

This was with version 0.7, under Windows 8 64-bit (again with Chrome 23)

The relevant excerpt follows:

          0001 0203 0405 0607 0809 0A0B 0C0D 0E0F - 0123456789ABCDEF
--------------------------------------------------------------------
0x235370: E101 06DC 0100 006F 0400 0000 0000 005D - á..Ü...o.......]
0x235380: 0000 0068 7474 703A 2F2F 7777 772E 7265 - ...http://www.re
0x235390: 6464 6974 2E63 6F6D 2F72 2F47 616D 6573 - ddit.com/r/Games
0x2353A0: 2F63 6F6D 6D65 6E74 732F 3131 7265 6136 - /comments/11rea6
0x2353B0: 2F77 6974 685F 6761 6D65 735F 6C69 6B65 - /with_games_like
0x2353C0: 5F74 7261 696E 5F73 696D 756C 6174 6F72 - _train_simulator
0x2353D0: 5F62 6569 6E67 5F73 6F6D 6577 6861 742F - _being_somewhat/
0x2353E0: 0000 0078 0000 0057 0069 0074 0068 0020 - ...x...W.i.t.h. 
0x2353F0: 0067 0061 006D 0065 0073 0020 006C 0069 - .g.a.m.e.s. .l.i
0x235400: 006B 0065 0020 0054 0072 0061 0069 006E - .k.e. .T.r.a.i.n
0x235410: 0020 0053 0069 006D 0075 006C 0061 0074 - . .S.i.m.u.l.a.t
0x235420: 006F 0072 0020 0062 0065 0069 006E 0067 - .o.r. .b.e.i.n.g
0x235430: 002E 002E 002E 0020 0073 006F 006D 0065 - ....... .s.o.m.e
0x235440: 0077 0068 0061 0074 0020 0070 006F 0070 - .w.h.a.t. .p.o.p
0x235450: 0075 006C 0061 0072 002E 0020 0057 006F - .u.l.a.r... .W.o
0x235460: 0075 006C 0064 006E 0027 0074 0020 0061 - .u.l.d.n.'.t. .a
0x235470: 0020 0053 0070 0061 0063 0065 0073 0068 - . .S.p.a.c.e.s.h
0x235480: 0069 0070 0020 0053 0069 006D 0075 006C - .i.p. .S.i.m.u.l
0x235490: 0061 0074 006F 0072 0020 0062 0065 0020 - .a.t.o.r. .b.e. 
0x2354A0: 0069 006E 0063 0072 0065 0064 0069 0062 - .i.n.c.r.e.d.i.b
0x2354B0: 006C 0079 0020 0070 006F 0070 0075 006C - .l.y. .p.o.p.u.l
0x2354C0: 0061 0072 003F 0020 003A 0020 0047 0061 - .a.r.?. .:. .G.a
0x2354D0: 006D 0065 0073 0000 0000 0008 0000 0000 - .m.e.s..........
0x2354E0: 0000 0000 0000 0001 0000 005D 0000 0068 - ...........]...h
0x2354F0: 7474 703A 2F2F 7777 772E 7265 6464 6974 - ttp://www.reddit
0x235500: 2E63 6F6D 2F72 2F47 616D 6573 2F63 6F6D - .com/r/Games/com
0x235510: 6D65 6E74 732F 3131 7265 6136 2F77 6974 - ments/11rea6/wit
0x235520: 685F 6761 6D65 735F 6C69 6B65 5F74 7261 - h_games_like_tra
0x235530: 696E 5F73 696D 756C 6174 6F72 5F62 6569 - in_simulator_bei
0x235540: 6E67 5F73 6F6D 6577 6861 742F 0000 0000 - ng_somewhat/....
0x235550: 0000 005D CD06 58CD 0000 BE02 0000 0000 - ...]Í.XÍ..¾.....
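
As a rough check on the excerpt, the first 19 bytes of the record can be unpacked by hand. This is only a sketch: the field layout (a 16-bit record size, an 8-bit command id, a 32-bit pickle size, two 32-bit integers, then a 32-bit URL length) is an assumption read off the bytes above, and the variable names are hypothetical rather than taken from ccl_ssns.py.

```python
import struct

# The excerpt starts at 0x235370, which is the data offset in the error message.
record_offset = 0x235370

# First 19 bytes of the record, copied from the hex dump above.
header = bytes.fromhex("E10106DC0100006F040000000000005D000000")

# Assumed layout (little-endian, no padding): uint16 record size, uint8 command id,
# int32 pickle size, two int32 values (named field_a/field_b here because their
# meaning is not stated in the thread), int32 URL length.
record_size, command_id, pickle_size, field_a, field_b, url_length = struct.unpack(
    "<HBiiii", header
)

# The URL as it appears in the ASCII column of the dump.
url = ("http://www.reddit.com/r/Games/comments/11rea6/"
       "with_games_like_train_simulator_being_somewhat/")

print(record_offset, record_size, command_id, pickle_size, url_length, len(url))
```

Under that assumed layout the numbers are self-consistent: the record size (481) equals one command id byte plus the 4-byte pickle size plus the 476-byte pickle, and the URL length field (93) matches the URL visible in the dump.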

I don't really *need* to extract data from this one, but why keep bugs? :)

Original issue reported on code.google.com by GSchizas on 15 Nov 2012 at 9:27

GoogleCodeExporter commented 9 years ago
I did some debugging myself, and it seems that the offending line is this:

414: state = WebHistoryItem.from_bytes(state_blob[4:])

because at that point state_blob is equal to b'' (it is empty) and state_length 
is equal to 0.
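
A minimal sketch of why an empty blob ends in this exception, assuming only what the traceback shows (a 4-byte read followed by struct.unpack("<i", ...)). The helper read_int32 is a hypothetical stand-in, not a function from ccl_ssns.py.

```python
import io
import struct


def read_int32(stream):
    """Read a little-endian 32-bit integer, mirroring the call in the traceback."""
    value, = struct.unpack("<i", stream.read(4))
    return value


state_blob = b""

# Slicing past the end of a bytes object is legal and simply yields b"",
# so the failure only shows up later, when four bytes are requested.
payload = state_blob[4:]

error_message = None
try:
    read_int32(io.BytesIO(payload))
except struct.error as exc:
    # The exact wording of the message differs between Python versions.
    error_message = str(exc)

print(repr(payload), error_message)
```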

Original comment by GSchizas on 15 Nov 2012 at 9:59

GoogleCodeExporter commented 9 years ago
I've added these lines instead:

    # Parse state
    if state_length > 4:
        #print(state_blob[4:])
        state = WebHistoryItem.from_bytes(state_blob[4:]) # The first 32 bits are the internal pickle size; we don't need them.
    else:
        state = WebHistoryItem(url, None, None, None, None, None, None, None, None, 
                   None, None, None, None, None, None, None,
                   None, None, None, None)

and it seems to work...

Original comment by GSchizas on 15 Nov 2012 at 10:11

GoogleCodeExporter commented 9 years ago
I'll take a look at the structure of that fragment and see if I can get 
something from it. At a glance it doesn't look like exactly the same problem, 
so I think we can still get more data from it. Expect a patch later today!

Alex

Original comment by i...@ccl-forensics.com on 15 Nov 2012 at 10:28

GoogleCodeExporter commented 9 years ago
OK, the problem was that the state blob had zero length (which seems weird, but 
is clearly allowable), so your fix was pretty much right. However, you shouldn't 
be passing the url into the WebHistoryItem constructor there, as it comes 
from a different part of the file structure. I've committed the change; if you 
get a chance to confirm that it's working for you now, I'll close the issue.
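
For readers following along, one possible shape for such a guard is sketched below. It is not the committed change: parse_state, PICKLE_SIZE_PREFIX and read_version are hypothetical names, parse_item stands in for WebHistoryItem.from_bytes, and returning None for an empty blob is an assumption made for this example.

```python
import struct

# The state blob starts with a 32-bit internal pickle size that is skipped.
PICKLE_SIZE_PREFIX = 4


def parse_state(state_blob, parse_item):
    """Parse the state payload, or return None when the blob carries no state.

    parse_item is a stand-in for WebHistoryItem.from_bytes. The url is
    deliberately not passed in, since it comes from a different part of
    the record.
    """
    if len(state_blob) <= PICKLE_SIZE_PREFIX:
        return None
    return parse_item(state_blob[PICKLE_SIZE_PREFIX:])


def read_version(payload):
    """Toy parser for this sketch: read the leading 32-bit version field."""
    version, = struct.unpack("<i", payload[:4])
    return version


empty_result = parse_state(b"", read_version)
# A made-up blob: 4-byte size prefix followed by a 4-byte version of 7.
sample_result = parse_state(struct.pack("<ii", 4, 7), read_version)
print(empty_result, sample_result)
```

Keeping the guard in one small function means the caller can decide separately how to represent a record that has a URL and title but no state.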

Thanks again for the bug reports, truly appreciate you taking the time.

Alex

Original comment by i...@ccl-forensics.com on 15 Nov 2012 at 11:14

GoogleCodeExporter commented 9 years ago
I had to come back home, but I can verify that the latest version (0.8) works 
properly with the file I had here :)

Original comment by GSchizas on 15 Nov 2012 at 7:25

GoogleCodeExporter commented 9 years ago
Great stuff, thanks for testing!

Original comment by i...@ccl-forensics.com on 16 Nov 2012 at 9:16