psobot / keynote-parser

A packer/unpacker for Apple Keynote presentation files.

Parser breaks when there are Charts #54

Open pxlmnstr opened 1 year ago

pxlmnstr commented 1 year ago

I am unable to parse a Keynote file that contains a ChartDrawableArchive. As soon as I delete the pie chart from the slides, parsing works. Here is the stack trace:

Traceback (most recent call last):
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/codec.py", line 205, in from_buffer
    output = klass(message_payload)
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/codec.py", line 155, in FromString
    patched_field = proto_klass.DESCRIPTOR.fields_by_number[diff_path]
KeyError: 10000

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/codec.py", line 43, in from_buffer
    chunk, data = IWACompressedChunk.from_buffer(data, filename)
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/codec.py", line 107, in from_buffer
    archive, data = IWAArchiveSegment.from_buffer(data, filename)
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/codec.py", line 207, in from_buffer
    raise ValueError(
ValueError: Failed to deserialize functools.partial(<bound method ProtobufPatch.FromString of <class 'keynote_parser.codec.ProtobufPatch'>>, type: 0 version: 65535 version: 65535 version: 4294967295 length: 2 base_message_index: 0 diff_merge_version: 2 diff_merge_version: 3 diff_merge_version: 4294967295 diff_field_path { path: 10000 } diff_read_version: 2 diff_read_version: 0 diff_read_version: 25 , <class 'TSCHArchives_pb2.ChartDrawableArchive'>) payload of length 2: 10000

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/file_utils.py", line 230, in process
    process_file(filename, handle, sink, replacements, raw, on_replace)
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/file_utils.py", line 175, in process_file
    file = IWAFile.from_buffer(contents, filename)
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/codec.py", line 49, in from_buffer
    raise_from(ValueError("Failed to deserialize " + filename), e)
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/future/utils/__init__.py", line 403, in raise_from
    exec(execstr, myglobals, mylocals)
  File "<string>", line 1, in <module>
ValueError: Failed to deserialize Index/Slide-1047172.iwa

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File cat_command(self.keynote_file, "Index/Document.iwa")
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/command_line.py", line 37, in cat_command
    process(
  File "/Users/x/.pyenv/versions/3.10.1/lib/python3.10/site-packages/keynote_parser/file_utils.py", line 232, in process
    raise ValueError("Failed to process file %s due to: %s" % (filename, e))
ValueError: Failed to process file Index/Slide-1047172.iwa due to: Failed to deserialize Index/Slide-1047172.iwa
Exception ignored in: <k2pp.ListStream object at 0x151f42650>
AttributeError: 'ListStream' object has no attribute 'flush'
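For anyone reading the first traceback: the root failure is a plain dictionary lookup, fields_by_number[diff_path], that has no entry for 10000. The sketch below reproduces only that failure mode with ordinary dicts. FakeDescriptor, its extensions_by_number attribute, and the field names are made up for illustration and are not the keynote_parser or protobuf API. The guess that 10000 is registered somewhere other than the regular field table (for example as an extension) is an assumption, not something the trace confirms.

```python
class FakeDescriptor:
    """Stand-in for a message descriptor, built from plain dicts (hypothetical)."""

    def __init__(self, fields_by_number, extensions_by_number):
        self.fields_by_number = fields_by_number
        # Hypothetical second table for numbers that are not regular fields.
        self.extensions_by_number = extensions_by_number


def lookup_field(descriptor, number):
    """Return the name registered for `number`, or None if it is unknown.

    Unlike a bare `fields_by_number[number]`, this never raises KeyError.
    """
    if number in descriptor.fields_by_number:
        return descriptor.fields_by_number[number]
    return descriptor.extensions_by_number.get(number)


# Made-up numbers and names, chosen only to mirror the shape of the error.
chart = FakeDescriptor(
    fields_by_number={1: "regular_field"},
    extensions_by_number={10000: "hypothetical_extension"},
)

try:
    chart.fields_by_number[10000]  # same kind of lookup as in the traceback
except KeyError as exc:
    print("direct lookup fails with KeyError:", exc)

print("fallback lookup finds:", lookup_field(chart, 10000))
```

Whether a fallback like this is the right fix for the parser depends on how it is meant to handle patched fields, which the trace alone does not show.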

vesko commented 1 year ago

Getting the same error here trying to parse a deck with charts. Were you able to figure out a workaround for this, by any chance?

gian-didom commented 10 months ago

Same issue here.

pxlmnstr commented 10 months ago

My only workaround was to remove the charts, which was okay-ish for my use case.
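To find which slides need their charts removed, it can help to try each .iwa file on its own instead of stopping at the first failure. This is a minimal sketch, assuming the .key file is a plain zip archive. find_failing_iwa and its parse argument are hypothetical names and not part of keynote_parser. parse is any callable taking (contents, filename) that raises on failure.

```python
import zipfile


def find_failing_iwa(keynote_path, parse):
    """Try `parse` on every .iwa entry and collect the ones that raise.

    keynote_path: path (or file-like object) of a zip-format .key file.
    parse: callable taking (contents: bytes, filename: str).
    Returns a dict mapping the failing entry name to the exception raised.
    """
    failures = {}
    with zipfile.ZipFile(keynote_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".iwa"):
                continue  # skip images, metadata and other non-IWA entries
            try:
                parse(archive.read(name), name)
            except Exception as exc:  # keep going so every bad file is listed
                failures[name] = exc
    return failures
```

With keynote_parser installed, parse could presumably wrap the IWAFile.from_buffer(contents, filename) call that appears in the traceback (assuming it can be imported from keynote_parser.codec). The keys of the returned dict, such as Index/Slide-1047172.iwa, then point at the slides to edit.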

flokru commented 10 months ago

I get similar problems when a Keynote file contains tables.