scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License

Changing Chart Repair Error #396

Open ghost opened 6 years ago

ghost commented 6 years ago

Hey!

I'm trying to change the data within an existing chart. Just to start out, I'm trying to use the exact code in the API documentation:

from pptx.chart.data import ChartData

slide = prs.slides[1]  # prs is an already-opened Presentation
for shape in slide.shapes:
    if not shape.has_text_frame:
        if shape.has_chart:
            chart = shape.chart
            print(' found a chart')  # this line prints
            chart_data = ChartData()
            chart_data.categories = 'Foobar', 'Barbaz', 'Bazfoo'
            chart_data.add_series('New Series 1', (5.6, 6.7, 7.8))
            chart_data.add_series('New Series 2', (2.3, 3.4, 4.5))
            chart_data.add_series('New Series 3', (8.9, 9.1, 1.2))

            # swap the chart's existing data for the new categories and series
            chart.replace_data(chart_data)

This code runs fine, but when I open the output document I get a dialog box that says: "PowerPoint found a problem with the content in [new_file_name.pptx]. PowerPoint can attempt to repair the presentation. If you trust the source of this presentation, click Repair."

Once I click "Repair", all the contents of slide[1] (the slide whose chart I'm changing) are completely gone.

This doesn't seem right, since I'm using the exact code from the API documentation.

Have you ever encountered this issue? I'm using Python 2.7

pysailor commented 6 years ago

Some hints that might help: I had that same error when I was working with an existing presentation and used deepcopy to copy slides or slide templates.

After a lot of debugging I realised this comes from duplicate entries inside the pptx (zip). When I tried to unzip the pptx that I created (dummy.pptx), I was asked about how to handle a number of duplicate files. E.g. "replace ppt/slideMasters/slideMaster17.xml? [y]es, [n]o, [A]ll, [N]one, [r]ename: y"

And indeed, when I listed the contents of the pptx (zip) and grepped for that name, I saw that the file was present twice:

unzip -l dummy.pptx | grep slideMaster17

12052  2018-04-03 18:16   ppt/slideMasters/slideMaster17.xml
 2168  2018-04-03 18:16   ppt/slideMasters/_rels/slideMaster17.xml.rels
12052  2018-04-03 18:16   ppt/slideMasters/slideMaster17.xml
 2168  2018-04-03 18:16   ppt/slideMasters/_rels/slideMaster17.xml.rels

What do you see when you inspect your generated .pptx that way? Are there any files in there that exist twice under the same path? If so, that is what causes the error message in PowerPoint, and the use of deepcopy is a likely reason for such a duplication. (Not sure if this applies to your case, but maybe this helps.)
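If unzip isn't handy, roughly the same check can be done from Python with only the standard library. This is a minimal sketch, not part of python-pptx: the function name find_duplicate_entries is made up for illustration, and it only reports member names stored more than once in the archive, nothing else about the file's validity.

```python
import zipfile
from collections import Counter


def find_duplicate_entries(pptx_path):
    """Return {member_name: count} for names stored more than once in the zip.

    Hypothetical helper for illustration; a .pptx file is an ordinary zip
    archive, and namelist() returns one entry per stored member, so a path
    that was written twice shows up twice.
    """
    with zipfile.ZipFile(pptx_path) as archive:
        counts = Counter(archive.namelist())
    return {name: n for name, n in counts.items() if n > 1}


# Example use (dummy.pptx is the file name from the unzip listing above):
#     for name, n in sorted(find_duplicate_entries('dummy.pptx').items()):
#         print('%s appears %d times' % (name, n))
```

An empty result means no path is duplicated, in which case the repair error most likely has some other cause.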

(See also https://github.com/scanny/python-pptx/issues/87#issuecomment-40720420).

scanny commented 6 years ago

There are a lot of reasons for a repair error. Basically it happens whenever the XML is invalid. Doing a deepcopy() is one good way :) But there are many others.

Generally we like to prevent folks from getting one of these when they stick to the published API. If we get to the bottom of this one I might add a check on .replace_data() to make sure the existing chart is compatible with the new chart data.