scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License
2.28k stars 502 forks source link

Update a chart in PowerPoint with multiple plots using Python-pptx #877

Open NunoTrigo1986 opened 1 year ago

NunoTrigo1986 commented 1 year ago

I have the following chart in PowerPoint:

image

I read the chart data in python using the python pptx library.

chart = shapes_list[19].chart 
plot = chart.plots 
len(plot)
the length of the plot is 4 in different plots.

In the four plots, just two have categories, and just one is shown on the diagram. It is plot[2]

plot_categories= [c.label for c in plot[2].categories]

the chart has four series although just three are shown.

std_start_graf_navn = plot[0].series[0].name    #series name > Std. start graf
std_navn=plot[0].series[1].name                 #series name > Standardavvik
snitt_navn = plot[2].series[0].name             #series name  > snitt
snitt1_navn= plot8[3].series[0].name            #series name  > snitt om et år

I do change data with this code:

chart_data_plot0 = CategoryChartData()
chart_data_plot0.add_series(std_start_graf_navn,t_std_start_graf_ny)

chart_data_plot2 = CategoryChartData()
chart_data_plot2.add_series(snitt_navn,t_snitt_ny)

chart_data_plot3 = CategoryChartData()
chart_data_plot3.add_series(snitt1_navn,t_snitt1_ny)

What I need help with is that when I try to replace data on one plot, the other disappears. The length of the plot goes from 4 to 1.

When I run:

plot[0].chart.replace_data(chart_data_plot0) I do get an error when trying:

plot[1].chart.replace_data(chart_data_plot2) 
plot[2].chart.replace_data(chart_data_plot3)

This happens because the original length of the plot changes from 4 to one. Is there a workaround to this issue?

riskassure commented 1 year ago

Have you figured it out yet? I am having the same issue: I have a chart with two plots. The first plot has one series and the second has two. I set up my code as follows:

chart = shapes[5].chart

chart_dat = CategoryChartData()

cats = [c.label for c in chart.plot[0].categories] chart_dat.categories = cats

for idx, srs in enumerate(chart.series): chart_dat.add_series(srs.name, dat_lists[idx], number_format='0.0%')

chart.replace_data(chart_dat)

I didn't get an error, and the chart still has two plots and three data series. However, only the first data series is updated, while the remaining are unchanged.

Dasc3er commented 11 months ago

Hi all, I share a function which works (by my tests) to update these kind of charts starting from a Pandas DataFrame. In practice, the issue is that at the XML level the tags for data vary between category based charts and not. The solution was found by looking at XML code by doing the changes via PowerPoint and comparing the resulting ones from python-pptx. Reference GIST: https://gist.github.com/Dasc3er/2af5069afb728c39d54434cb28a1dbb8

This uses another function shared at https://github.com/scanny/python-pptx/issues/132#issuecomment-1643400011

chart = shape.chart

def update(shape, data):
    shape.chart.replace_data(
        dataframe_to_chart_data(data)
    )

    # Fix for filling non category charts (XY, Bubble)
    id_attribute = '{http://schemas.openxmlformats.org/officeDocument/2006/relationships}id'

    chart_ref_id = shape.element.xpath(".//c:chart")[0].attrib[id_attribute]
    chart_part = shape.part.rels._rels[chart_ref_id].target_part
    chart_part._element.xpath(".//c:autoUpdate")[0].set("val", "1")

    point_series = chart_part._element.xpath(".//c:xVal")
    import copy
    for s in point_series:
        series_ref = s.getparent()

        # Find new reference (category information)
        x = series_ref.xpath(".//c:cat")[0]
        y = series_ref.xpath(".//c:val")[0]

        # Find existing reference (XY, Bubble)
        prev_x = series_ref.xpath(".//c:xVal")[0]
        prev_y = series_ref.xpath(".//c:yVal")[0]

        # Clean old contents
        [prev_x.remove(c) for c in prev_x.getchildren()]
        [prev_y.remove(c) for c in prev_y.getchildren()]

        # Add new contents
        [prev_x.append(c) for c in copy.deepcopy(x).getchildren()]
        [prev_y.append(c) for c in copy.deepcopy(y).getchildren()]

        # Remove category information
        series_ref.remove(x)
        series_ref.remove(y)
riskassure commented 11 months ago

This is great. The data update worked, at least in my case (a single chart with 2 plots and 3 data series). Thank you!! However, the format of the data labels got changed as well. Before, the label was formatted in 0.0%, and now it got wiped and the formatted became "general" (or no format). Is there any way to modify the update function above so that the format of the data labels is retained?

Or, another thought is: is there a way to change the format of a data series. Apparently, this cannot be done either. The situation is the same as with data update: only the format with respect to the first plot can be changed, not the rest.

riskassure commented 11 months ago

I think I found a solution: just modify your code above with two additional lines:

/

# get the existing format code reference before it gets wiped out
# put this after where y is defined
f = series_ref.xpath(".//c:formatCode")[0]

# append the format code reference after the data has been updated
# put this after where new content has been appended
[f.append(c) for c in copy.deepcopy(f).getchildren()]

So far, I haven't found any issues with it....

NunoTrigo1986 commented 2 months ago

could you publish your code. still strugling with this ... thanks