scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License

feature: Slide.duplicate() #132

Open AlexMooney opened 9 years ago

AlexMooney commented 9 years ago

In order to create a presentation by populating template slides with dynamic data
As a developer using python-pptx
I need the ability to clone a slide

API suggestion:

cloned_slide = prs.slides.clone_slide(original_slide)

The cloned slide would be appended to the end of the presentation and would be functionally equivalent to copying and pasting a slide using the PPT GUI.

scanny commented 5 years ago

@jsolack show the minimum code and full stack trace, don't ask us to spend our time guessing.

marcomilov commented 5 years ago

Has anyone tried duplicating the same slide multiple times into a new pptx file? The code from @nshgraph works for me if I duplicate a slide only one time, although I always get a prompt to let powerpoint repair the file.
When duplicating more than once, the first slide will be shown but the others are just blank... I also get a lot of these error warnings when saving to pptx.

biggihs commented 5 years ago

@marcomilov My "hack" was used in such a way. It appears to me that you are not incrementing or creating new unique names for the slide objects for each duplication. You could try duplicating the duplicated page; that should result in new names and might be a quick patch. Good luck.

motassimbakali commented 5 years ago

EDIT: Never mind :). I edited the code a little bit (removed some parts and edited one other thing) and it now works for my case. If anyone encounters the same error, this is what I did. Hopefully it can help you!

def copy_slide(pres, pres1, index):
    source = pres.slides[index]

    # Use a blank layout taken from the destination presentation
    blank_slide_layout = _get_blank_slide_layout(pres1)
    dest = pres1.slides.add_slide(blank_slide_layout)

    for shp in source.shapes:
        el = shp.element
        newel = copy.deepcopy(el)
        dest.shapes._spTree.insert_element_before(newel, 'p:extLst')

    # Return only after every shape has been copied
    return dest

I have tried implementing the code of @zhong2000. I have two separate .pptx files with the same theme, slide master and slide layouts. When I copy a slide from one presentation to the other, I get the error: UserWarning: Duplicate name: 'ppt/slideLayouts/slideLayout1.xml' (also for ppt/theme/theme1.xml and ppt/slideMasters/slideMaster1.xml).

When I look into the files within the generated PowerPoint file, I see the duplicates showing up.

I think that copy.deepcopy copies the slide including its slide master, slide layouts and theme. Is there a way to copy only the slide, instead of everything (or maybe increment the index of the slide master, slide layouts and theme), so I don't get a duplicate error?

When I open the saved file in PowerPoint, the slides are pasted (so copying does work). It just shows a repair-box all the time, which has to do with the above. Copying slides within one PowerPoint presentation works completely fine without getting any errors.
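
The "Duplicate name" UserWarning quoted above is emitted by Python's zipfile module when the same member name is written into an archive twice. A minimal sketch for checking a saved file for such duplicates, using only the standard library (the function name `duplicate_members` is made up for this illustration and is not part of python-pptx):

```python
import zipfile
from collections import Counter


def duplicate_members(pptx_file):
    """Return the sorted part names that occur more than once in the archive.

    pptx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(pptx_file) as zf:
        counts = Counter(zf.namelist())
    return sorted(name for name, n in counts.items() if n > 1)
```

Calling `duplicate_members('2.pptx')` on the file saved below would be expected to list the layout, master and theme parts named in the warnings, if they were indeed written twice.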

Code for saving the file (on top of code from @zhong2000 ):

from pptx import Presentation
import six
import copy

prs1 = Presentation('test2.pptx')
prs2 = Presentation('test.pptx')

copy_slide(prs1, prs2, 0)

prs2.save('2.pptx')

@robintw Thank you. With a slight modification of the code, it is able to copy a slide from a template into a new presentation. It is a great improvement for me. My project is automatic PowerPoint report generation, and I previously made many templates for various requirements. Now I can consolidate identical formats into one file, then iterate over slide copying and content substitution. Known bug: the background and some formatting will be lost. Although it is not critical for me, I hope you can help.

    def _get_blank_slide_layout(pres):
        layout_items_count = [len(layout.placeholders) for layout in pres.slide_layouts]
        min_items = min(layout_items_count)
        blank_layout_id = layout_items_count.index(min_items)
        return pres.slide_layouts[blank_layout_id]

    def copy_slide(pres, pres1, index):
        source = pres.slides[index]

        blank_slide_layout = _get_blank_slide_layout(pres)
        dest = pres1.slides.add_slide(blank_slide_layout)

        # Copy every shape element into the new slide's shape tree
        for shp in source.shapes:
            el = shp.element
            newel = copy.deepcopy(el)
            dest.shapes._spTree.insert_element_before(newel, 'p:extLst')

        # Copy the relationships once, after all shapes have been copied
        for key, value in six.iteritems(source.rels):
            # Make sure we don't copy a notesSlide relation as that won't exist
            if "notesSlide" not in value.reltype:
                dest.rels.add_relationship(value.reltype, value._target, value.rId)

        return dest

khouryrami commented 5 years ago

What about duplicating the layout and formatting? I am looking into taking all the layouts and designs in Slide 1 and populating them with text/content from Slide 2. The task seems to be really hectic... any recommendations would be appreciated. Thanks!

mightmay commented 5 years ago

I am also looking for this feature. My use case is to merge a bunch of PowerPoint files into one.

I am also trying to merge PowerPoint files. Did anyone find a way to do it?

baek0597 commented 5 years ago

I am trying to extract some slides from several pptx files. The code below works very well sometimes, but it does not work when the slides include an external hyperlink or something similar. It raises the error 'str' object has no attribute 'rels' when the presentation is saved.

If I remove the lines "for key, value in source.part.rels.items(): ... dest.part.rels.add_relationship(value.reltype, value._target, value.rId)", it works, but the quality of the copied slide is not good.

Please help me. I want to extract good-quality slides from a pptx file. How can I fix the error 'str' object has no attribute 'rels'? I am using python-pptx v0.6.18 and Python 3.7.

def _get_blank_slide_layout(pres):
    layout_items_count = [len(layout.placeholders)
                          for layout in pres.slide_layouts]
    min_items = min(layout_items_count)
    blank_layout_id = layout_items_count.index(min_items)
    return pres.slide_layouts[blank_layout_id]

def copy_slide(pres, pres1, index):
    source = pres.slides[index]
    blank_slide_layout = _get_blank_slide_layout(pres)
    dest = pres1.slides.add_slide(blank_slide_layout)

    for shape in source.shapes:
        newel = copy.deepcopy(shape.element)
        dest.shapes._spTree.insert_element_before(newel, 'p:extLst')

    for key, value in source.part.rels.items():
        if not "notesSlide" in value.reltype:
            dest.part.rels.add_relationship(value.reltype, value._target, value.rId)
    return dest
micbenn commented 5 years ago

If you are looking for a way to keep the notes slides in your PowerPoint, you can add the following few lines to the code above (adding this because I was looking around for a while):

if source.has_notes_slide:
    txt = source.notes_slide.notes_text_frame.text
    dest.notes_slide.notes_text_frame.text = txt

baek0597 commented 5 years ago

@micbenn Thanks. I still have not solved my problem, but after some testing I found one cause of the 'str' object has no attribute 'rels' error: it occurs when I save a presentation that includes a shape with an external hyperlink. So I want to remove the external hyperlink from the shape before the presentation is saved. How can I remove an external hyperlink from a shape? Please help me.

scanny commented 4 years ago

@arosasg a placeholder on a slide is a reference to a layout-placeholder on its slide layout.

If you add a placeholder shape to a slide where its reference to a layout-placeholder is broken, you can get some of these behaviors (like going to top-left corner). This could be because you chose the wrong slide-layout for that slide or because the new slide-layout doesn't have a placeholder with the matching key (idx=...).
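
A small sketch of how one might check for the broken references described above. It relies only on `slide.placeholders`, `slide.slide_layout.placeholders` and `placeholder_format.idx`; the helper name `unmatched_placeholders` is made up for this illustration, and the check is an assumption about what is useful to inspect rather than anything python-pptx provides itself:

```python
def unmatched_placeholders(slide):
    """Return the slide placeholders whose idx has no counterpart on the slide layout."""
    layout_idxs = {
        ph.placeholder_format.idx for ph in slide.slide_layout.placeholders
    }
    return [
        ph for ph in slide.placeholders
        if ph.placeholder_format.idx not in layout_idxs
    ]
```

An empty result suggests every placeholder on the slide can find its layout placeholder; any placeholder returned is a candidate for the top-left-corner behavior.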

juria90 commented 4 years ago

In my case, my source slide has a background not as a part of the master or layout, but as a ... structure. So, to make it work, I added the following code at the end of the function, before returning dest.

if source.background:
    el = source.background._cSld.bg.bgPr
    newel = copy.deepcopy(el)
    cSld = dest.background._cSld
    cSld.get_or_add_bgPr()
    cSld.bg._remove_bgPr()
    cSld.bg._insert_bgPr(newel)
NarenZen commented 3 years ago

@micbenn Thanks. I still have not solved my problem, but after some testing I found one cause of the 'str' object has no attribute 'rels' error: it occurs when I save a presentation that includes a shape with an external hyperlink. So I want to remove the external hyperlink from the shape before the presentation is saved. How can I remove an external hyperlink from a shape? Please help me.

@biggihs Did you solve the problem? I'm also facing this issue. Can you please share the solution?

NarenZen commented 3 years ago

Has anyone tried duplicating the same slide multiple times into a new pptx file? The code from @nshgraph works for me if I duplicate a slide only one time, although I always get a prompt to let powerpoint repair the file. When duplicating more than once, the first slide will be shown but the others are just blank... I also get a lot of these error warnings when saving to pptx.

@marcomilov Did you solve the problem? I'm also getting the duplicate-name warning and am unable to open the file in Microsoft PowerPoint. How can we skip saving the slide layouts?

Lirioooo commented 3 years ago

I have a solution for these warnings. All I had to do was add an if before the "dest.part.rels.add_relationship" call, as follows:

for key, value in source.part.rels.items():
    # Make sure we don't copy a notesSlide relation as that won't exist
    if "notesSlide" not in value.reltype:
        target = value._target
        # if the relationship was a chart, we need to duplicate the embedded chart part and xlsx
        if "chart" in value.reltype:
            partname = target.package.next_partname(
                ChartPart.partname_template)
            xlsx_blob = target.chart_workbook.xlsx_part.blob
            target = ChartPart(partname, target.content_type,
                               copy.deepcopy(target._element), package=target.package)

            target.chart_workbook.xlsx_part = EmbeddedXlsxPart.new(
                xlsx_blob, target.package)

        # Skip relationships whose target is an .xml file
        if "xml" not in str(value.target_ref):
            if value.is_external:
                dest.part.rels.add_relationship(value.reltype, value.target_ref, value.rId, value.is_external)
            else:
                dest.part.rels.add_relationship(value.reltype, value._target, value.rId) #value.target_part

Basically, it is an if that skips the .xml files among the iterated items, which in the end are the ones that get duplicated in the records when they are added as a relationship. I can't guarantee that this solution will work for everyone, but at least in my case it produced no warnings and no PowerPoint message box offering to "repair" the presentation, and the slide was duplicated successfully.

lthamm commented 2 years ago

I had to machine translate @Lirioooo's response, but it seemed to suggest checking for .xml file endings in the target ref to avoid a corrupted file when opening a modified file. This did not work for me. I unzipped the output file and, logically, the slide relations don't contain the .xml relations, yet those are required for e.g. the chart to work. My current solution is to downgrade to version 0.6.19; the solution suggested in that answer should then work.
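
For anyone who wants to repeat the unzip check described above without extracting the file by hand, here is a minimal standard-library sketch that lists the relationships of one slide. The function name `slide_relationships` is made up for this illustration, and it assumes the usual part naming ppt/slides/_rels/slideN.xml.rels:

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship (.rels) parts
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def slide_relationships(pptx_file, slide_no):
    """Return (Id, short type, Target, TargetMode) for each relationship of a slide."""
    name = f"ppt/slides/_rels/slide{slide_no}.xml.rels"
    with zipfile.ZipFile(pptx_file) as zf:
        root = ET.fromstring(zf.read(name))
    return [
        (
            rel.get("Id"),
            rel.get("Type").rsplit("/", 1)[-1],  # e.g. "chart", "slideLayout"
            rel.get("Target"),
            rel.get("TargetMode", "Internal"),
        )
        for rel in root.iter(RELS_NS + "Relationship")
    ]
```

Comparing the output for the source slide and the copied slide should show which relationships (for example a chart) went missing in the copy.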

With the current release (version 0.6.21) I did not come up with a working solution. I managed to adapt some parts, but the output is still a broken file. Here is my current try:

def pptx_copy_slide(pres: pptx.Presentation, source: pptx.slide.Slide):
    dest = pres.slides.add_slide(source.slide_layout)
    for shape in dest.shapes:
        shape.element.getparent().remove(shape.element)

    for shape in source.shapes:
        new_shape = copy.deepcopy(shape.element)
        dest.shapes._spTree.insert_element_before(new_shape, 'p:extLst')

    for rel in source.part.rels:
        target = rel._target

        if "notesSlide" in rel.reltype:
            continue

        if 'chart' in rel.reltype:
            # https://github.com/scanny/python-pptx/issues/132#issuecomment-414001942
            partname = target.package.next_partname(pptx.parts.chart.ChartPart.partname_template)
            xlsx_blob = target.chart_workbook.xlsx_part.blob
            target = pptx.parts.chart.ChartPart(
                partname = partname, 
                content_type = target.content_type, 
                element = copy.deepcopy(target._element),
                package=target.package)
            target.chart_workbook.xlsx_part = pptx.parts.chart.EmbeddedXlsxPart.new(
                blob=xlsx_blob, 
                package=target.package)

        if rel.is_external:
            dest.part.rels.get_or_add_ext_rel(rel.reltype, rel._target)
        else:
            dest.part.rels.get_or_add(rel.reltype, rel._target)

    return dest

Ideas why this is not working would be appreciated.

Honestly, I think a different approach would have some benefits. When you unzip the PowerPoint file, duplicating a slide can be done fairly easily, I think:

  1. Duplicate the slide file and rename it
  2. Duplicate the relation file and rename it
  3. Link the slide in the presentation relations ("rels/presentation.xml.rels")

This would be fairly easy to do in Python and you would account for all cases (e.g. no issues with charts or diagrams). The only problem is that you would need to create a temp file and read it in again.

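
The three steps above can be sketched with only the standard library. This is a rough sketch under assumptions, not a recipe verified against real decks: the names `duplicate_slide_in_package` and `_insert_before` are made up for illustration, the XML is edited as text, and the usual `p:`/`r:` prefixes in ppt/presentation.xml and rIdN-style relationship ids are assumed. Besides the three listed steps, a new slide part also needs an entry in the slide id list of ppt/presentation.xml and an Override in [Content_Types].xml, so the sketch adds both. It works on bytes in memory, so no temp file is involved, and it drops the notesSlide relationship from the copy, as the snippets earlier in this thread do.

```python
import io
import re
import zipfile

SLIDE_CT = "application/vnd.openxmlformats-officedocument.presentationml.slide+xml"
SLIDE_RT = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide"


def _insert_before(xml, closing_tag, fragment):
    """Insert fragment just before the last occurrence of closing_tag."""
    head, sep, tail = xml.rpartition(closing_tag)
    if not sep:
        raise ValueError(f"{closing_tag} not found")
    return head + fragment + sep + tail


def duplicate_slide_in_package(pptx_bytes, slide_no):
    """Return new .pptx bytes in which ppt/slides/slide<slide_no>.xml is duplicated."""
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as zin:
        parts = {name: zin.read(name) for name in zin.namelist()}

    # Choose a slide file name that is not in use yet.
    numbers = [int(n) for name in parts
               for n in re.findall(r"^ppt/slides/slide(\d+)\.xml$", name)]
    src, new = f"slide{slide_no}.xml", f"slide{max(numbers) + 1}.xml"

    # Step 1: duplicate the slide file under the new name.
    parts[f"ppt/slides/{new}"] = parts[f"ppt/slides/{src}"]

    # Step 2: duplicate the relation file, without the notesSlide relation.
    src_rels = f"ppt/slides/_rels/{src}.rels"
    if src_rels in parts:
        rels = parts[src_rels].decode("utf-8")
        rels = re.sub(r"<Relationship\b[^>]*notesSlide[^>]*/>", "", rels)
        parts[f"ppt/slides/_rels/{new}.rels"] = rels.encode("utf-8")

    # Step 3: link the new slide in the presentation relations.
    pres_rels = parts["ppt/_rels/presentation.xml.rels"].decode("utf-8")
    used = [int(n) for n in re.findall(r'\sId="rId(\d+)"', pres_rels)]
    r_id = f"rId{max(used, default=0) + 1}"
    rel = f'<Relationship Id="{r_id}" Type="{SLIDE_RT}" Target="slides/{new}"/>'
    parts["ppt/_rels/presentation.xml.rels"] = _insert_before(
        pres_rels, "</Relationships>", rel).encode("utf-8")

    # Extra: append the slide to the slide id list (ids start at 256).
    pres = parts["ppt/presentation.xml"].decode("utf-8")
    ids = [int(n) for n in re.findall(r'<p:sldId\s[^>]*?\bid="(\d+)"', pres)]
    sld_id = f'<p:sldId id="{max(ids, default=255) + 1}" r:id="{r_id}"/>'
    parts["ppt/presentation.xml"] = _insert_before(
        pres, "</p:sldIdLst>", sld_id).encode("utf-8")

    # Extra: declare the content type of the new slide part.
    types = parts["[Content_Types].xml"].decode("utf-8")
    override = f'<Override PartName="/ppt/slides/{new}" ContentType="{SLIDE_CT}"/>'
    parts["[Content_Types].xml"] = _insert_before(
        types, "</Types>", override).encode("utf-8")

    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for name, data in parts.items():
            zout.writestr(name, data)
    return out.getvalue()
```

Every other relationship is copied verbatim, so the duplicate shares its charts and images with the original slide rather than getting its own copies.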
alrdebugne commented 1 year ago

Hey all, has anyone found a stable solution for copying slides? None of the above suggestions worked for me (python-pptx==0.6.19):

FWIW the use case I'm pursuing is splitting a PPT with multiple pages into multiple PPT's with a single page each.

to175 commented 1 year ago

Hi @scanny any updates for this please ?

Mike3285 commented 1 year ago

I don't know if my solution is reliable, and if it will work for all kinds of presentations and charts into them, but with this code I successfully managed to merge two presentations and obtain a clean, uncorrupted file which opened successfully in Microsoft PowerPoint without any warning.

def _get_blank_slide_layout(pres):
    layout_items_count = [len(layout.placeholders) for layout in pres.slide_layouts]
    min_items = min(layout_items_count)
    blank_layout_id = layout_items_count.index(min_items)
    layout_0 = pres.slide_layouts[blank_layout_id]
    for shape in layout_0.shapes:
        sp = shape.element
        sp.getparent().remove(sp)
    return layout_0

def move_slides(prs1, prs2):
    """Duplicate each slide in prs2 and "moves" it into prs1.
    Adds slides to the end of the presentation"""
    for slide in prs2.slides:
        sl = prs1.slides.add_slide(_get_blank_slide_layout(prs1))
        for shape in slide.shapes:
            newel = copy.deepcopy(shape.element)
            sl.shapes._spTree.insert_element_before(newel, 'p:extLst')
        try:
            sl.shapes.title.text = slide.shapes.title.text
            sl.placeholders[0].text = slide.placeholders[0].text
        except Exception as e:
            print(f"Error \"{e}\", suppressing it...")
    return prs1
Benouare commented 1 year ago

This did the job for my case! Thx!

MartinPacker commented 1 year ago

Minor quibble: You're calling _get_blank_slide_layout for every slide you add.
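
A sketch of what hoisting that lookup out of the loop could look like. It uses the same calls as the snippet above (`add_slide`, `copy.deepcopy(shape.element)`, `_spTree.insert_element_before`); the helper name `pick_blank_layout` is made up here, and unlike the original this version neither strips shapes from the layout nor copies the title text:

```python
import copy


def pick_blank_layout(prs):
    """Return the layout with the fewest placeholders (the first one on ties)."""
    return min(prs.slide_layouts, key=lambda layout: len(layout.placeholders))


def move_slides(dst, src):
    """Append a shape-level copy of every slide in src to dst."""
    blank = pick_blank_layout(dst)  # looked up once, outside the loop
    for slide in src.slides:
        new_slide = dst.slides.add_slide(blank)
        for shape in slide.shapes:
            new_el = copy.deepcopy(shape.element)
            new_slide.shapes._spTree.insert_element_before(new_el, "p:extLst")
    return dst
```

`min` with a key returns the first minimal element, which matches the `index(min(...))` logic of `_get_blank_slide_layout`.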

More importantly, I'd be curious to know what kinds of objects are on the slides you're copying in. For example, graphics.

Mike3285 commented 1 year ago

Yeah, that one was a bit messed up. I am now using this code I made, which appears to be working in all the cases I tested.

I made a function to merge many presentations into one, by putting their paths in a list and then feeding it to the function. Here is all the code:

import copy
import os

from pptx import Presentation

def merge_presentations_list(paths: list, final_name: str = None):
    """Merge a list of pptx presentations into a single one."""
    outputPres = Presentation(paths[0])

    for path in paths[1:]:
        templatePres = Presentation(path)
        for i in range(len(templatePres.slides)): # We iterate over all the slides
            move_slide(templatePres, i, outputPres)

    if final_name:
        outputPres.save(final_name)
    return outputPres

def move_slide(copyFromPres: Presentation, slideIndex:int , pasteIntoPres: Presentation):
    """Takes two Presentation objs and an index, copies the slide with index slideIndex from copyFromPres to
    pasteIntoPres.
    returns the slide if everything went well, but the first presentation will contain all the other one's slides
    Thanks to https://stackoverflow.com/a/73954830/12380052 from which I copied the majority of this
    """
    # modeled on https://stackoverflow.com/a/56074651/20159015
    # and https://stackoverflow.com/a/62921848/20159015
    # take the slide at index slideIndex
    slide_to_copy = copyFromPres.slides[slideIndex]
    # Selecting the layout: it should and must be only one, so we take the 1st
    slide_layout = pasteIntoPres.slide_layouts[0]
    # names of other layouts can be found here under step 3:
    # https://www.geeksforgeeks.org/how-to-change-slide-layout-in-ms-powerpoint/
    # The layout we're using has an empty title with a placeholder like "Click to add title"

    # create new slide with that layout, to copy contents to
    new_slide = pasteIntoPres.slides.add_slide(slide_layout)
    # create dict for all the images it could find
    imgDict = {} # entries will be generated if the pptx has images
    for shp in slide_to_copy.shapes:
        # Searching for images to not get a corrupt file in the end
        if 'Picture' in shp.name:
            # save image
            with open(shp.name + '.jpg', 'wb') as f:
                # Saving it temporarily
                f.write(shp.image.blob)
            # add image to dict
            imgDict[shp.name + '.jpg'] = [shp.left, shp.top, shp.width, shp.height]
        else:
            # create copy of elem
            el = shp.element
            newel = copy.deepcopy(el)
            # add elem to shape tree
            new_slide.shapes._spTree.insert_element_before(newel, 'p:extLst')
    # the following is from the guy on Stackoverflow:
    # things added first will be covered by things added last
    # => since I want pictures to be in foreground, I will add them after others elements
    # you can change this if you want to add pictures
    for k, v in imgDict.items():
        new_slide.shapes.add_picture(k, v[0], v[1], v[2], v[3]) # Adding the picture again
        os.remove(k) # Removing the temp file it created
    new_slide.shapes.title.text = ' ' #todo it breaks if we delete this title box. We should find a way to delete it...
    return new_slide  # this returns the single slide so you can instantly work with it if you need to
555Russich commented 1 year ago

Just to clarify for others: I tried most of the solutions from this issue and from Stack Overflow, but none of them duplicates charts correctly. The code by @Mike3285 and @lthamm does a good job except for charts.

Workaround for me: build a .pptx by hand with a lot of template slides, fill in the templates I need and delete the unused slides.
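
The "delete unused slides" step has no public API in python-pptx as far as I know, so the sketch below leans on private attributes (`slides._sldIdLst`, the `rId` of each slide id element) together with `part.drop_rel`. Treat it as an assumption-laden workaround, not as the library's supported way; the helper name `keep_only` is made up for this illustration:

```python
def keep_only(prs, keep_indexes):
    """Remove every slide whose 0-based index is not in keep_indexes."""
    keep = set(keep_indexes)
    sld_id_lst = prs.slides._sldIdLst  # private: the slide id list element
    # Walk backwards so the remaining indexes stay valid while removing.
    for index in reversed(range(len(sld_id_lst))):
        if index in keep:
            continue
        sld_id = sld_id_lst[index]
        prs.part.drop_rel(sld_id.rId)  # drop the presentation-to-slide relationship
        sld_id_lst.remove(sld_id)      # drop the slide from the slide list
```

With a template deck, the idea would be to fill in the slides you need and then call something like `keep_only(prs, [0, 3, 4])` before saving.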

Dasc3er commented 1 year ago

Hi all, I created a small set of utilities to clone a chart completely, which work for my simple use cases. These functions read the PPTX XML structure to replicate the styling and colors. As of now I have yet to find issues, but I am happy to improve on them. GIST with the aggregated utilities: https://gist.github.com/Dasc3er/2af5069afb728c39d54434cb28a1dbb8

This is the way to use them:

dest = slide  # destination slide; clone_chart adds the chart through dest.shapes
graphical_frame = ...  # pptx GraphicFrame object containing the chart
result = clone_chart(graphical_frame, dest)

And these are the utilities:

from typing import Union

import pandas as pd

def chart_to_dataframe(graphical_frame) -> pd.DataFrame:
    """
    Helper to parse chart data to a DataFrame.

    :source: https://openpyxl.readthedocs.io/en/stable/pandas.html

    :param graphical_frame:
    :return:
    """
    from openpyxl import load_workbook

    from io import BytesIO
    wb = load_workbook(BytesIO(graphical_frame.chart.part.chart_workbook.xlsx_part.blob), read_only=True)

    ws = wb.active

    from itertools import islice
    import pandas as pd
    data = ws.values
    cols = next(data)[1:]
    data = list(data)
    idx = [r[0] for r in data]
    data = (islice(r, 1, None) for r in data)
    df = pd.DataFrame(data, index=idx, columns=cols)

    return df

def dataframe_to_chart_data(df):
    """
    Transforms a DataFrame to a CategoryChartData for PPT compilation.

    The indexes of the DataFrame are the categories, with each column becoming a series.

    :param df:
    :return:
    """
    from pptx.chart.data import CategoryChartData
    import numpy as np

    copy_data = CategoryChartData()
    copy_data.categories = df.index.astype(str).to_list()

    edge_cases = 0
    for c in df.columns:
        series_data = df[c].copy()
        fixed_series_data = series_data.replace([np.inf, -np.inf, np.nan], None)

        edge_cases = edge_cases + np.count_nonzero(fixed_series_data != series_data)

        copy_data.add_series(str(c), fixed_series_data.to_list())

    # Warning over data filled for compatibility
    if edge_cases > 0:
        import warnings
        warnings.warn("Series data containing NaN/INF values: filled to empty")

    return copy_data

def clone_chart(graphical_frame, dest):
    """
    Helper to clone a chart with related styling.

    :param graphical_frame:
    :param dest:
    :return:
    """
    chart = graphical_frame.chart

    df = chart_to_dataframe(graphical_frame)
    chart_data = dataframe_to_chart_data(df)

    new_chart = dest.shapes.add_chart(
        chart.chart_type,
        graphical_frame.left,
        graphical_frame.top,
        graphical_frame.width,
        graphical_frame.height,
        chart_data
    )

    # Fix offset for Graphical shape
    import copy
    cur_el = new_chart._element.xpath(".//p:nvGraphicFramePr")[0]
    ref_el = graphical_frame._element.xpath(".//p:nvGraphicFramePr")[0]
    parent = cur_el.getparent()
    parent.insert(
        parent.index(cur_el) + 1,
        copy.deepcopy(ref_el)
    )
    parent.remove(cur_el)

    # Clone styling from old chart to new one
    from random import randrange
    from lxml import etree
    from pptx.oxml import parse_xml

    id_attribute = '{http://schemas.openxmlformats.org/officeDocument/2006/relationships}id'

    old_chart_ref_id = graphical_frame.element.xpath(".//c:chart")[0].attrib[id_attribute]
    chart_ref_id = new_chart.element.xpath(".//c:chart")[0].attrib[id_attribute]

    new_chart_part = new_chart.part.rels._rels[chart_ref_id].target_part
    old_chart_part = graphical_frame.part.rels._rels[old_chart_ref_id].target_part

    chart_data_reference_id = new_chart_part._element.xpath(".//c:externalData")[0].attrib[id_attribute]

    cloned_styling = copy.deepcopy(old_chart_part._element)
    cloned_styling.xpath(".//c:externalData")[0].set(id_attribute, chart_data_reference_id)
    cloned_styling.xpath(".//c:autoUpdate")[0].set("val", "1")
    new_chart_part.part._element = cloned_styling

    # Parse other relationships of the chart
    from pptx.opc.constants import CONTENT_TYPE as CT, RELATIONSHIP_TYPE as RT
    from pptx.opc.package import XmlPart

    class ColorsPart(XmlPart):
        partname_template = "/ppt/charts/colors%d.xml"

        @classmethod
        def new(cls, package, element):
            part = cls.load(
                package.next_partname(cls.partname_template),
                CT.OFC_CHART_COLORS,
                package,
                element,
            )
            return part

    class StylePart(XmlPart):
        partname_template = "/ppt/charts/style%d.xml"

        @classmethod
        def new(cls, package, element):
            part = cls.load(
                package.next_partname(cls.partname_template),
                CT.OFC_CHART_STYLE,
                package,
                element,
            )
            return part

    new_chart_refs = new_chart_part.rels
    old_chart_refs = old_chart_part.rels

    # Fix styling and colors applied to the new chart
    for k, v in dict(old_chart_refs._rels).items():
        if v.reltype == 'http://schemas.microsoft.com/office/2011/relationships/chartStyle':
            targ = v.target_part

            new_el = parse_xml(copy.deepcopy(targ.blob))
            new_el.set("id", str(randrange(10 ** 5, 10 ** 9)))
            new_colors_ref = StylePart.new(targ.package, etree.tostring(new_el))
            new_chart_refs.get_or_add("http://schemas.microsoft.com/office/2011/relationships/chartStyle",
                                      new_colors_ref)
        elif v.reltype == RT.CHART_COLOR_STYLE:
            targ = v.target_part

            new_el = parse_xml(copy.deepcopy(targ.blob))
            new_el.set("id", str(randrange(10 ** 5, 10 ** 9)))
            new_colors_ref = ColorsPart.new(targ.package, etree.tostring(new_el))
            new_chart_refs.get_or_add(RT.CHART_COLOR_STYLE, new_colors_ref)

    return new_chart

Edit: fixes for python-pptx 0.6.22

Dasc3er commented 1 year ago

I also share the utilities I am using to duplicate a slide, applying the previous chart cloning functions. These include a fix for creating a new slide, as internal file names could sometimes conflict when creating a new slide.

I do not ensure this works for all use cases. GIST with the aggregated utilities: https://gist.github.com/Dasc3er/2af5069afb728c39d54434cb28a1dbb8

This is the way to use them:

new_slide = duplicate_slide(ppt, 1)

And these are the utilities:

def _object_rels(obj):
    rels = obj.rels

    # Change required for python-pptx 0.6.22
    check_rels_content = [k for k in rels]
    if isinstance(check_rels_content.pop(), str):
        return [v for k, v in rels.items()]
    else:
        return [k for k in rels]

def _exp_add_slide(ppt, slide_layout):
    """
    Function to handle slide creation in the Presentation, to avoid issues caused by default implementation.

    :param slide_layout:
    :return:
    """

    def generate_slide_partname(self):
        """Return |PackURI| instance containing next available slide partname."""
        from pptx.opc.packuri import PackURI

        sldIdLst = self._element.get_or_add_sldIdLst()

        existing_rels = [k.target_partname for k in _object_rels(self)]
        partname_str = "/ppt/slides/slide%d.xml" % (len(sldIdLst) + 1)

        while partname_str in existing_rels:
            import random
            import string

            random_part = ''.join(random.choice(string.ascii_letters) for i in range(2))
            partname_str = "/ppt/slides/slide%s%d.xml" % (random_part, len(sldIdLst) + 1)

        return PackURI(partname_str)

    def add_slide_part(self, slide_layout):
        """
        Return an (rId, slide) pair of a newly created blank slide that
        inherits appearance from *slide_layout*.
        """
        from pptx.opc.constants import RELATIONSHIP_TYPE as RT
        from pptx.parts.slide import SlidePart

        partname = generate_slide_partname(self)
        slide_layout_part = slide_layout.part
        slide_part = SlidePart.new(partname, self.package, slide_layout_part)
        rId = self.relate_to(slide_part, RT.SLIDE)
        return rId, slide_part.slide

    def add_slide_ppt(self, slide_layout):
        rId, slide = add_slide_part(self.part, slide_layout)
        slide.shapes.clone_layout_placeholders(slide_layout)
        self._sldIdLst.add_sldId(rId)
        return slide

    # slide_layout = self.get_master_slide_layout(slide_layout)
    return add_slide_ppt(ppt.slides, slide_layout)

def copy_shapes(source, dest):
    """
    Helper to copy shapes handling edge cases.

    :param source:
    :param dest:
    :return:
    """
    from pptx.shapes.group import GroupShape
    import copy

    # Copy all existing shapes
    for shape in source:
        if isinstance(shape, GroupShape):
            group = dest.shapes.add_group_shape()
            group.name = shape.name
            group.left = shape.left
            group.top = shape.top
            group.width = shape.width
            group.height = shape.height
            group.rotation = shape.rotation

            # Recursive copy of contents
            copy_shapes(shape.shapes, group)

            # Fix offset
            cur_el = group._element.xpath(".//p:grpSpPr")[0]
            ref_el = shape._element.xpath(".//p:grpSpPr")[0]
            parent = cur_el.getparent()
            parent.insert(
                parent.index(cur_el) + 1,
                copy.deepcopy(ref_el)
            )
            parent.remove(cur_el)

            result = group
        elif hasattr(shape, "image"):
            import io

            # Get image contents
            content = io.BytesIO(shape.image.blob)
            result = dest.shapes.add_picture(
                content, shape.left, shape.top, shape.width, shape.height
            )
            result.name = shape.name
            result.crop_left = shape.crop_left
            result.crop_right = shape.crop_right
            result.crop_top = shape.crop_top
            result.crop_bottom = shape.crop_bottom
        elif hasattr(shape, "has_chart") and shape.has_chart:
            from .charts import clone_chart
            result = clone_chart(shape, dest)
        else:
            import copy

            newel = copy.deepcopy(shape.element)
            dest.shapes._spTree.insert_element_before(newel, "p:extLst")
            result = dest.shapes[-1]

def duplicate_slide(ppt, slide_index: int):
    """
    Duplicate the slide with the given number in presentation.
    Adds the new slide by default at the end of the presentation.

    :param ppt:
    :param slide_index: Slide number
    :return:
    """
    source = ppt.slides[slide_index]

    dest = _exp_add_slide(ppt, source.slide_layout)

    # Remove all shapes from the default layout
    # (iterate over a copy, since removing while iterating would skip shapes)
    for shape in list(dest.shapes):
        remove_shape(shape)

    # Copy all existing shapes
    copy_shapes(source.shapes, dest)

    # Copy the notes text, if any
    if source.has_notes_slide:
        txt = source.notes_slide.notes_text_frame.text
        dest.notes_slide.notes_text_frame.text = txt

    return dest

def remove_shape(shape):
    """
    Helper to remove a specific shape.

    :source: https://stackoverflow.com/questions/64700638/is-there-a-way-to-delete-a-shape-with-python-pptx

    :param shape:
    :return:
    """
    el = shape.element  # --- get reference to XML element for shape
    el.getparent().remove(el)  # --- remove that shape element from its tree

Edit: fixes for python-pptx 0.6.22

MartinPacker commented 1 year ago

Nice @Dasc3er. You're lucky that "all" you had to do is manipulate XML. If you had to manipulate other parts it would've been tough.

It occurs to me that an open source project that adds a "companion module" to python-pptx could work. python-pptx would then only need to be fixed to support e.g. new Python releases.

KhaledTETAH commented 1 year ago

Hello, I ran into issues when trying to copy some elements from slide to slide using the code below. I think this function cannot handle copying elements of type in the Open XML of the file. Any suggestions, please?

def _get_blank_slide_layout(pres):
    layout_items_count = [len(layout.placeholders) for layout in pres.slide_layouts]
    min_items = min(layout_items_count)
    blank_layout_id = layout_items_count.index(min_items)
    return pres.slide_layouts[blank_layout_id]

def duplicate_slide(pres, index):
    """Duplicate the slide with the given index in pres.

    Adds slide to the end of the presentation"""
    source = pres.slides[index]

    blank_slide_layout = _get_blank_slide_layout(pres)
    dest = pres.slides.add_slide(blank_slide_layout)

    for shp in source.shapes:
        el = shp.element
        newel = copy.deepcopy(el)
        dest.shapes._spTree.insert_element_before(newel, 'p:extLst')

    for key, value in six.iteritems(source.rels):
        # Make sure we don't copy a notesSlide relation as that won't exist
        if "notesSlide" not in value.reltype:
            dest.rels.add_relationship(value.reltype, value._target, value.rId)

    return dest

Rohith-Prem commented 11 months ago

I had to machine translate @Lirioooo's response, but it seemed to suggest checking for .xml file endings in the target ref to avoid a corrupted file when opening a modified file. This did not work for me. I unzipped the output file and, logically, the slide relations don't contain the .xml relations, yet those are required for e.g. the chart to work. My current solution is to downgrade to version 0.6.19, after which the suggested solution should work.

With the current release (version 0.6.21) I did not come up with a working solution. I managed to adapt some parts, but the output is still a broken file. Here is my current try:

def pptx_copy_slide(pres: pptx.Presentation, source: pptx.slide.Slide):
    dest = pres.slides.add_slide(source.slide_layout)
    for shape in dest.shapes:
        shape.element.getparent().remove(shape.element)

    for shape in source.shapes:
        new_shape = copy.deepcopy(shape.element)
        dest.shapes._spTree.insert_element_before(new_shape, 'p:extLst')

    for rel in source.part.rels:
        target = rel._target

        if "notesSlide" in rel.reltype:
            continue

        if 'chart' in rel.reltype:
            # https://github.com/scanny/python-pptx/issues/132#issuecomment-414001942
            partname = target.package.next_partname(pptx.parts.chart.ChartPart.partname_template)
            xlsx_blob = target.chart_workbook.xlsx_part.blob
            target = pptx.parts.chart.ChartPart(
                partname = partname, 
                content_type = target.content_type, 
                element = copy.deepcopy(target._element),
                package=target.package)
            target.chart_workbook.xlsx_part = pptx.parts.chart.EmbeddedXlsxPart.new(
                blob=xlsx_blob, 
                package=target.package)

        if rel.is_external:
            dest.part.rels.get_or_add_ext_rel(rel.reltype, rel._target)
        else:
            dest.part.rels.get_or_add(rel.reltype, rel._target)

    return dest

Ideas why this is not working would be appreciated.

Honestly, I think a different approach has some benefits. When you unzip the PowerPoint file, duplicating a slide can be done fairly easily, I think:

  1. Duplicate the slide file and rename it
  2. Duplicate the relation file and rename it
  3. Link the slide in the presentation relations ("rels/presentation.xml.rels")

This would be fairly easy to do in Python and you would account for all cases (e.g. no issues with charts or diagrams). The only problem is that you would need to create a temp file and read it in again.
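The three steps above can be sketched directly on the zip package with the standard library. This is only a rough sketch under several assumptions: `duplicate_slide_in_package` is a hypothetical helper, `slide_no` is the number in the part name (`slide3.xml`), not the position in the deck, the XML parts are UTF-8 and use the usual `p:` and `r:` prefixes, and relationship ids have the form `rIdN`. In a real package the presentation relations live at `ppt/_rels/presentation.xml.rels`, and two further entries are needed besides the three steps: an `Override` in `[Content_Types].xml` and a `<p:sldId>` in `ppt/presentation.xml`. The sketch drops the notes-slide relationship from the copy so that the notes are not shared between two slides.

```python
import re
import zipfile

SLIDE_CONTENT_TYPE = "application/vnd.openxmlformats-officedocument.presentationml.slide+xml"
SLIDE_REL_TYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide"


def duplicate_slide_in_package(src_path, dst_path, slide_no):
    """Copy src_path to dst_path with ppt/slides/slide<slide_no>.xml duplicated.

    Returns the file number of the new slide, which is appended at the end.
    """
    with zipfile.ZipFile(src_path) as zin:
        parts = {name: zin.read(name) for name in zin.namelist()}

    pres = parts["ppt/presentation.xml"].decode("utf-8")
    pres_rels = parts["ppt/_rels/presentation.xml.rels"].decode("utf-8")
    types = parts["[Content_Types].xml"].decode("utf-8")

    # Next free slide file number, relationship id and slide id.
    slide_numbers = [int(n) for name in parts
                     for n in re.findall(r"^ppt/slides/slide(\d+)\.xml$", name)]
    new_no = max(slide_numbers) + 1
    new_rid = "rId%d" % (max(map(int, re.findall(r'Id="rId(\d+)"', pres_rels))) + 1)
    new_sld_id = max(map(int, re.findall(r'<p:sldId\b[^>]*\bid="(\d+)"', pres))) + 1

    # Step 1: duplicate the slide part under a new name.
    new_name = "ppt/slides/slide%d.xml" % new_no
    parts[new_name] = parts["ppt/slides/slide%d.xml" % slide_no]

    # Step 2: duplicate its relationships, minus the notes slide.
    src_rels = "ppt/slides/_rels/slide%d.xml.rels" % slide_no
    if src_rels in parts:
        rels = parts[src_rels].decode("utf-8")
        rels = re.sub(r'<Relationship\b[^>]*/notesSlide"[^>]*/>', "", rels)
        parts["ppt/slides/_rels/slide%d.xml.rels" % new_no] = rels.encode("utf-8")

    # Step 3: link the new slide from the presentation part.
    pres_rels = pres_rels.replace(
        "</Relationships>",
        '<Relationship Id="%s" Type="%s" Target="slides/slide%d.xml"/></Relationships>'
        % (new_rid, SLIDE_REL_TYPE, new_no))
    pres = pres.replace(
        "</p:sldIdLst>",
        '<p:sldId id="%d" r:id="%s"/></p:sldIdLst>' % (new_sld_id, new_rid))
    # Bookkeeping: declare the content type of the new part.
    types = types.replace(
        "</Types>",
        '<Override PartName="/%s" ContentType="%s"/></Types>' % (new_name, SLIDE_CONTENT_TYPE))

    parts["ppt/presentation.xml"] = pres.encode("utf-8")
    parts["ppt/_rels/presentation.xml.rels"] = pres_rels.encode("utf-8")
    parts["[Content_Types].xml"] = types.encode("utf-8")

    with zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as zout:
        for name, data in parts.items():
            zout.writestr(name, data)
    return new_no
```

Because the slide's relationship file is copied as a whole, charts and images on the duplicate point at the same chart and media parts as the original. That sidesteps the relationship problems discussed above, but it also means the two slides share one chart rather than each having an independent copy.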

Have you had any success with the proposed approach? @Ithamm I still can't get a clean file, and duplicating the same slide multiple times is just not working. :(

Mike3285 commented 9 months ago

The script I made was breaking if the presentation had hyperlinks. New and improved version: https://gist.github.com/Mike3285/a07978387c02f313ee39be665b9d44eb

Sample usage is at the bottom: just give it a list of paths to the presentations and it will merge them all into one.

rajeshm71 commented 9 months ago

The given solution works for copying slides from a single file. It does not work when I try to copy slides from multiple files, as it does not copy slide layouts from multiple files. It copies slide layouts only when I load a particular file as the output presentation, and I am able to load only a single file as the output presentation. Is there any way to copy slides from multiple files, including their slide layouts and master styles?

Here is my version of code.

import copy
import os

from pptx import Presentation


def merge_presentations_list(path, outputPres, slide_list):
    """
    Merge a list of PowerPoint presentations into a single presentation.
    """
    templatePres = Presentation(path)
    for i in range(len(templatePres.slides)):
        if i in slide_list:
            move_slide(templatePres, i, outputPres)
    return outputPres

def move_slide(copyFromPres, slideIndex, pasteIntoPres):
    """
    Copy a slide from one presentation to another.
    """
    slide_to_copy = copyFromPres.slides[slideIndex]
    slide_layout = slide_to_copy.slide_layout  # This has no effect if I load empty presentation in output
    new_slide = pasteIntoPres.slides.add_slide(slide_layout)

    for shp in slide_to_copy.shapes:
        if 'Picture' in shp.name:
            # Round-trip the image through a temporary file
            with open(shp.name + '.jpg', 'wb') as f:
                f.write(shp.image.blob)
            new_slide.shapes.add_picture(shp.name + '.jpg', shp.left, shp.top, shp.width, shp.height)
            os.remove(shp.name + '.jpg')
        else:
            print(shp.name)
            el = shp.element
            newel = copy.deepcopy(el)
            new_slide.shapes._spTree.insert_element_before(newel, 'p:extLst')

    return new_slide

def create_strawman_presentation(filenames, slidelists, output_root=None):
    """
    Create a final PowerPoint presentation based on specified criteria.
    """
    final_name = 'sample_ppt.pptx'
    first_file = filenames[0]

    # This is where I am reading first file in output presentation.
    # I am able to copy all slide layouts of filtered slides from source file in output presentation.
    # I need a way to copy slide with slide layouts with master styles from all the files.
    outputPres = Presentation(first_file)

    for i in range(len(filenames)):
        source_file = filenames[i]
        slide_list = slidelists[i]
        outputPres = merge_presentations_list(path=source_file, outputPres=outputPres, slide_list=slide_list)

    # Fall back to the current directory when no output_root is given
    outputPres.save(os.path.join(output_root or '', final_name))
    return final_name

Mike3285 commented 9 months ago

> The given solution works for copying slides from a single file. It does not work when I try to copy slides from multiple files, as it does not copy slide layouts from multiple files. It copies slide layouts only when I load a particular file as the output presentation, and I am able to load only a single file as the output presentation. Is there any way to copy slides from multiple files, including their slide layouts and master styles?

Did you even check my post? It does exactly what you ask.

rajeshm71 commented 9 months ago

> > The given solution works for copying slides from a single file. It does not work when I try to copy slides from multiple files, as it does not copy slide layouts from multiple files. It copies slide layouts only when I load a particular file as the output presentation, and I am able to load only a single file as the output presentation. Is there any way to copy slides from multiple files, including their slide layouts and master styles?
>
> Did you even check my post? It does exactly what you ask.

Yes, I have used your code. It was not working as expected for me, so I made some modifications as given above. I suppose your code copies the content of the slides without copying slide layouts and styles. It loads a single file as the output presentation and applies the styles and layouts of that presentation to all the other presentations. I would like to copy slides together with their slide layouts and styles.
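A partial workaround, assuming the output presentation already contains a layout with the same name as the source slide's layout, is to look that layout up by name and fall back to a default otherwise. This is only a sketch: `find_layout_by_name` is a hypothetical helper, and it does not copy layouts, masters or themes between files.

```python
def find_layout_by_name(layouts, name, fallback_index=0):
    """Return the first layout whose .name equals `name`, else layouts[fallback_index].

    `layouts` is any sequence of objects with a .name attribute, for example
    the `slide_layouts` collection of a python-pptx Presentation.
    """
    for layout in layouts:
        if layout.name == name:
            return layout
    return layouts[fallback_index]
```

In `move_slide` this could stand in for the layout lookup, e.g. `find_layout_by_name(pasteIntoPres.slide_layouts, slide_to_copy.slide_layout.name)`. The copied slide then at least uses a layout of the same kind, styled by the output presentation's master.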

mszbot commented 8 months ago

Office.js provides this functionality. As does powerpointgeneratorapi.com.

lasagar commented 6 months ago

@rajeshm71 I am trying to copy a slide from a template to a new slide. Can you share the git link to your code?

Mike3285 commented 5 days ago

Hello everyone, I have a new version of the function that works on presentations with images, text and even tables. I have not yet used it on a presentation with graphs, so maybe you could test that.

Here is my code:

def merge_presentations_list(paths: list, final_name: str = None):
    outputPres = Presentation(paths[0])

    for path in paths[1:]:
        templatePres = Presentation(path)
        for i in range(len(templatePres.slides)):  # We iterate over all the slides
            move_slide(templatePres, i, outputPres)

    if final_name:
        outputPres.save(final_name)
    return outputPres

def get_links_from_table(table):
    """This function saves to a dict in memory the data about table cells for later retrieval
        Since the cells are always read from left to right and top to bottom, we can use enumerate to use their index as an ID
        So we associate each hyperlink with the Idx of the cell where we found it
    """
    links = {}
    for idx, cell in enumerate(table.iter_cells()):
        if hasattr(cell, 'text_frame'):
            for paragraph in cell.text_frame.paragraphs:
                # Skip empty paragraphs: paragraph.runs[0] would raise IndexError
                if paragraph.runs:
                    r = paragraph.runs[0]
                    if r.hyperlink.address:
                        links.update({idx: r.hyperlink.address})
    return links

def rewrite_hyperlinks_in_table(shape, table_hlink_data):
    """
    Rewrites hyperlinks in the table cells of a shape.

    Parameters:
    shape (Shape): The shape containing a table whose hyperlinks need to be restored.
    table_hlink_data (dict): A dictionary containing hyperlink data for table cells. 
                             The keys correspond to cell indices, and the values are the hyperlink addresses.

    Note: This function matches the cell index with stored hyperlink data and re-applies the hyperlinks to the cells.
    """
    if shape.name in table_hlink_data:
        data = table_hlink_data[shape.name]
        for idx, cell in enumerate(shape.table.iter_cells()):
            if hasattr(cell, 'text_frame'):
                for paragraph in cell.text_frame.paragraphs:
                    # Re-assign the hyperlink if the cell index matches the saved data
                    # (paragraphs without runs are skipped to avoid an IndexError)
                    if paragraph.runs and idx in data:
                        paragraph.runs[0].hyperlink._add_hlinkClick(data[idx])

def move_slide(copyFromPres: Presentation, slideIndex: int, pasteIntoPres: Presentation):
    """
    Copies a slide from one presentation to another.

    Parameters:
    copyFromPres (Presentation): The source presentation to copy from.
    slideIndex (int): The index of the slide to be copied from the source presentation.
    pasteIntoPres (Presentation): The target presentation to paste the copied slide into.

    Returns:
    Slide: The newly created slide in the target presentation.

    Note: The layout of the new slide in the target presentation is based on the first available layout. 
    The function handles copying text, images, and hyperlinks. Special handling is required for tables and
    their hyperlinks, which are reconstructed separately.
    """

    # Copy the slide at the specified index
    slide_to_copy = copyFromPres.slides[slideIndex]

    # Use the first available layout in the target presentation
    slide_layout = pasteIntoPres.slide_layouts[0]

    # Create a new slide in the target presentation with the selected layout
    new_slide = pasteIntoPres.slides.add_slide(slide_layout)

    # Dictionary to store image data for re-adding after other elements
    imgDict = {}

    # Variables for handling hyperlinks
    haddress = None
    shape_w_hyperlink_id = None
    table_hlink_data = {}

    # Iterate through all shapes in the original slide
    for shp in slide_to_copy.shapes:

        # Handle images by saving them temporarily to avoid corrupt files
        if 'Picture' in shp.name:
            with open(shp.name + '.jpg', 'wb') as f:
                f.write(shp.image.blob)
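            imgDict[shp.name + '.jpg'] = [shp.left, shp.top, shp.width, shp.height]
            # Pictures are re-added from imgDict below; skip the element copy,
            # otherwise `el` would be undefined or left over from the previous shape
            continue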

        else:
            # Copy the shape element
            el = shp.element

            # Handle text and potential hyperlinks in text
            if hasattr(shp, 'text_frame'):
                text_frame = shp.text_frame
                p = text_frame.paragraphs[0]

                # Check if the shape contains a hyperlink
                if p.runs:
                    r = p.runs[0]
                    shape_w_hyperlink_id = shp.shape_id
                    haddress = r.hyperlink.address

        # Handle tables and store their hyperlink data for later reconstruction
        if shp.has_table:
            table_hlink_data.update({shp.name: get_links_from_table(shp.table)})

        # Insert the copied shape into the new slide
        new_slide.shapes._spTree.insert_element_before(el, 'p:extLst')

        # Re-assign the hyperlink to the copied shape if it exists
        if haddress:
            for shape in new_slide.shapes:
                if shape.shape_id == shape_w_hyperlink_id:
                    try:
                        shape.text_frame.paragraphs[0].runs[0].hyperlink._add_hlinkClick(haddress)
                    except Exception as e:
                        message = f"Failed to move the hyperlink with id {shape_w_hyperlink_id}: {e}"
                        print(message)

    # Handle table hyperlinks by reconstructing them in the copied slide
    for shape in new_slide.shapes:
        if shape.has_table:
            rewrite_hyperlinks_in_table(shape, table_hlink_data)

    # Add images back to the new slide after copying all other elements
    for k, v in imgDict.items():
        new_slide.shapes.add_picture(k, v[0], v[1], v[2], v[3])
        os.remove(k)  # Remove the temporary image file

    # Handle potential issues with the title shape
    try:
        new_slide.shapes.title.text = ' '
    except Exception as e:
        print(f"Warning, unable to set title text: {e}")

    return new_slide
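The temporary `.jpg` files in `move_slide` can probably be avoided by keeping the image bytes in memory, as the `io.BytesIO` branch earlier in this thread does. A minimal sketch, assuming only that picture-like shapes expose `.image.blob` and the usual position attributes; `picture_payloads` is a hypothetical helper, not part of python-pptx:

```python
import io


def picture_payloads(shapes):
    """Collect (stream, left, top, width, height) for every shape exposing an image.

    Shapes are detected by their `.image` attribute rather than by a name
    containing "Picture", so renamed pictures are picked up as well.
    """
    payloads = []
    for shp in shapes:
        image = getattr(shp, "image", None)  # absent on non-picture shapes
        if image is None:
            continue
        stream = io.BytesIO(image.blob)  # in-memory file-like object
        payloads.append((stream, shp.left, shp.top, shp.width, shp.height))
    return payloads
```

Each tuple can then be passed straight to `new_slide.shapes.add_picture(stream, left, top, width, height)`, which accepts a file-like object as well as a path, so nothing is written to disk and no cleanup with `os.remove` is needed.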