scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License
Placeholder: Support content placeholders #333

Open mcdevitts opened 6 years ago

mcdevitts commented 6 years ago

PowerPoint has a Content placeholder that allows a user to insert pictures, charts, etc. Currently, python-pptx does not seem to recognize these placeholders and just uses the generic SlidePlaceholder. This SlidePlaceholder does not support the insertion of pictures, charts, etc.

From a cursory glance and a quick trial, it looks like simply adding the methods from ChartPlaceholder, PicturePlaceholder, and TablePlaceholder to SlidePlaceholder enables the desired functionality on Content placeholders.

mcdevitts commented 6 years ago

For what it's worth, I got this working in my fork of python-pptx (located here) by adding this code.

I'm not familiar enough with python-pptx or the pptx format in general to be able to write the necessary documentation or test cases to validate this. If anyone is up for helping, I'd love to be able to integrate this capability into python-pptx.

mszbot commented 5 years ago

@mcdevitts how far did you get with this functionality?

mcdevitts commented 5 years ago

I implemented it in my fork. It's rather simple and seems to work quite well (for images at least).

I haven't submitted a pull request because I don't have the necessary documentation or test cases that @scanny requires. I also don't really know how to get started with that. If anyone wants to help, I'd love to get this into the main branch of python-pptx.

For my own usage, I just always monkey patch python-pptx (example).

bersbersbers commented 4 years ago

The monkey patch works nicely for me, and I would like to see it integrated into the code base so that I won't have to maintain a duplicate portion of code.

mcdevitts commented 4 years ago

@bersbersbers I'd love to integrate it. However, the last time I looked into this, I was intimidated by the necessary unit tests and the requisite knowledge of the underlying PowerPoint format. I would need substantial help to do this.

bersbersbers commented 4 years ago

I probably won't be able to help with that. Maybe, though, it helps to know that I was able to boil it down to this, and it seems to be working in my project:

from pptx.shapes.shapetree import PicturePlaceholder, SlidePlaceholder
SlidePlaceholder.insert_picture = PicturePlaceholder.insert_picture
SlidePlaceholder._new_placeholder_pic = PicturePlaceholder._new_placeholder_pic
SlidePlaceholder._get_or_add_image = PicturePlaceholder._get_or_add_image

So at least there is no need to duplicate code, and maybe less need to unit-test things when using a similar approach in the code base directly.
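
The same idea can be wrapped in a small helper. The sketch below is an illustration only: `borrow_methods` is a hypothetical name, not part of python-pptx, and the `Source` and `Target` classes are stand-ins so that the example runs without the library. It copies the listed attributes from one class onto another without duplicating their code.

```python
import inspect


def borrow_methods(target_cls, source_cls, names):
    """Attach the named attributes of `source_cls` to `target_cls`.

    `inspect.getattr_static` fetches the raw function (or descriptor) without
    binding it, so the method binds to `target_cls` instances when called.
    """
    for name in names:
        setattr(target_cls, name, inspect.getattr_static(source_cls, name))


# Stand-in classes (hypothetical), used only to show the mechanism.
class Source:
    def describe(self):
        return "I am " + self.label


class Target:
    label = "a target"


borrow_methods(Target, Source, ["describe"])
print(Target().describe())
```

With python-pptx, the call would presumably mirror the patch above, e.g. `borrow_methods(SlidePlaceholder, PicturePlaceholder, ["insert_picture", "_new_placeholder_pic", "_get_or_add_image"])`. Because it relies on private methods, it may break between releases.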

sleepyhollo commented 1 year ago

@mcdevitts are you still interested in doing this?

My current workaround is manually instantiating the type of Placeholder I want from the SlidePlaceholder object. I have only tried it for tables.

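from pptx.shapes.placeholder import TablePlaceholder

# Re-wrap the content placeholder's XML element as a TablePlaceholder
content = slide1.placeholders[1]
table_place = TablePlaceholder(content._sp, content._parent)
graphic_frame = table_place.insert_table(rows=4, cols=4)
table = graphic_frame.table

npiper commented 5 months ago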

I ran into this via another utility that uses this library, pptx2md.

Is there a way to 'hack' the source slide to remove the content placeholders? Is it related to the slide master?

Error:

File "/usr/local/lib/python3.9/site-packages/pptx/shapes/base.py", line 153, in placeholder_format
raise ValueError("shape is not a placeholder")
ValueError: shape is not a placeholder

MartinPacker commented 5 months ago

Placeholders must be in the slide master. I don't see how you can remove them in python-pptx. I'm guessing you want to know how to remove them from any export you might do.

scanny commented 5 months ago

While placeholders can live in the slide master, they are predominantly found in a slide layout. Each slide layout can "inherit" certain placeholders from the slide master (at slide-layout creation time): I think the title and footer placeholders, and perhaps usually a single body placeholder. But in general the diversity of placeholders in slide layouts is much higher.

The placeholders in a slide layout are "prototypes" in the sense that they are copied to the slide when a slide is created from a layout. Any change of position made to the placeholder in the slide layout will not cause corresponding placeholders in existing slides to change, and vice versa. I do believe certain other characteristics are inherited if not explicitly changed on the slide placeholder (font maybe, paragraph properties, etc.), but it's been a long time. I expect the design documentation has some useful insight into that.

In any case, this means there are three kinds of "deleting a placeholder":

  1. Deleting a slide shape copied from a slide-layout when the slide was created (slide placeholder).
  2. Deleting a placeholder from a slide-layout (such that new slides created from that layout do not get a corresponding shape).
  3. Deleting a placeholder from the slide-master.

In general, our design assumption was that 2 and 3 would be done by hand using PowerPoint (or whatever, LibreOffice etc.). We justify this by saying that programmatically creating PPTX templates is not a target use-case for python-pptx.

Anyway, the bottom line is that if you want to delete a placeholder of type 1, then just delete the shape and you're done. You have to do that yourself because there is no Shapes.remove(shape) method or whatever we might name such a callable. A big part of the reason is that a placeholder can in general contain "links" to a great many other things, like images, hyperlinks, charts, SmartArt, etc. So while a "simple-case" solution would be straightforward, a general-case solution would not be.
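
For the simple case (type 1), a commonly used approach is to detach the shape's XML element from its parent. The sketch below assumes the shape exposes its lxml element as `shape._element`, which is a private attribute and may change; `delete_shape` is a hypothetical helper name, not a python-pptx API. It does not clean up related parts (images, charts, hyperlinks), which is the general-case difficulty described above.

```python
def delete_shape(shape):
    """Remove a shape from its slide by detaching its XML element.

    Assumes `shape._element` is an lxml element (it provides `getparent()`).
    Relationships to images, charts, etc. are left behind untouched.
    """
    element = shape._element
    element.getparent().remove(element)


# Hypothetical usage with python-pptx (not executed here):
#
#     from pptx import Presentation
#     prs = Presentation("deck.pptx")          # "deck.pptx" is a placeholder path
#     slide = prs.slides[0]
#     for shape in list(slide.placeholders):   # copy first; we mutate the tree
#         delete_shape(shape)
#     prs.save("deck-without-placeholders.pptx")
```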

npiper commented 5 months ago

> Placeholders must be in the slide master. I don't see how you can remove them in python-pptx. I'm guessing you want to know how to remove them from any export you might do.

Use case I am looking at is converting slides from PPTX (Best effort) to export formats that could go into more developer friendly / maintainable doco (PPTX --> best effort of making a Markdown, Asciidoc with extracted images)

Project that uses the library: https://github.com/ssine/pptx2md

I was seeing if there is a way to work around this error by editing the source PPT, or if this error can be 'ignored' when it is not material to reading the rest of the slide deck.

scanny commented 5 months ago

@npiper okay, I looked a little further into the error message I think you're getting.

So some part of the pptx2md code is checking whether a shape has a .placeholder_format attribute:

File "/usr/local/lib/python3.9/site-packages/pptx2md/parser.py", line 237, in parse
if hasattr(shape, "placeholder_format"):
File "/usr/local/lib/python3.9/site-packages/pptx/shapes/base.py", line 153, in placeholder_format
raise ValueError("shape is not a placeholder")

It seems like this is an attempt to identify placeholder shapes, but there is a shape.is_placeholder attribute for that job. Inspecting attributes like this is a dependence on implementation details which are not tested and subject to change without notice.

I haven't traced through your code to see why you're checking for the attribute but you should instead test on .is_placeholder and only attempt access to .placeholder_format (which hasattr() does) when .is_placeholder is True.
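
The reason `hasattr()` fails here is that, in Python 3, it only suppresses `AttributeError`; the `ValueError` raised by the property propagates to the caller. A minimal sketch of the recommended check follows. `FakeShape` is a hypothetical stand-in that mimics the behavior shown in the traceback so the example runs without python-pptx, and `placeholder_format_or_none` is an illustrative helper name, not part of either library.

```python
class FakeShape:
    """Hypothetical stand-in for a python-pptx shape that is not a placeholder."""

    is_placeholder = False

    @property
    def placeholder_format(self):
        raise ValueError("shape is not a placeholder")


def placeholder_format_or_none(shape):
    """Return `shape.placeholder_format`, or None for non-placeholder shapes."""
    if shape.is_placeholder:
        return shape.placeholder_format
    return None


shape = FakeShape()

# hasattr() swallows only AttributeError, so the ValueError escapes.
try:
    hasattr(shape, "placeholder_format")
    hasattr_raised = False
except ValueError:
    hasattr_raised = True

print(hasattr_raised)
print(placeholder_format_or_none(shape))
```

Testing `.is_placeholder` first means the property is never evaluated for ordinary shapes, so no exception handling is needed in the calling code.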

Also, this is unrelated to the original post for this issue so in future please start a new one :)