scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License

Embed HTML into PPTx #777

Open vctsenthil opened 2 years ago

vctsenthil commented 2 years ago

Hi Scanny,

Is there a way to embed an HTML file into a PPTX? If I pass the HTML to the add_ole_object() method, it is stored internally as a .bin file and I am not able to open it. I am also not sure what the PROG_ID for HTML would be. Any help or suggestion would be greatly appreciated.

-Senthil

MartinPacker commented 2 years ago

Would Markdown be a suitable alternative?

vctsenthil commented 2 years ago

Hi Martin, thanks for the quick reply. My HTML contains 3D content (WebGL data), and I am not sure it is possible to handle that through Markdown. All HTML5-capable browsers support WebGL, so I want the HTML embedded in the PPTX so that it can be viewed in a browser while going through the presentation. I hope that clarifies my requirement.

MartinPacker commented 2 years ago

Yes, it did clarify things. Thanks! And it saves me from unnecessarily spamming you with an open source project of mine.

Delengowski commented 2 years ago

So PowerPoint obviously cannot render HTML, but I think you know that. The only way to get this to work is to enable a feature that I think Microsoft themselves deprecated, defaulted to off, and/or removed completely. It requires using COM to add some sort of Internet Explorer object to the pptx file that will run and do the rendering. I believe this idea was scrapped a long time ago due to security concerns: you open a PowerPoint file with this embedded, it automatically renders a webpage that contains malicious JavaScript, and all sorts of bad things happen.

Have you been able to embed the webpage through PowerPoint directly?

vctsenthil commented 2 years ago

Hello Melendowski, thanks for your response. I don't need the COM interface for HTML and don't want the HTML to be opened inside PowerPoint. I just want to insert (embed) an HTML file in one of the slides so that when I click it, it opens in the default browser. I am able to do this manually by dragging and dropping the HTML file into PowerPoint, as shown in the screenshot below.

python-pptx 0.6.21 allows embedding .docx, .xlsx, and .pptx files into a PPTX file. I would like to have a similar solution for embedding HTML.

[Screenshot: embed-html-pptx]

MartinPacker commented 2 years ago

What is it about HTML you want? I'm assuming a .PNG rendering of an HTML page won't do it for you.

viseshrp commented 2 years ago

@MartinPacker I think my requirement is a little different from that of OP. I would like to embed HTML content inside the PPTX directly rather than link to a file that opens in a browser. For example:

<h5>My title</h5>
<p>My paragraph</p>

This must render a title and a paragraph inside a slide.
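
As far as this thread establishes, python-pptx has no HTML renderer, so one way to approximate the request is to parse very simple HTML and write each block as a paragraph in a text box. The following is a minimal sketch under that assumption: `BlockExtractor`, `extract_blocks`, and `add_blocks_to_slide` are hypothetical helper names, only headings and `<p>` elements are handled, and mapping headings to bold text is an arbitrary choice.

```python
from html.parser import HTMLParser


class BlockExtractor(HTMLParser):
    """Collect (tag, text) pairs for headings and paragraphs only."""

    BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._tag = None      # block tag currently open, if any
        self._chunks = []     # text pieces seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._tag = tag
            self._chunks = []

    def handle_data(self, data):
        if self._tag is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse internal whitespace before storing the block.
            text = " ".join("".join(self._chunks).split())
            self.blocks.append((tag, text))
            self._tag = None


def extract_blocks(html):
    """Return a list of (tag, text) tuples found in the HTML string."""
    parser = BlockExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


def add_blocks_to_slide(slide, blocks, left, top, width, height):
    """Write each block as one paragraph in a new text box (headings bold).

    `slide` is a python-pptx slide; the four lengths are python-pptx
    Length values such as Inches(1).
    """
    box = slide.shapes.add_textbox(left, top, width, height)
    frame = box.text_frame
    frame.word_wrap = True
    for index, (tag, text) in enumerate(blocks):
        # A new text frame already contains one empty paragraph.
        paragraph = frame.paragraphs[0] if index == 0 else frame.add_paragraph()
        run = paragraph.add_run()
        run.text = text
        run.font.bold = tag.startswith("h")
    return box
```

Calling `extract_blocks()` on the two-line example above gives one heading block and one paragraph block, which `add_blocks_to_slide()` can then place on any slide. Nested markup, lists, images, and styling are ignored in this sketch.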

MartinPacker commented 2 years ago

I assume Markdown to PowerPoint isn't an adequate substitute.

viseshrp commented 2 years ago

> I assume Markdown to PowerPoint isn't an adequate substitute.

Thanks for the quick response. I think I can live with that by converting the HTML to Markdown and then adding it to PowerPoint. How would I go about adding Markdown using python-pptx?

scanny commented 2 years ago

If you can do this by hand in PowerPoint using the Insert Object feature, you should be able to do it in python-pptx. You will need the OLE "prog-id" and an icon image file.

The prog-id can be discovered using shape.ole_format.prog_id on an embedded OLE shape.

Note that most embedded OLE objects only work on Windows, and if you can't add and open such a shape by hand on Windows then using python-pptx for the embedding is not going to change that.

MartinPacker commented 2 years ago

@viseshrp Converting from HTML to Markdown might be problematic, because md2pptx (the open source project of mine I was alluding to) only handles a subset of Markdown, and certainly doesn't handle most HTML embedded in it.

If the HTML is very simple it might well work, but arbitrary HTML will not. (The relevance of offering md2pptx here is that it is built on python-pptx.)

I didn't, and still don't, know the source of your HTML. If you were generating it yourself, that would be different; we could have had a discussion about emitting Markdown instead, as a data stream with widespread uses.

viseshrp commented 2 years ago

I'll take a look. Thank you very much! @MartinPacker