scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License
2.28k stars, 503 forks

text direction: right to left #851

Closed eladlavi closed 1 year ago

eladlavi commented 1 year ago

In PowerPoint, under "Paragraph", it is possible to set a "Right-to-Left" text direction. How can this be done with python-pptx? This isn't text alignment, which is a separate setting. Without it, if my presentation contains Hebrew, the text gets messed up. Thanks.

eladlavi commented 1 year ago

To be even clearer, I am attaching a screenshot from PowerPoint that shows the functionality I'm looking to control from python-pptx. Does anyone have any idea how to set this? Is this a missing feature?

MartinPacker commented 1 year ago

The usual "what does the XML look like?" question applies.

I don't use RTL myself but I might be able to provide sample code - if I knew what the XML looked like.

(As just another user of python-pptx and with very little spare time at the moment I can't promise anything, I'm afraid.)

eladlavi commented 1 year ago

I would appreciate that a lot. Can you provide some information about working with the XML so I can investigate it myself? The PowerPoint file obviously isn't an XML file.

MartinPacker commented 1 year ago

It's possibly beyond your skill set but:

  1. Create a pptx with RTL text in it.
  2. Change its file extension to .zip.
  3. Unzip it.
  4. The appropriate part is probably a slideN.xml file (slide1.xml, slide2.xml, ...) under ppt/slides. So open these files in a text editor.
  5. Examine them for hints of how the RTL was done.
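
The steps above can also be done without renaming or unzipping anything, since a .pptx file is an ordinary zip archive. The following is only a minimal sketch using the standard library; the function name read_slide_xml is made up for illustration, and it assumes the slide parts live under ppt/slides/ and are UTF-8 encoded, which is the usual layout.

```python
import zipfile


def read_slide_xml(pptx_file):
    """Return {part name: XML text} for every slide part in a .pptx file.

    pptx_file may be a path or a binary file-like object.
    Assumes slide parts are stored as ppt/slides/slideN.xml in UTF-8.
    """
    slides = {}
    with zipfile.ZipFile(pptx_file) as archive:
        for name in sorted(archive.namelist()):
            if name.startswith("ppt/slides/slide") and name.endswith(".xml"):
                slides[name] = archive.read(name).decode("utf-8")
    return slides
```

Calling read_slide_xml("rtl-sample.pptx") (a hypothetical file name) and printing the values gives the same text you would see after unzipping by hand.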

eladlavi commented 1 year ago

Here I created a one-slide presentation with a title and a subtitle; the title is right-to-left and the subtitle is not. I can see rtl="1" in slide1.xml right next to the title text, so this is probably it. I am attaching the whole XML content. But now that we see how RTL is represented, what change is required in the code to apply this to, say, a text box, a paragraph, or a title?

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

כותרת ראשית
Some subtitle

eladlavi commented 1 year ago

I examined the difference in more detail, comparing against the non-RTL content. The <a:pPr> tag carrying rtl="1" comes right after the <a:p> tag and before the <a:r> tag. Is it possible to conclude from this what needs to be done in code to set it?
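
To check which paragraphs in a slide carry the RTL flag, something like the following could help. It is a sketch using only the standard library; paragraph_directions and SAMPLE are made-up names, and SAMPLE is a hand-written fragment for illustration, not the XML that was attached in this thread.

```python
import xml.etree.ElementTree as ET

# DrawingML namespace bound to the "a:" prefix in slide XML.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
NS = {"a": A_NS}


def paragraph_directions(slide_xml):
    """Return a list of (text, is_rtl) pairs, one per a:p element."""
    root = ET.fromstring(slide_xml)
    result = []
    for p in root.iter(f"{{{A_NS}}}p"):
        # Join the text of all runs directly inside this paragraph.
        text = "".join(t.text or "" for t in p.findall("a:r/a:t", NS))
        # a:pPr, when present, is the first child of a:p.
        pPr = p.find("a:pPr", NS)
        is_rtl = pPr is not None and pPr.get("rtl") in ("1", "true")
        result.append((text, is_rtl))
    return result


# Hand-written fragment for illustration only.
SAMPLE = (
    f'<p:txBody xmlns:a="{A_NS}" '
    'xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main">'
    '<a:p><a:pPr rtl="1"/><a:r><a:t>Title</a:t></a:r></a:p>'
    '<a:p><a:r><a:t>Some subtitle</a:t></a:r></a:p>'
    '</p:txBody>'
)

print(paragraph_directions(SAMPLE))
```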

eladlavi commented 1 year ago

@MartinPacker is there a chance you can guide me through this? I think it is important feature that is missing.

bowespublishing commented 1 year ago

@eladlavi try this

import pptx
from pptx.oxml.xmlchemy import OxmlElement

def RtoL(shape):
    textb = shape.text_frame._txBody
    # Only the first paragraph of the text frame is targeted.
    rId = textb.xpath('./a:p')[0]

    # Remove every existing paragraph-properties element.
    for bad in textb.xpath('./a:p/a:pPr'):
        bad.getparent().remove(bad)

    # Insert a fresh a:pPr that is right-aligned and right-to-left.
    jj = OxmlElement("a:pPr")
    jj.set("algn", "r")
    jj.set("rtl", "1")
    rId.insert(0, jj)

prs = pptx.Presentation("test.pptx")
for slide in prs.slides:
    for shape in slide.shapes:
        if shape.has_text_frame:
            RtoL(shape)

prs.save('text2.pptx')

eladlavi commented 1 year ago

OK, thanks. I checked it, and it is almost good. It does change the text to RTL, so it works, but two things are missing. First, if I have more than one paragraph in a text frame, it only affects the first one; i.e. if I call shapes.title.text_frame.add_paragraph(), your code affects only the first paragraph of the title. Second, the text frame loses its design, so if I set a font size, for example, it changes back to some default.

Any idea how to improve your code to handle that?

eladlavi commented 1 year ago

I was able to fix the first issue in your code, where not all paragraphs in a text frame were handled:

def RtoL(shape):
    textb = shape.text_frame._txBody
    for bad in textb.xpath('./a:p/a:pPr'):
        bad.getparent().remove(bad)
    for rId in textb.xpath('./a:p'):
        jj = OxmlElement("a:pPr")
        jj.set("algn", "r")
        jj.set("rtl", "1")
        rId.insert(0, jj)

I don't fully understand how it works with the XML, but I kind of figured it out. Now I just need to handle the loss of design. My guess is that it is caused by the code you wrote to "remove bad": you remove the "bad" element and insert a "good" one instead, but the removed XML element also stored the design. By "design" I mean font size etc.

eladlavi commented 1 year ago

OK, I ended up with this simple code:

def RtoL(shape):
    textb = shape.text_frame._txBody
    for bad in textb.xpath('./a:p/a:pPr'):
        bad.set("rtl", "1")

Compared to your code, instead of removing the element and then inserting a new one, I just set rtl on the existing element, and it works! First, it handles multiple paragraphs; second, it keeps the existing design. I removed the alignment line because alignment can already be handled nicely. I would just ask the developers to add built-in support for this feature to the library, since it was solved that easily.

What is unfortunate is that this method forces you to save the pptx file, then reopen it and re-save it, and it doesn't allow some paragraphs to be RTL and others LTR. Thanks.

MartinPacker commented 1 year ago

So what you're saying is there is some rendering action PowerPoint itself needs to take? I somehow doubt that is absolutely necessary. Perhaps PowerPoint is undertaking a repair action and (for once) not telling you.

Can you save the two versions of the .pptx and compare them?
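
One way to do that comparison is to diff the XML parts of the two files directly. This is a rough standard-library sketch, not a polished tool: diff_pptx_parts is a made-up name, and it assumes the parts are UTF-8 text.

```python
import difflib
import zipfile


def diff_pptx_parts(file_a, file_b):
    """Return unified-diff lines for XML parts that differ between two .pptx files."""
    lines = []
    with zipfile.ZipFile(file_a) as a, zipfile.ZipFile(file_b) as b:
        names_a, names_b = set(a.namelist()), set(b.namelist())
        for name in sorted(names_a | names_b):
            if not name.endswith((".xml", ".rels")):
                continue  # skip images and other binary parts
            text_a = a.read(name).decode("utf-8") if name in names_a else ""
            text_b = b.read(name).decode("utf-8") if name in names_b else ""
            if text_a == text_b:
                continue
            # Slide XML is usually one long line, so break it at tag
            # boundaries to get a readable diff.
            split_a = text_a.replace("><", ">\n<").splitlines()
            split_b = text_b.replace("><", ">\n<").splitlines()
            lines.extend(
                difflib.unified_diff(
                    split_a, split_b, f"a/{name}", f"b/{name}", lineterm=""
                )
            )
    return lines
```

Printing "\n".join(diff_pptx_parts("before.pptx", "after.pptx")), with hypothetical file names, shows whatever PowerPoint changed when it re-saved the file.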

bowespublishing commented 1 year ago

Use something like - https://github.com/davecra/OpenXmlFileViewer

Compare the XML and edit as appropriate to get the result you desire