scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License

Hyperlinks to other parts of the powerpoint break #733

Closed. NathanTech7713 closed this issue 3 years ago

NathanTech7713 commented 3 years ago

Hi. When a shape contains a hyperlink to another part of the presentation, i.e. another slide, you can't access that link through the pptx package; it produces the following traceback:

v.address
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\python332\lib\site-packages\pptx\text\text.py", line 459, in address
    return self.part.target_ref(self._hlinkClick.rId)
  File "C:\python332\lib\site-packages\pptx\opc\package.py", line 323, in target_ref
    rel = self.rels[rId]
KeyError: ''

For reference, I have attached the PowerPoint file I am using. The first link on slide 2 works; the rest do not. Attachment: links powerpoint.pptx

scanny commented 3 years ago

Please show the code that produced this error, and also the code you used to create the shape, as a minimal reproducible example. It looks like it should fit into fewer than a dozen lines.

NathanTech7713 commented 3 years ago

As requested:

>>> from pptx import Presentation
>>> prs = Presentation("links-powerpoint.pptx")
>>> slide = prs.slides[3]
>>> shape = slide.shapes[1]
>>> paragraph = shape.text_frame.paragraphs[0]
>>> run = paragraph.runs[0]
>>> run.hyperlink.address
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\python332\lib\site-packages\pptx\text\text.py", line 459, in address
    return self.part.target_ref(self._hlinkClick.rId)
  File "C:\python332\lib\site-packages\pptx\opc\package.py", line 323, in target_ref
    rel = self.rels[rId]
KeyError: ''
>>>

scanny commented 3 years ago

Okay, so how did the offending shape get created? Did it just come in on a PowerPoint file that you expect was created by PowerPoint? Or did you create it using python-pptx in a prior step?

NathanTech7713 commented 3 years ago

Hi There,

I created this shape myself using PowerPoint. I added a new "Title and Content" slide, entered text into the title, then went into the content shape and created a hyperlink from there. Hope that helps.

scanny commented 3 years ago

Okay, so here's the problem and a possible solution or two.

The problem is that _Run.hyperlink returns a _Hyperlink object, and those only work for actual, well, hyperlinks. The "intra-presentation" jump links (first slide, slide 7, etc.) operate in a different way. An actual hyperlink stores its URL in the .rels XML for the part (a slide part in this case, probably slide4.xml.rels), and a "key" to that URL, like "rId7", is stored in the hyperlink element.

So when you want to access that URL, python-pptx gets the key ("rId7") and looks up that item in the .rels file. For the jump link in this case, the r:id attribute has the value "" (the empty string), hence the KeyError on the empty string.
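To make the lookup failure concrete, here is a minimal sketch that models the relationships as a plain dict. It is a simplified stand-in, not python-pptx's real classes; the names rels, target_ref and safe_target_ref and the example URL are hypothetical and chosen only for illustration.

```python
# Simplified model of a slide part's .rels entries: relationship id -> target URL.
# This is an assumption-laden stand-in, not the library's actual data structure.
rels = {"rId7": "https://example.com/"}


def target_ref(rels, r_id):
    """Look up the URL for a relationship id, raising KeyError if it is absent."""
    return rels[r_id]


def safe_target_ref(rels, r_id):
    """Like target_ref, but treat an empty r:id (a jump link) as 'no URL'."""
    if not r_id:
        return None
    return rels[r_id]


# An external hyperlink carries a real key, so the lookup succeeds.
external_url = target_ref(rels, "rId7")

# A jump link carries r:id="", which is not a key in the relationships.
try:
    target_ref(rels, "")
    jump_lookup_failed = False
except KeyError:
    jump_lookup_failed = True

jump_url = safe_target_ref(rels, "")
```

The guarded variant only shows one way a caller could avoid the exception; it says nothing about where a jump link actually points.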

Basically, when this functionality was added, the sponsor only needed "external" (URL) hyperlinks and so that's all that got implemented.

However, if you want the full "click-action" behaviors on a run, you can get them like this:

>>> from pptx.action import ActionSetting
>>> from pptx.enum.action import PP_ACTION

# --- run and shape come from context ---
>>> rPr = run._r.get_or_add_rPr()
>>> click_action = ActionSetting(rPr, shape)
>>> click_action.action
PP_ACTION.FIRST_SLIDE

More on how to use this is described in the documentation here:
https://python-pptx.readthedocs.io/en/latest/api/action.html

Basically, you check ActionSetting.action to see what it is (including PP_ACTION.NONE, which of course is usually the case for any given run). If and only if the reported action is PP_ACTION.HYPERLINK do you access the hyperlink to find out what URL it points to.

NathanTech7713 commented 3 years ago

Works like a charm. Thanks :)