Open WolfgangFahl opened 3 years ago
What displayed string in the PowerPoint UI are you figuring for the slide-name?
Guessing, I would imagine the name that appears when viewing slides in the slide sorter or outline mode. I suspect part of the problem is that PowerPoint does its best to display a meaningful name (in particular the slide title) without calling on users to actually name the slides, unless they want something different from the title.
The thing to do is compare the XML before and after renaming and see where PowerPoint lodges (and later retrieves for display) whatever "custom" name you give to a slide. I'm betting it's not in <p:cSld name="xyz"/>, which is the XML node one might reasonably expect to hold the "official" slide name.
@scanny thanks for looking into this. I come from the Apache POI library, where the same discussion took place a few years ago, and the functionality can be implemented as outlined in https://stackoverflow.com/questions/44174371/how-to-retrieve-pptx-slide-name-with-apache-poi.
// set slide name via POI and validate it
sl.getXmlObject().getCSld().setName("new name");
This made it into the Java library via https://svn.apache.org/viewvc?view=revision&revision=1831745
TIL that slides can actually have unique identifiers, which would be useful to track a slide even if it moves in the deck! I'm curious, @WolfgangFahl are you describing this as a "bug" since slide.name did not persist, or is it more of a "feature request" b/c slide.name does not actually exist as a property (yet) in python-pptx?
Hey @timcolson, long time no see :) Slide.name is a read-write property, so that's not a problem as far as I know. The idea is that what appears in the PowerPoint UI as the "name" of the slide is not the value of that property. As I recall, by default it is the contents of the slide title, which makes sense because folks mostly define titles, and asking them to also define a name would likely come across as an unnecessary pain for users.
Btw, there is a Slide.slide_id property that allows discovery of the unique slide-id for each slide. That property is not writeable since its value is arbitrary and guaranteed unique. You could theoretically change it, but any supplied value would have to be verified for uniqueness, and it isn't something we've seen a use case for.
Thanks for the details, @scanny. I have a use case where tracking the location of particular slides, wherever they may be moved in a deck, will be necessary. My first idea was to programmatically add a GUID metadata fingerprint in the speaker notes, but the slide "name" sounded like a good option.
After hearing about the slide_id, that may work better!
No intent to mutate the ID, nor have users even see it. Just want to be able to scan a deck and tell the user, "The slide you're looking for is #N" - where N might be 37 yesterday, but after modifications and new slides, it's #42 now.
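A minimal sketch of that lookup, assuming only that each slide object exposes the read-only `slide_id` property mentioned above. The helper name `find_slide_number` is made up for illustration and is not part of python-pptx:

```python
from typing import Iterable, Optional


def find_slide_number(slides: Iterable, slide_id: int) -> Optional[int]:
    """Return the current 1-based position of the slide with `slide_id`, or None."""
    for number, slide in enumerate(slides, start=1):
        # slide_id stays with the slide when it is moved, so the position
        # found here reflects the deck as it is right now.
        if slide.slide_id == slide_id:
            return number
    return None
```

With python-pptx this would presumably be called as `find_slide_number(Presentation("deck.pptx").slides, some_id)`, where `deck.pptx` and `some_id` are placeholders.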
No "Develop" menu on MacOS PPT OOTB, so loading Ofc now on my Win machine. I'm still curious about the "name" field, so will give it a test to see what changes in the XML. :)
The slide.name actually exists in the XML spec for PowerPoint, even though Microsoft does not provide a UI for it. It is essential if you, e.g., try to keep multi-language versions of PowerPoint presentations around, as we do. The titles of the slides (and unfortunately even the pages) might differ, but the slides should have the same content, just in another language.
TL;DR: Looks like PPT slide name is indeed stored in
sudo apt install unzip
but also can just use 7zip w/ zip option
sudo apt install libxml2-utils
to pretty format XML, to make diff much less cluttered.
Steps to recreate and determine the field used for slide name.
xmllint --format stepN/ppt/slides/slide1.xml > sN.xml
// adds newlines
41c41
< <a:t>TIM TITLE1</a:t>
---
> <a:t>TIM TITLE2</a:t>
<p:cSld name="Slide1-TC">
> diff s1.xml s-named-manual.xml
3c3
< <p:cSld>
---
> <p:cSld name="S1-TC">
41c41
< <a:t>TIM TITLE1</a:t>
---
> <a:t>TIM TITLE Manual Named</a:t>
@timcolson you can accomplish changing Slide.name easily from python-pptx:
slide.name = "Slide1-TC"
I think the problem @WolfgangFahl was pointing at was that doing so doesn't make that new name appear in PowerPoint outline mode or the other UI places where slides appear "by name".
But if you just want a programmatic identifier for a slide, albeit not guaranteed unique, then Slide.name is a good option.
Thx, Mr. @scanny. Obv new to this, still getting my bearings.
If I'm now understanding correctly, will python-pptx setting slide.name="Slide1-TC" result in
Please implement this as the Apache POI library does. I think I am currently using a workaround, but I am not sure, since the discussion has been going on for so long already. I am definitely still working with slide names a lot.
I can confirm python-pptx definitely sets slide name, just as Steve said. (I never doubted! 👍 )
ppt.slides[1].name="Slide1-TC"
results in <p:cSld name="Slide1-TC"> in the saved PPTX.
After reading the thread again, I realized I misunderstood the phrase, "I'm betting it's not in
I wrongly thought that meant this particular name data was not expected for the name attribute. I now believe this was intended as a troubleshooting suggestion for @WolfgangFahl to verify name data was actually written to the file. (Unzipping and viewing the attribute in the XML would confirm.)
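That unzip-and-inspect check can be sketched with only the Python standard library. This assumes the usual package layout seen in the steps above (slides stored as `ppt/slides/slideN.xml`, with `<p:cSld>` as a direct child of the slide root); `csld_names` is a hypothetical helper name, not something python-pptx provides:

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# PresentationML main namespace, i.e. the "p:" prefix in <p:cSld>.
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def csld_names(pptx_file):
    """Map each slide part to the value of its <p:cSld name="..."> attribute.

    `pptx_file` is a path or a binary file-like object. Slides that have no
    name attribute map to None.
    """
    names = {}
    with zipfile.ZipFile(pptx_file) as package:
        for part in package.namelist():
            if re.fullmatch(r"ppt/slides/slide\d+\.xml", part):
                root = ET.fromstring(package.read(part))
                csld = root.find(f"{{{P_NS}}}cSld")
                names[part] = None if csld is None else csld.get("name")
    return names
```

Running it on a file before and after setting the name should show the attribute go from None to the new value, without needing `unzip` or `xmllint`.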
Great learning experience for me. Also learned VS Code has an interactive Python Jupyter notebook capability. Cool!
Two more observations: 1) Retrieving slide_id as int was easy with python-pptx, just as Steve said, but I was unsuccessful in retrieving the same with Apache POI. It exists, but is private to the class. Perhaps the ID is considered an internal implementation detail by POI, so it is not exposed.
2) Differences in treatment of slides where name is not set:
@Override
public String getSlideName() {
final CTCommonSlideData cSld = getXmlObject().getCSld();
return cSld.isSetName() ? cSld.getName() : "Slide"+getSlideNumber();
}
@Override
public int getSlideNumber() {
int idx = getSlideShow().getSlides().indexOf(this);
return (idx == -1) ? idx : idx+1;
}
// Tim: no setSlideName() -- instead, must directly update XML object, like this:
// slide.getXmlObject().getCSld().setName("TC-JavaName");
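For comparison, a hedged sketch of the same fallback on the python-pptx side. It assumes, as far as I can tell from the python-pptx documentation, that `slide.name` reads back as an empty string when no name is set; `display_name` and `display_names` are hypothetical helpers, not library functions:

```python
def display_name(slide, number):
    """Mimic POI's getSlideName(): fall back to "Slide<N>" when no name is set."""
    # An unset name is assumed to read back as "" (or None); both are falsy.
    return slide.name or f"Slide{number}"


def display_names(slides):
    """Return a display name for every slide, numbering from 1 as POI does."""
    return [display_name(slide, n) for n, slide in enumerate(slides, start=1)]
```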
Setting slide name does work in python-pptx, and is even nicer than the POI code.
I'm curious, is there still a need to make changes, Herr @WolfgangFahl?
After reporting the issue here I worked around the problem by using my old PowerPoint VBA macros for changing the slide names. I never retried using the library in write mode. Reading the slide.name is and was no problem.
pip show python-pptx
Name: python-pptx
Version: 0.6.18
Summary: Generate and manipulate Open XML PowerPoint (.pptx) files
Home-page: http://github.com/scanny/python-pptx
Author: Steve Canny
Author-email: python-pptx@googlegroups.com
License: The MIT License (MIT)
Location: /Users/wf/Library/Python/3.8/lib/python/site-packages
Requires: lxml, Pillow, XlsxWriter
Required-by:
This seems to be in sync with https://pypi.org/project/python-pptx/ as of 2021-02-13.
Next time I am working on my code I might look into the issue again. If I remember right, it was not the only problem with the library: I believe some of my PowerPoint files were not re-saved correctly in other ways, so I didn't dare to manipulate the files with the library and used PowerPoint's VBA for this instead.
I verified that slide.name read/write works as expected; I suggest @scanny close this issue.
Now I am curious. I've been aware that cSld(name=) attributes are not guaranteed to be unique, but then what does the winCOM API do when you index by name, which is completely allowed?
https://docs.microsoft.com/en-us/office/vba/api/powerpoint.slides.item
Interestingly, winCOM will throw an error if you do not keep the slide names unique when setting them.
In [1]: import win32com.client as win32
In [2]: pptx = win32.gencache.EnsureDispatch("PowerPoint.Application")
In [3]: pres = pptx.Presentations.Add(False)
In [4]: layout = pres.Designs(1).SlideMaster.CustomLayouts(1)
In [5]: pres.Slides.AddSlide(1, layout)
Out[5]: <win32com.gen_py.None.Slide>
In [6]: pres.Slides.AddSlide(2, layout)
Out[6]: <win32com.gen_py.None.Slide>
In [7]: pres.Slides(1).Name = "Test"
In [8]: pres.Slides(2).Name = "Test"
---------------------------------------------------------------------------
com_error Traceback (most recent call last)
<ipython-input-8-410b453c78da> in <module>
----> 1 pres.Slides(2).Name = "Test"
C:\ProgramData\Anaconda3\lib\site-packages\win32com\client\__init__.py in __setattr__(self, attr, value)
518 d=self.__dict__["_dispobj_"]
519 if d is not None:
--> 520 d.__setattr__(attr, value)
521 return
522 except AttributeError:
C:\ProgramData\Anaconda3\lib\site-packages\win32com\client\__init__.py in __setattr__(self, attr, value)
480 except KeyError:
481 raise AttributeError("'%s' object has no attribute '%s'" % (repr(self), attr))
--> 482 self._oleobj_.Invoke(*(args + (value,) + defArgs))
483 def _get_good_single_object_(self, obj, obUserName=None, resultCLSID=None):
484 return _get_good_single_object_(obj, obUserName, resultCLSID)
com_error: (-2147352567, 'Exception occurred.', (0, 'Microsoft PowerPoint', 'Slide.Name : Invalid request. Another slide already has this name.', '', 0, -2147188160), None)
Now I wonder: if I follow the directions in the OP to set the slide names manually in PowerPoint, what will happen?
Thinking about it now, the answer makes sense, since when you interact with the PowerPoint GUI you are making winCOM calls.
So, given the way the winCOM bindings behave, maybe a PR could be made to allow direct indexing by name, and throw errors when trying to set two slides to the same name?
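A rough sketch of what such behaviour could look like, written against any sequence of objects with a read-write `name` attribute. Both function names are hypothetical and nothing here is existing python-pptx API; the error message simply echoes the one winCOM raised above:

```python
def slide_by_name(slides, name):
    """Return the first slide whose name equals `name`; raise KeyError if none does."""
    for slide in slides:
        if slide.name == name:
            return slide
    raise KeyError(name)


def set_unique_name(slides, slide, name):
    """Set slide.name, refusing a name already used by another slide in `slides`."""
    for other in slides:
        if other is not slide and other.name == name:
            raise ValueError(f"Another slide already has the name {name!r}")
    slide.name = name
```

Raising on duplicates is only one possible design. Since the file format itself does not require unique names, decks written by other tools could still contain duplicates, in which case `slide_by_name` returns the first match.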
https://stackoverflow.com/questions/16855306/powerpoint-manually-set-slide-name shows how awkward it is to try to modify a slide's name in PowerPoint or VBA. It would be great if it were possible to do it with the python-pptx library.
I tried modifying slide.name and saving the presentation, but that didn't seem to change the slide name. When reading in the presentation again, the slide name was still the old one. Am I missing something?