scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License

Feat: Add Alt text properties and setters #911

Open pa-t opened 1 year ago

pa-t commented 1 year ago

Adds the ability to read and set alt text. I wanted to be able to use this library to add alt text for screen readers for the visually impaired.

Property methods taken from @cray2015's PR: https://github.com/scanny/python-pptx/pull/512

pa-t commented 1 year ago

@scanny I tried the snippet you added in the comments of the #512 PR:

alt_text = shape._element._nvXxPr.cNvPr.attrib.get("desc", "")

And it did not work for me, while this implementation did:

>>> from pptx import Presentation
>>> pr = Presentation("/Users/pat/Downloads/Test Slide Deck.pptx")
>>> pr.slides[2].shapes[1]._element._nvXxPr.cNvPr.attrib.get("desc", "")
''
>>> pr.slides[2].shapes[1].alt_text
'This is better alt text'

scanny commented 1 year ago

Yes, quite right, I left off the "r" on descr, I'll get that fixed up :)
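For anyone following along, here is a minimal sketch of why the misspelled key came back empty. It uses only the standard library rather than python-pptx, and the `p:cNvPr` element below is a hand-written stand-in (the `id` and `name` values are assumptions for illustration), not XML taken from the deck above:

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a shape's non-visual properties element.
# Alt text is stored in the `descr` attribute of `p:cNvPr`.
XML = (
    '<p:cNvPr xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" '
    'id="4" name="Picture 3" descr="This is better alt text"/>'
)
cNvPr = ET.fromstring(XML)

# "desc" is not an attribute on the element, so the default "" is returned.
missing = cNvPr.attrib.get("desc", "")

# "descr" is the attribute that actually carries the alt text.
alt_text = cNvPr.attrib.get("descr", "")

print(repr(missing), repr(alt_text))
```

The same lookup against a real shape would be `shape._element._nvXxPr.cNvPr.attrib.get("descr", "")`.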

pa-t commented 1 year ago

Ah ok, I see that working now. Is there any reason behind not wanting to have this be officially supported? I think it would be a nice addition. Are there some edge cases you are worried about?

scanny commented 1 year ago

It would be great for it to be added to the package, it's just not likely for that to happen this week and I'm not sure when we'll get to it. New features need documentation, tests, and a version deployment to ride along with. Just "working" is not even close to complete enough. So there's more effort here than may be immediately apparent. A local function works today and is vendored in your own code-base so you don't need to wait for someone else to add it or fix or otherwise adjust it later.
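A local function along these lines is roughly what that could look like. This is only a sketch: the names `get_alt_text` and `set_alt_text` are hypothetical and not part of python-pptx, and it relies on the private `_element._nvXxPr.cNvPr` access from the snippet above, which could change between releases:

```python
def get_alt_text(shape):
    """Return the alt text of `shape`, or "" when none has been set.

    Hypothetical local helper; relies on python-pptx private attributes.
    """
    cNvPr = shape._element._nvXxPr.cNvPr
    return cNvPr.attrib.get("descr", "")


def set_alt_text(shape, text):
    """Store `text` as the alt text of `shape` (the `descr` attribute).

    Hypothetical local helper; relies on python-pptx private attributes.
    """
    cNvPr = shape._element._nvXxPr.cNvPr
    cNvPr.attrib["descr"] = text
```

Usage would be something like `set_alt_text(pr.slides[2].shapes[1], "This is better alt text")` followed by `pr.save(...)`.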

pa-t commented 1 year ago

That's fine, I just wasn't sure what you needed for ACs. I just pushed a test for the getting and setting of the alt text field. Not sure where best to add documentation, but if you point I will write it. Thanks @scanny

scanny commented 1 year ago

@pa-t This looks pretty good and I think I'm going to be in here soon adding some other things so I can probably bundle it in; I've made it a shortlist item.

On the docs, the key thing is the analysis document, which you can think of as a PPEP (python-pptx enhancement proposal). Here is one of the ones we have so far: https://python-pptx.readthedocs.io/en/latest/dev/analysis/shp-shapes.html.

In this particular case, I think you would just be adding some things to this existing page since this is really a small extension to available shape properties. But the key things are:

All that gives us enough to determine whether we've got the design right. So generally this would be the first commit and we can discuss and refine it before investing in an implementation. This analysis is separately mergeable once resolved. You're welcome to make an initial implementation just for exploratory purposes; in this case that's a great idea. It's just that we might change our mind about aspects of the design or behavior, so you wouldn't want to put in more work than you'd be willing to do over if the approach changes, as it often does.

The user documentation is derived from the docstrings, so that comes later and is usually pretty automatic.

scanny commented 1 year ago

Say @pa-t, on the documentation bit, not sure we'd need a lot, but can you help me understand when alt-text is used and what it's for? A couple questions:

And anything else you can think of. I'm trying to place it in a user context to understand how the feature would be used, when it would be relevant, etc.

pa-t commented 1 year ago

hey @scanny

I think the best way to frame this feature is through the lens of enabling more accessibility support. Screen readers are the main use case; the only other use I can think of is when remote resources fail to load and the alt text is shown instead.

Do you want me to write the docs for this?

scanny commented 1 year ago

I'd be thinking we want to resolve to a short blurb that goes in the docstring and leave it at that. Probably no more than 50 words and maybe less, but as long as it needs to be to set folks' expectations. I think starting big and then whittling down to a concise summary (like we're doing) is the right approach.

The thing I don't see yet is the default behavior. Does this get anything by default? Otherwise I'm thinking it's going to be empty 99.99% of the time. If PowerPoint automatically provides default values when (at least certain) shapes are created, what are those? Like is it the filename of the image in the case of a Picture for example? Or would the user expect these to be blank unless they happened to know the author had explicitly set them?

scanny commented 1 year ago

Also, what is this attribute called in the MS API? A search on "powerpoint vba get alt text" is a good place to start.

pa-t commented 1 year ago

No, nothing is set by default; it is just left empty. The user has to set this value. Found this doc page where it is just called AlternativeText.

pa-t commented 1 year ago

@scanny how does something like this read:

The alt_text property allows for enhanced accessibility support in presentations by enabling the addition of alternative text to shapes. This feature primarily benefits screen readers, providing a description when the shape cannot be visually interpreted. By default, this property is empty and must be manually set by the user.
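To see how that blurb might sit in a docstring, here is a rough sketch of the property with a shortened version of the text. The class `AltTextSketch` is a hypothetical stand-in for a shape class, not the code in this PR and not python-pptx's actual implementation:

```python
class AltTextSketch:
    """Hypothetical stand-in for a shape class, for illustration only."""

    def __init__(self, element):
        # `element` is assumed to expose `_nvXxPr.cNvPr.attrib`, as shape
        # elements do in the snippets earlier in this thread.
        self._element = element

    @property
    def alt_text(self):
        """Alternative text for this shape, used mainly by screen readers.

        Describes the shape when it cannot be interpreted visually. Empty
        string unless the author has set it explicitly.
        """
        return self._element._nvXxPr.cNvPr.attrib.get("descr", "")

    @alt_text.setter
    def alt_text(self, value):
        self._element._nvXxPr.cNvPr.attrib["descr"] = value
```

That docstring comes in well under the 50-word target while still stating the empty default.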