scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License
2.28k stars, 502 forks

Failed to fetch Font name and size #840

Open vidyasagarreddyc opened 1 year ago

vidyasagarreddyc commented 1 year ago

We are extracting metadata from .pptx files. For some files the metadata is extracted correctly, but for others the font name and size come back as None. I have attached a sample file for which extracting the font name and size fails.

from pptx import Presentation

# Path to the .pptx file to read.
path = "/Users/vcimalap/Downloads/Sanitized slide deck.pptx"

presents = Presentation(path)

# Collects the text of every run found in the deck.
store_all_text = []

for slide in presents.slides:
    for shape in slide.shapes:
        # Skip shapes that have no text frame.
        if not shape.has_text_frame:
            continue
        for paragraph in shape.text_frame.paragraphs:
            for run in paragraph.runs:
                # For the affected files, these values come back as None.
                print(run.font.name, run.font.size)
                store_all_text.append(run.text)

print(store_all_text)

I tested this by extracting the .pptx to XML, and I can see the font names and sizes there. I used the commands below to extract the package:

 pip install opc-diag
 opc extract PPTXFILE DIRECTORY
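
The same check can be done without unpacking the package to disk. The following is a minimal sketch, not part of python-pptx or opc-diag: the helper name `explicit_run_fonts` is hypothetical, and it assumes the slide parts live under `ppt/slides/` as they do in a standard .pptx package. It lists the typeface and size written directly on each run's `a:rPr` element in the slide XML. Runs that have no `a:rPr` element are not reported, and a font seen in the extracted XML may instead be defined in a layout, master, or theme part rather than on the run itself.

```python
import zipfile
import xml.etree.ElementTree as ET

# DrawingML main namespace (the "a:" prefix in slide XML).
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def explicit_run_fonts(pptx_file):
    """Return (part_name, typeface, size_pt) for each a:rPr element in the slide parts.

    `pptx_file` is a path or a binary file-like object.
    `typeface` and `size_pt` are None when the run properties do not set them.
    """
    results = []
    with zipfile.ZipFile(pptx_file) as package:
        for name in sorted(package.namelist()):
            # Only slide parts; relationship parts live under ppt/slides/_rels/.
            if not (name.startswith("ppt/slides/slide") and name.endswith(".xml")):
                continue
            root = ET.fromstring(package.read(name))
            for rpr in root.iter(f"{{{A_NS}}}rPr"):
                latin = rpr.find(f"{{{A_NS}}}latin")
                typeface = latin.get("typeface") if latin is not None else None
                # The sz attribute is stored in hundredths of a point.
                sz = rpr.get("sz")
                size_pt = int(sz) / 100 if sz is not None else None
                results.append((name, typeface, size_pt))
    return results
```

Calling `explicit_run_fonts(path)` with the path from the script above and comparing the result with what `run.font.name` and `run.font.size` print would show whether the values seen in the XML are set on the runs themselves or come from elsewhere in the package.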

vidyasagarreddyc commented 1 year ago

Sanitized slide deck.pptx

vidyasagarreddyc commented 1 year ago

Can anyone help here?