We are extracting metadata from .pptx files with python-pptx. For some files the metadata is extracted correctly, but for others the font name and size come back as None. I have attached a sample file for which the font name and size fail to extract.
from pptx import Presentation

# Path to the .pptx file to read
path = "/Users/vcimalap/Downloads/Sanitized slide deck.pptx"
presents = Presentation(path)

# store_all_text collects the text of every run encountered
store_all_text = []
for slide in presents.slides:
    for shape in slide.shapes:
        if not shape.has_text_frame:
            continue
        for paragraph in shape.text_frame.paragraphs:
            for run in paragraph.runs:
                # Both values print as None for the affected files
                print(run.font.name, run.font.size)
                store_all_text.append(run.text)
print(store_all_text)
I also tested by converting the .pptx to XML, and the font names and sizes are visible there. I used the command below to convert the .pptx to XML.
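Separately from that conversion command, the following is a minimal sketch for checking where the font information sits in the XML, using only the Python standard library. It relies on two things that are not stated above. First, a .pptx file is a zip archive with slide parts under `ppt/slides/`. Second, python-pptx's `run.font.name` and `run.font.size` report only values set directly on the run and return None when the value is inherited from the paragraph, layout, master, or theme. The helper names `font_attrs_in_xml` and `font_attrs_in_pptx` are hypothetical and are not part of python-pptx.

```python
import xml.etree.ElementTree as ET
import zipfile

# DrawingML namespace used by the <a:...> elements in slide XML
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def font_attrs_in_xml(xml_bytes):
    """Return (tag, typeface, size_pt) for each run-property-like element.

    Hypothetical helper: looks at <a:rPr> (run level), <a:defRPr>
    (paragraph/list-style defaults) and <a:endParaRPr> (end of paragraph).
    """
    root = ET.fromstring(xml_bytes)
    found = []
    for tag in ("rPr", "defRPr", "endParaRPr"):
        for el in root.iter(f"{{{A_NS}}}{tag}"):
            latin = el.find(f"{{{A_NS}}}latin")
            typeface = latin.get("typeface") if latin is not None else None
            sz = el.get("sz")  # stored in hundredths of a point
            size_pt = int(sz) / 100 if sz is not None else None
            found.append((tag, typeface, size_pt))
    return found


def font_attrs_in_pptx(path):
    """Scan every slide part in the archive (assumes the ppt/slides/ layout)."""
    results = {}
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.startswith("ppt/slides/slide") and name.endswith(".xml"):
                results[name] = font_attrs_in_xml(archive.read(name))
    return results
```

Calling `font_attrs_in_pptx(path)` on the sample file would show whether the typeface and `sz` values sit on `rPr` (where `run.font` should see them) or only on `defRPr`/`endParaRPr`. If they appear only outside the run's own `rPr`, or only in the layout, master, or theme parts, that would be consistent with the None results.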