Open wzp123123 opened 7 months ago
As a workaround, I can extract the images from "ppt/media" with zipfile. Ref: https://github.com/madyel/extract_media_ppt
That, in fact, might be a faster way of doing it. But you lose the context of which slide they came from - which you might not care about.
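For anyone who wants to try the zipfile route, here is a minimal sketch of it. It assumes the file is a normal .pptx (a zip archive with its images under "ppt/media/"); the function name `extract_media` and the output directory are just illustrative, not something taken from the linked repo. As noted above, this gives you the raw files but not the slide each one belongs to.

```python
import zipfile
from pathlib import Path


def extract_media(pptx_path, out_dir):
    """Copy every file under ppt/media/ in the .pptx archive into out_dir.

    Hypothetical helper for illustration; returns the list of written paths.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(pptx_path) as zf:
        for name in zf.namelist():
            # Skip directory entries and anything outside the media folder.
            if name.startswith("ppt/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(zf.read(name))
                written.append(target)
    return written
```

Usage would be something like `extract_media("deck.pptx", "extracted_images")`, where both paths are placeholders.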
I hit the above error when I extract a 'jpeg' image directly 😂.
I use the code below to extract images from a .pptx file:

    if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
        image_bytes = shape.image.blob
For some images, this raises:

    File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/pptx/shapes/picture.py:195, in Picture.image(self)
        193 if rId is None:
        194     raise ValueError("no embedded image")
    --> 195 return slide_part.get_image(rId)

    File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/pptx/parts/slide.py:30, in BaseSlidePart.get_image(self, rId)
         24 def get_image(self, rId):
         25     """
         26     Return an |Image| object containing the image related to this slide
         27     by rId. Raises |KeyError| if no image is related by that id, which
         28     would generally indicate a corrupted .pptx file.
         29     """
    ---> 30     return self.related_part(rId).image

    AttributeError: 'Part' object has no attribute 'image'
I find that all 'png' images are extracted successfully, while all 'jpeg' images fail.
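If you need to keep the per-slide context, one possible stopgap is to catch the AttributeError and read the raw bytes from the related part instead. This is only a sketch: the helper names `picture_bytes` and `iter_picture_bytes` are made up for illustration, and it relies on the private `shape._element.blip_rId` attribute, so it may break between python-pptx versions. It also does not explain why the JPEG parts are loaded as a plain `Part` in the first place.

```python
def picture_bytes(shape):
    """Return the raw image bytes of a picture shape.

    Hypothetical helper: tries the public shape.image.blob first and falls
    back to the related part's blob when .image raises AttributeError.
    """
    try:
        return shape.image.blob
    except AttributeError:
        # Assumption: private attribute holding the relationship id of the
        # embedded image; not part of the documented API.
        rId = shape._element.blip_rId
        if rId is None:
            raise ValueError("no embedded image")
        return shape.part.related_part(rId).blob


def iter_picture_bytes(pptx_path):
    """Yield (slide_number, image_bytes) for each picture shape in the file."""
    # Imported here so picture_bytes can be used without python-pptx loaded.
    from pptx import Presentation
    from pptx.enum.shapes import MSO_SHAPE_TYPE

    prs = Presentation(pptx_path)
    for number, slide in enumerate(prs.slides, start=1):
        for shape in slide.shapes:
            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                yield number, picture_bytes(shape)
```

Note that the loop only looks at top-level shapes, like the original snippet; pictures nested inside group shapes would need a recursive walk.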