scanny / python-pptx

Create Open XML PowerPoint documents in Python
MIT License

PPTX that used add_movie to add a WAV file (mime_type="audio/x-wav") can't add again? #926

Open luvwinnie opened 7 months ago

luvwinnie commented 7 months ago

I have a .pptx file in which I used the add_movie function to add an audio file.

I deleted the audio file using the regular PowerPoint app. Now, when I try to add a video to the same slide again, I get the following error.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[156], line 21
     17 output_path = f"pdf2pptx_1699845973_30650/{index}.wav"
     20 # Add the audio file as a movie
---> 21 audio = slide.shapes.add_movie(
     22     output_path,
     23     left=Inches(0.5),  # 0.5 inches from the left
     24     top=Inches(0.5),   # 0.5 inches from the top
     25     width=Inches(1),   # 1 inch wide
     26     height=Inches(1),  # 1 inch tall
     27     poster_frame_image=None,
     28     mime_type='audio/x-wav'
     29 )
     30 # break
     31 
     32 # Get the shape id of the audio shape
     33 r_id = audio._element.xpath(f'//p:nvPicPr/p:cNvPr[@name="{index}.wav"]')[0].get('id')

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/shapes/shapetree.py:543, in SlideShapes.add_movie(self, movie_file, left, top, width, height, poster_frame_image, mime_type)
    514 def add_movie(
    515     self,
    516     movie_file,
   (...)
    522     mime_type=CT.VIDEO,
    523 ):
    524     """Return newly added movie shape displaying video in *movie_file*.
    525 
    526     **EXPERIMENTAL.** This method has important limitations:
   (...)
    541     a placeholder for the video.
    542     """
--> 543     movie_pic = _MoviePicElementCreator.new_movie_pic(
    544         self,
    545         self._next_shape_id,
    546         movie_file,
    547         left,
    548         top,
    549         width,
    550         height,
    551         poster_frame_image,
    552         mime_type,
    553     )
    554     self._spTree.append(movie_pic)
    555     self._add_video_timing(movie_pic)

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/shapes/shapetree.py:920, in _MoviePicElementCreator.new_movie_pic(cls, shapes, shape_id, movie_file, x, y, cx, cy, poster_frame_image, mime_type)
    910 @classmethod
    911 def new_movie_pic(
    912     cls, shapes, shape_id, movie_file, x, y, cx, cy, poster_frame_image, mime_type
    913 ):
    914     """Return a new `p:pic` element containing video in *movie_file*.
    915 
    916     If *mime_type* is None, 'video/unknown' is used. If
    917     *poster_frame_file* is None, the default "media loudspeaker" image is
    918     used.
    919     """
--> 920     return cls(
    921         shapes, shape_id, movie_file, x, y, cx, cy, poster_frame_image, mime_type
    922     )._pic
    923     return

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/util.py:215, in lazyproperty.__get__(self, obj, type)
    210 value = obj.__dict__.get(self.__name__)
    211 if value is None:
    212     # ---on first access, __dict__ item will absent. Evaluate fget()
    213     # ---and store that value in the (otherwise unused) host-object
    214     # ---__dict__ value of same name ('fget' nominally)
--> 215     value = self._fget(obj)
    216     obj.__dict__[self.__name__] = value
    217 return value

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/shapes/shapetree.py:940, in _MoviePicElementCreator._pic(self)
    934 @lazyproperty
    935 def _pic(self):
    936     """Return the new `p:pic` element referencing the video."""
    937     return CT_Picture.new_video_pic(
    938         self._shape_id,
    939         self._shape_name,
--> 940         self._video_rId,
    941         self._media_rId,
    942         self._poster_frame_rId,
    943         self._x,
    944         self._y,
    945         self._cx,
    946         self._cy,
    947     )

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/shapes/shapetree.py:1008, in _MoviePicElementCreator._video_rId(self)
   1001 @property
   1002 def _video_rId(self):
   1003     """Return the rId of RT.VIDEO relationship to video part.
   1004 
   1005     For historical reasons, there are two relationships to the same part;
   1006     one is the video rId and the other is the media rId.
   1007     """
-> 1008     return self._video_part_rIds[1]

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/util.py:215, in lazyproperty.__get__(self, obj, type)
    210 value = obj.__dict__.get(self.__name__)
    211 if value is None:
    212     # ---on first access, __dict__ item will absent. Evaluate fget()
    213     # ---and store that value in the (otherwise unused) host-object
    214     # ---__dict__ value of same name ('fget' nominally)
--> 215     value = self._fget(obj)
    216     obj.__dict__[self.__name__] = value
    217 return value

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/shapes/shapetree.py:998, in _MoviePicElementCreator._video_part_rIds(self)
    991 @lazyproperty
    992 def _video_part_rIds(self):
    993     """Return the rIds for relationships to media part for video.
    994 
    995     This is where the media part and its relationships to the slide are
    996     actually created.
    997     """
--> 998     media_rId, video_rId = self._slide_part.get_or_add_video_media_part(self._video)
    999     return media_rId, video_rId

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/parts/slide.py:194, in SlidePart.get_or_add_video_media_part(self, video)
    184 def get_or_add_video_media_part(self, video):
    185     """Return rIds for media and video relationships to media part.
    186 
    187     A new |MediaPart| object is created if it does not already exist
   (...)
    192     PowerPoint media embedding strategy.
    193     """
--> 194     media_part = self._package.get_or_add_media_part(video)
    195     media_rId = self.relate_to(media_part, RT.MEDIA)
    196     video_rId = self.relate_to(media_part, RT.VIDEO)

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/package.py:44, in Package.get_or_add_media_part(self, media)
     38 def get_or_add_media_part(self, media):
     39     """Return a |MediaPart| object containing the media in *media*.
     40 
     41     If a media part for this media bytestream ("file") is already present
     42     in this package, it is reused, otherwise a new one is created.
     43     """
---> 44     return self._media_parts.get_or_add_media_part(media)

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/package.py:207, in _MediaParts.get_or_add_media_part(self, media)
    200 def get_or_add_media_part(self, media):
    201     """Return a |MediaPart| object containing the media in *media*.
    202 
    203     If this package already contains a media part for the same
    204     bytestream, that instance is returned, otherwise a new media part is
    205     created.
    206     """
--> 207     media_part = self._find_by_sha1(media.sha1)
    208     if media_part is None:
    209         media_part = MediaPart.new(self._package, media)

File ~/personal_assistant/venv/lib/python3.8/site-packages/pptx/package.py:220, in _MediaParts._find_by_sha1(self, sha1)
    213 """Return |MediaPart| object having *sha1* hash or None if not found.
    214 
    215 All media parts belonging to this package are considered. A media
    216 part is identified by the SHA1 hash digest of its bytestream
    217 ("file").
    218 """
    219 for media_part in self:
--> 220     if media_part.sha1 == sha1:
    221         return media_part
    222 return None

AttributeError: 'Part' object has no attribute 'sha1'

Which part should I check for it?

scanny commented 7 months ago

Hmm, looks like a part (file in the .pptx zip archive) is not getting instantiated as a MediaPart (which is what has the .sha1 property) and is defaulting back to the generic Part class.
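One way to see which part might be involved is to look at the content types recorded in the .pptx archive itself. The following is a minimal sketch using only the standard library; it assumes the media files live under `ppt/media/` and that their content types are declared in `[Content_Types].xml`, as is usual for Open XML packages. The function name `media_content_types` is made up for this example.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by the [Content_Types].xml part of an OPC package.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def media_content_types(pptx_file):
    """Return {partname: content_type} for each file under ppt/media/.

    `pptx_file` may be a path or a binary file-like object.
    The content type is None when the package declares none for that part.
    """
    with zipfile.ZipFile(pptx_file) as zf:
        types = ET.fromstring(zf.read("[Content_Types].xml"))
        # <Default> entries apply by file extension ...
        defaults = {
            d.get("Extension").lower(): d.get("ContentType")
            for d in types.iter(CT_NS + "Default")
        }
        # ... and <Override> entries apply to one specific part name.
        overrides = {
            o.get("PartName"): o.get("ContentType")
            for o in types.iter(CT_NS + "Override")
        }
        result = {}
        for name in zf.namelist():
            if not name.startswith("ppt/media/") or name.endswith("/"):
                continue
            partname = "/" + name
            ext = posixpath.splitext(name)[1].lstrip(".").lower()
            result[partname] = overrides.get(partname, defaults.get(ext))
        return result
```

Calling `media_content_types("deck.pptx")` (the file name is a placeholder) on the file that PowerPoint saved should show which content type each remaining media file carries; any type that python-pptx does not map to MediaPart would be a candidate for the part that ends up as a generic Part.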

This is where the Part subclass is determined: https://github.com/scanny/python-pptx/blob/master/pptx/__init__.py#L51-L60

This is where membership in _MediaParts is determined: https://github.com/scanny/python-pptx/blob/master/pptx/__init__.py#L51-L60

You can see that those two places determine what counts as a media part differently.

So I think there's a type that needs to be added to the first list above. Also, in retrospect, it would probably be worth thinking through how to use the same single mechanism for determining what's a media part and what isn't.
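To illustrate the mismatch, here is a toy model, not the actual python-pptx code. All names in it (`PART_CLASS_FOR`, `MEDIA_RELTYPES`, `load_part`, `media_parts`) are hypothetical stand-ins, and it assumes one mechanism picks the class by content type while the other picks media parts by relationship type. Only the final lookup loop mirrors the `_find_by_sha1` frame in the traceback.

```python
import hashlib


class Part:
    """Generic part: has a name, a content type and bytes, but no sha1."""

    def __init__(self, partname, content_type, blob):
        self.partname = partname
        self.content_type = content_type
        self.blob = blob


class MediaPart(Part):
    """Media part: additionally exposes a SHA1 digest of its bytes."""

    @property
    def sha1(self):
        return hashlib.sha1(self.blob).hexdigest()


# Mechanism 1 (assumed): the content type selects the Part subclass,
# falling back to the generic Part for unknown types.
PART_CLASS_FOR = {"video/mp4": MediaPart, "video/quicktime": MediaPart}


def load_part(partname, content_type, blob):
    cls = PART_CLASS_FOR.get(content_type, Part)
    return cls(partname, content_type, blob)


# Mechanism 2 (assumed): the relationship type selects the media parts.
MEDIA_RELTYPES = {"media", "video"}


def media_parts(relationships):
    return [part for reltype, part in relationships if reltype in MEDIA_RELTYPES]


def find_by_sha1(parts, sha1):
    # Same shape as the loop in the traceback: every part is asked for .sha1.
    for part in parts:
        if part.sha1 == sha1:
            return part
    return None


if __name__ == "__main__":
    # A WAV part reached through a media relationship, but whose content
    # type is unknown to mechanism 1, so it is loaded as a generic Part.
    wav = load_part("/ppt/media/media1.wav", "audio/x-wav", b"RIFF")
    try:
        find_by_sha1(media_parts([("media", wav)]), "0" * 40)
    except AttributeError as exc:
        print("AttributeError:", exc)
```

In this model the two mechanisms agree for `video/mp4` but disagree for `audio/x-wav`, which is the kind of disagreement that would produce the error above. Deriving both decisions from one shared table would remove that possibility.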

luvwinnie commented 7 months ago

@scanny Thanks for replying. I'm adding the WAV audio file using the add_movie method. From the code you referenced, it seems to support only movie file types such as mp4 and mov.

Could adding the audio with mime_type="audio/x-wav" be what causes this error?

scanny commented 7 months ago

Yep, could be. Try adding that type to the first file and see if that fixes it. Interestingly, there is no audio/x-wav entry in pptx.opc.constants.CONTENT_TYPE, so you might want to add it there.
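As a rough sketch of what "adding that type" could look like without editing the installed library, the helper below registers extra content types in a content-type-to-class mapping. The helper name `register_media_content_types` is made up for this example. The commented usage assumes that python-pptx exposes the mapping as `PartFactory.part_type_for` and the class as `pptx.parts.media.MediaPart`; check both against the installed version before relying on them. Whether `audio/x-wav` is the right type to register is also an assumption taken from the discussion above.

```python
def register_media_content_types(part_type_for, media_part_cls, content_types):
    """Map each content type to `media_part_cls` unless it is already mapped.

    Returns the list of content types that were newly registered, so the
    caller can see whether anything changed.
    """
    added = []
    for content_type in content_types:
        if content_type not in part_type_for:
            part_type_for[content_type] = media_part_cls
            added.append(content_type)
    return added


# Assumed usage with python-pptx (unverified; must run before the
# presentation is opened so that its parts are loaded with the new mapping):
#
#   from pptx.opc.package import PartFactory
#   from pptx.parts.media import MediaPart
#
#   register_media_content_types(
#       PartFactory.part_type_for, MediaPart, ["audio/x-wav"]
#   )
```

Existing entries are left untouched on purpose, so the helper cannot replace a mapping the library already defines.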

I think the root cause here is that audio was not considered when that feature was added, so we didn't add audio content types or test cases at the time. I expect much of the implementation is the same between audio and video, though I'm not sure about player behavior, etc.