python-openxml / python-docx

Create and modify Word documents with Python
MIT License

How to correctly delete an image along with its relationships? #1425

Open Leonas2000 opened 3 weeks ago

Leonas2000 commented 3 weeks ago

As the title says, how can I correctly delete or replace an image (in my case it's in a table in the header)? I can find the inline tag inside the run. How do I identify its relationships and remove everything correctly? Thanks!

scanny commented 3 weeks ago

This is not a feature directly supported in the API, but there are a couple of approaches you could take:

  1. Leave the image in place and just change the bytes of the image. This SO answer describes that approach: https://stackoverflow.com/a/68248657/1902513
  2. Delete the element containing the image reference and add a new one.
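
To make the first approach concrete, here is a minimal sketch of the same idea done at the package level with only the standard library. It is not the code from the linked answer and it does not use the python-docx API. It relies on a .docx file being a zip archive whose images are stored as separate parts. The function name `replace_media_bytes` and the part name `word/media/image1.png` are hypothetical, so list the archive contents of your own file to find the real name.

```python
import io
import zipfile


def replace_media_bytes(docx_bytes: bytes, media_name: str, new_image: bytes) -> bytes:
    """Return a copy of a .docx package with one media part's bytes swapped.

    `media_name` is the part's path inside the zip, for example
    "word/media/image1.png" (hypothetical; check your own file).
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        if media_name not in src.namelist():
            raise KeyError(f"no part named {media_name!r} in package")
        for name in src.namelist():
            # Every other part (document XML, relationships, content types)
            # is copied unchanged, so existing references to the image
            # still point at the same part name.
            data = new_image if name == media_name else src.read(name)
            dst.writestr(name, data)
    return out.getvalue()
```

Because the part keeps its name and extension, this sketch assumes the new image has the same format as the old one. The displayed size is stored in the document XML rather than in the image, so it stays as it was.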

A couple of complications you might encounter:

  1. Replace image
    • I'm not sure everything will work properly if you replace, say, a PNG with a JPG, so you might want to experiment if you're changing the image format.
    • You might need to adjust the size or aspect ratio afterward, depending on the dimensions of the new image.
  2. Delete and replace
    • I think this should just work; you just need to make sure you're deleting the right "container" element. Deleting the whole paragraph is probably not a bad idea, or at least the run. Regular lxml methods should work for that.
    • The relationship and old image part may not go away even though they're orphaned, so if you care about that you may need to dig deeper to identify the relationship too and delete it.
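
As a rough illustration of the delete path and of how an image is tied to its relationship, the sketch below works on raw XML with the standard library. An image run contains an `<a:blip r:embed="rIdN">` element, and the `rIdN` value matches a `<Relationship Id="rIdN">` entry in the part's `.rels` file, whose `Target` names the image part. The helper names and the abbreviated sample XML are assumptions for illustration (a real drawing has more nesting, such as `wp:inline` and `pic:pic`); none of this is python-docx API.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
A = "http://schemas.openxmlformats.org/drawingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"


def image_rids(element):
    """Relationship ids used by any <a:blip r:embed="..."> under `element`."""
    rids = []
    for blip in element.iter(f"{{{A}}}blip"):
        rid = blip.get(f"{{{R}}}embed")
        if rid:
            rids.append(rid)
    return rids


def remove_image_runs(paragraph):
    """Remove each direct <w:r> child holding an image; return the rIds removed."""
    removed = []
    for run in list(paragraph.findall(f"{{{W}}}r")):
        rids = image_rids(run)
        if rids:
            paragraph.remove(run)
            removed.extend(rids)
    return removed


def drop_relationships(rels_root, rids):
    """Delete <Relationship> entries with the given ids; return their targets."""
    targets = []
    for rel in list(rels_root.findall(f"{{{PKG_REL}}}Relationship")):
        if rel.get("Id") in rids:
            targets.append(rel.get("Target"))
            rels_root.remove(rel)
    return targets


# Abbreviated stand-ins for a header paragraph and the header's .rels file.
paragraph = ET.fromstring(
    f'<w:p xmlns:w="{W}" xmlns:a="{A}" xmlns:r="{R}">'
    "<w:r><w:t>Logo:</w:t></w:r>"
    '<w:r><w:drawing><a:graphic><a:blip r:embed="rId1"/></a:graphic></w:drawing></w:r>'
    "</w:p>"
)
rels = ET.fromstring(
    f'<Relationships xmlns="{PKG_REL}">'
    f'<Relationship Id="rId1" Type="{R}/image" Target="media/image1.png"/>'
    f'<Relationship Id="rId2" Type="{R}/styles" Target="styles.xml"/>'
    "</Relationships>"
)

removed = remove_image_runs(paragraph)
# Only drop ids that nothing else still references (the paragraph stands in
# for the whole header part here).
still_used = set(image_rids(paragraph))
orphaned_targets = drop_relationships(rels, [r for r in removed if r not in still_used])
```

The lxml elements that python-docx exposes (for example `paragraph._p`) accept the same `iter`, `findall`, and `remove` calls, so the first two helpers should carry over, although that is an assumption to verify against your version. Runs nested inside other elements, such as hyperlinks, are not direct children of the paragraph and would need a deeper search. The same image can be referenced more than once through one relationship, which is why the sketch checks for remaining references before dropping it.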