CenterForOpenScience / pydocx

An extendable docx file format parser and converter
Other
183 stars 55 forks source link

Problem converting .docx files with images to HTML #237

Open wyler0 opened 7 years ago

wyler0 commented 7 years ago

Hi pydocx team, Recently I was attempting to add images to a docx file via python-docx then convert the document to html on Python 3.6.0 with PyDocX version 0.9.10. I am able to convert documents without images to HTML without any issue.

The problem arises if I add an image to the document using python-docx, save the document, then use print(PyDocX.to_html(directory+'hey.docx') to print and convert the file to HTML. After executing the print command the program freezes and I must use control+c to escape.

I then tried to run the .to_html on a .docx file with just a single image and no other elements and the same behavior occurred.

From my testing it seems your function has some sort of error in it, however I may be wrong. If you need any more information please let me know.

botzill commented 7 years ago

Hi @wyler0, can you upload a version of the .docx file you are testing with ?

wyler0 commented 7 years ago

SPORTS-Rye High Boys' Tennis.docx