mwilliamson / python-mammoth

Convert Word documents (.docx files) to HTML
BSD 2-Clause "Simplified" License
810 stars 121 forks

bug: program crashes if docx file has complicated content format #121

Closed alexmaehon closed 2 years ago

alexmaehon commented 2 years ago

When I run the code below, the program crashes without even printing the message in the except block. I only wish it would not crash: even if it cannot process the file, it should skip past this code and continue.

    try:
        result = mammoth.convert_to_html(file_path)
        html = result.value  # The generated HTML
    except:
        print('conversion failed')

Running on Python 3.7.12, Ubuntu 18.04.5 LTS.

test2.docx

VenkatsQuest commented 2 years ago

alexmaehon, for me it worked fine without any issues, though on a newer Python version and on Windows:

'3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]'

    import mammoth

    # Raw string, so the backslashes in the Windows path are not treated as escapes
    file_path = r"C:\Users\edaemnv\Downloads\test2.docx"
    result = mammoth.convert_to_html(file_path)
    print(result.value)

output.txt

mwilliamson commented 2 years ago

Could you post the full output when you try to run the conversion?
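One way to capture that full output is to print the traceback instead of a fixed message. The following is only a minimal sketch: `convert_or_report` and `docx_to_html` are hypothetical helper names, not part of mammoth, and the file name is taken from the attachment above. Note that `except Exception` cannot intercept a hard interpreter crash (for example a segfault) or a call to `sys.exit()`, so if nothing is printed at all, the problem is probably outside this block.

```python
import traceback


def convert_or_report(convert, file_path):
    """Call convert(file_path); on failure, print the full traceback and return None.

    Hypothetical helper for illustration, not part of mammoth.
    """
    try:
        return convert(file_path)
    except Exception:
        # Prints the exception type, message, and stack trace to stderr.
        traceback.print_exc()
        return None


def docx_to_html(file_path):
    """Convert a .docx file to HTML with mammoth (imported lazily here)."""
    import mammoth

    with open(file_path, "rb") as docx_file:
        result = mammoth.convert_to_html(docx_file)
    return result.value  # The generated HTML


if __name__ == "__main__":
    html = convert_or_report(docx_to_html, "test2.docx")
    if html is None:
        print("conversion failed")
```

Passing the converter in as an argument keeps the error handling separate from mammoth itself, so the same wrapper can be tried with any other function to check whether exceptions are being reported at all.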

alexmaehon commented 2 years ago

I checked my code thoroughly this time, and it appears this is not mammoth's problem. Sorry for the trouble, and thank you for this wonderful tool~