Closed arjun251 closed 1 year ago
From your code, it looks as though you're trying to convert a .doc document rather than a .docx document. If that's the case, then I'm afraid Mammoth only supports reading .docx documents. If it is a .docx document, please provide a minimal example so that the issue can be reproduced.
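One way to check which of the two formats a file really is, regardless of its extension, is to look at its contents. The sketch below is a rough illustration rather than anything Mammoth itself provides: `sniff_word_format` is a hypothetical helper, and it assumes a .docx is a ZIP archive with its main part at the conventional `word/document.xml` path, while a legacy .doc starts with the OLE compound file signature.

```python
import zipfile

# Signature at the start of OLE compound files, the container used by legacy .doc
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_word_format(path):
    """Return 'docx', 'doc', or 'unknown' based on the file's contents.

    Hypothetical helper for illustration; not part of Mammoth.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Assumption: the main document part lives at the conventional path.
            if "word/document.xml" in archive.namelist():
                return "docx"
        return "unknown"
    with open(path, "rb") as fileobj:
        if fileobj.read(len(OLE_MAGIC)) == OLE_MAGIC:
            return "doc"
    return "unknown"
```

If this returns 'doc', the file would need to be saved as .docx (for example from Word or LibreOffice) before being passed to Mammoth.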
@mwilliamson Thanks for your response. I have another question. My .docx file contains a few tables and images, and when I convert it to .pdf, the PDF does not have the same structure as the .docx.
.docx -> html -> pdf
Could you please help me with a sample for working with tables and images?
style_map = """
p[style-name='Section Title'] => h1:fresh
p[style-name='Subsection Title'] => h2:fresh
"""
docx_file = ".../test.docx"
html = mammoth.convert_to_markdown(docx_file, style_map=style_map)
Thanks, Arjun S
Could you post a minimal example document, the HTML you're expecting, and the HTML you're currently getting?
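For the .docx -> html -> pdf route, one thing that may be worth trying in the meantime is adding explicit styling before the PDF step. The sketch below assumes the converter's HTML output is a bare fragment with no `<html>` or `<style>` elements and that table borders from the original document are not carried over. `wrap_fragment` and `PRINT_CSS` are hypothetical names, and the CSS values are only an example.

```python
import html

# Example print styles; adjust to match the look of the source document.
PRINT_CSS = """
table { border-collapse: collapse; width: 100%; }
th, td { border: 1px solid #444; padding: 4px 8px; vertical-align: top; }
img { max-width: 100%; height: auto; }
"""


def wrap_fragment(fragment, title="Converted document"):
    """Wrap an HTML fragment in a complete page with table and image styles.

    Hypothetical helper for illustration; the fragment is inserted unchanged.
    """
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        f"<style>{PRINT_CSS}</style>\n"
        "</head>\n"
        f"<body>\n{fragment}\n</body>\n</html>\n"
    )
```

The resulting page can then be handed to whichever HTML-to-PDF tool is in use. Whether the PDF matches the original layout still depends on that tool's CSS support.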
Closing since the original issue has been addressed.
Could you please help me with this issue?
My code -
style_map = """
p[style-name='Section Title'] => h1:fresh
p[style-name='Subsection Title'] => h2:fresh
"""
docx_file = ".../test.doc"
html = mammoth.convert_to_markdown(docx_file, style_map=style_map)
Logs -
AttributeError Traceback (most recent call last)