mwilliamson / python-mammoth

Convert Word documents (.docx files) to HTML
BSD 2-Clause "Simplified" License

mammoth.convert_to_markdown(fileobj, **kwargs): doesn't convert tables #137

Closed junxu-ai closed 1 year ago

junxu-ai commented 1 year ago

It seems that the conversion to Markdown is not fully implemented. The code shows that it calls the HTML function.

Currently, I use mammoth to convert the docx into HTML first, and then markdownify to convert the HTML to Markdown.
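
A minimal sketch of this two-step workaround, assuming the `mammoth` and `markdownify` packages are installed. The function name `docx_to_markdown` and the `heading_style="ATX"` choice are illustrative and do not come from the thread.

```python
import sys


def docx_to_markdown(docx_file):
    """Convert an open .docx file object to Markdown via intermediate HTML."""
    # Imported lazily so this module still loads where the packages are absent.
    import mammoth
    from markdownify import markdownify

    # Step 1: mammoth produces HTML. result.value holds the markup and
    # result.messages holds any warnings raised during conversion.
    result = mammoth.convert_to_html(docx_file)
    for message in result.messages:
        print(f"mammoth {message.type}: {message.message}", file=sys.stderr)

    # Step 2: markdownify turns that HTML, tables included, into Markdown.
    # heading_style="ATX" (an illustrative choice) emits "#"-style headings.
    return markdownify(result.value, heading_style="ATX")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python convert.py path/to/file.docx
    with open(sys.argv[1], "rb") as docx_file:
        print(docx_to_markdown(docx_file))
```

Printing mammoth's messages makes it easier to tell whether a formatting problem came from the docx-to-HTML step or from the HTML-to-Markdown step.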

mwilliamson commented 1 year ago

Markdown support is deprecated, and converting first to HTML and then using something else to convert that HTML to Markdown (as you're already doing) is what's recommended.

junxu-ai commented 1 year ago

Thanks @mwilliamson.

I'm just wondering whether the additional step would introduce more formatting errors.

eigen2017 commented 1 month ago

> Markdown support is deprecated, and converting first to HTML and then using something else to convert that HTML to Markdown (as you're already doing) is what's recommended.

Yes, html2text would work.
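
A rough sketch of the html2text variant, assuming the `html2text` package is installed. The helper name `html_to_markdown` is hypothetical, and disabling line wrapping with `body_width = 0` is an illustrative choice rather than something suggested in the thread.

```python
def html_to_markdown(html):
    """Convert an HTML string (e.g. mammoth's result.value) to Markdown."""
    # Imported lazily so this module still loads where html2text is absent.
    import html2text

    converter = html2text.HTML2Text()
    # body_width = 0 turns off hard wrapping, so long lines are not broken.
    converter.body_width = 0
    return converter.handle(html)
```

The HTML string passed in would be the `value` of the result returned by `mammoth.convert_to_html`.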