Closed ChanghaoLau closed 3 months ago
I am going to leave this issue open for a bit and think about how this might be seamlessly accomplished. Until then, here’s a script that will identify tables for you.
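This is not the script mentioned above, just a minimal sketch of one way to tell tables from paragraphs while keeping document order. It assumes the body is nested as table > row > cell > paragraph and that ordinary text shows up as a 1×1 table. The names `is_real_table` and `walk_body` are hypothetical helpers, not part of docx2python, and `sample_body` is a hand-written stand-in for `docx2python(path).body`.

```python
from typing import Iterator, List, Tuple, Union

Table = List[List[List[str]]]  # rows > cells > paragraphs


def is_real_table(table: Table) -> bool:
    """Heuristic (an assumption): anything larger than 1x1 is a real table."""
    return len(table) > 1 or any(len(row) > 1 for row in table)


def walk_body(body: List[Table]) -> Iterator[Tuple[str, Union[Table, str]]]:
    """Yield ("table", table) or ("paragraph", text) in document order."""
    for table in body:
        if is_real_table(table):
            yield "table", table
        else:
            # A 1x1 "table" is treated as plain text: emit its paragraphs.
            for row in table:
                for cell in row:
                    for paragraph in cell:
                        yield "paragraph", paragraph


# Hand-written stand-in for `docx2python(path).body`.
sample_body = [
    [[["Intro paragraph."]]],
    [[["Name"], ["Qty"]], [["apple"], ["3"]]],
    [[["Closing paragraph."]]],
]
kinds = [kind for kind, _ in walk_body(sample_body)]
```

A genuine single-cell table in the document would be reported as paragraphs by this heuristic, so treat it as a starting point only.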
As of docx2python v3.0.0, tables are guaranteed to be n×m (n rows by m columns) and are straightforward to identify. See details near the top of the README file. I've also left an example of exporting tables as markdown in the tests folder. It's referenced in the README.
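For readers who just want the shape of such an export, here is a rough sketch, separate from the example in the tests folder. It assumes a table arrives as a rectangular nested list (rows > cells > paragraphs) and that the first row is the header. `table_to_markdown` is a hypothetical helper, not a docx2python function.

```python
from typing import List


def table_to_markdown(table: List[List[List[str]]]) -> str:
    """Render rows > cells > paragraphs as a pipe table; first row is the header."""

    def cell_text(paragraphs: List[str]) -> str:
        # Join a cell's paragraphs on one line and escape pipes so they
        # don't split the cell.
        parts = [p.strip().replace("\n", " ") for p in paragraphs if p.strip()]
        return "<br>".join(parts).replace("|", "\\|")

    rows = [[cell_text(cell) for cell in row] for row in table]
    if not rows:
        return ""
    header, *data = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in data]
    return "\n".join(lines)


# Hand-written sample standing in for one table from the extracted body.
sample_table = [[["Name"], ["Qty"]], [["apple"], ["3"]]]
markdown = table_to_markdown(sample_table)
```

Because the tables are rectangular, no padding of short rows is attempted here.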
I want to extract the tables in a .docx file into markdown format while maintaining the position of each table in the document. So I can't use python-docx's `document.paragraphs` and `document.tables` to handle paragraphs and tables separately (this would destroy the positional relationship between them). docx2python is very easy to use. I would like to know whether docx2python can save tables in markdown format, or whether it can separate tables, images and paragraphs in `output.body`. Thank you!