AI4WA / Docs2KG

Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models
https://docs2kg.ai4wa.com/
GNU Lesser General Public License v2.1

Support for docx? #111

Open nicklhy opened 4 weeks ago

nicklhy commented 4 weeks ago

Is your feature request related to a problem? Please describe.
docx is a common document file type, and it is much easier to parse than PDF (even for the most basic text extraction). I hope someone can add a parser for docx files.

Describe the solution you'd like
Use python-docx.
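
To illustrate why basic text extraction from docx is comparatively simple: a .docx file is a ZIP archive whose main body lives in `word/document.xml`. The sketch below uses only the Python standard library and is not part of Docs2KG; the function name `extract_docx_paragraphs` is hypothetical, and it ignores headers, footers, footnotes, and styles.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_paragraphs(source):
    """Return the text of each non-empty paragraph in a .docx file.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration only, not a Docs2KG API.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    # <w:p> is a paragraph; its visible text is split across <w:t> runs
    for para in root.iter(f"{{{W_NS}}}p"):
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

With python-docx, as suggested above, the rough equivalent would be iterating over `Document(path).paragraphs` and reading each paragraph's `.text`. A real parser would probably also need to handle tables, headings, and embedded images.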

PascalSun commented 3 weeks ago

We will add that as well; we are currently upgrading the architecture.