dkmiller / tidbits

Short notes on stuff I have recently discovered / understood

Unstructured #188

Open dkmiller opened 6 months ago

dkmiller commented 6 months ago

Learned about Unstructured from r/Python.

https://unstructured-io.github.io/unstructured/introduction.html#getting-started

pip install "unstructured[docx]"

then

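from unstructured.partition.auto import partition

# partition() detects the file type and returns a list of document elements
elements = partition("path/to/a/document.docx")

# Each element exposes its content as .text
print([e.text for e in elements])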
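
A possible next step, as a rough sketch rather than anything from the Unstructured docs: group the extracted text by the kind of element it came from. The helper names `group_text_by_type` and `summarize_document` are made up for this note, and the grouping key is just each element's Python class name, which I'm assuming is a reasonable stand-in for the element type.

```python
from collections import defaultdict


def group_text_by_type(elements):
    """Group element text by element class name, skipping blank text.

    Hypothetical helper: works on any objects that have a .text attribute.
    """
    grouped = defaultdict(list)
    for element in elements:
        text = (getattr(element, "text", "") or "").strip()
        if text:
            grouped[type(element).__name__].append(text)
    return dict(grouped)


def summarize_document(path):
    """Partition a document and group its text (hypothetical wrapper)."""
    # Imported here so group_text_by_type can be used without unstructured installed.
    from unstructured.partition.auto import partition

    return group_text_by_type(partition(path))
```

Calling `summarize_document("path/to/a/document.docx")` should give a dict mapping class names to lists of text, which is easier to skim than one flat list.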