Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
9.25k stars, 767 forks

bump `unstructured-inference` #3711

Closed by badGarnet 1 month ago

badGarnet commented 1 month ago

This PR bumps `unstructured-inference` to 0.8.0, which introduces a vectorized data structure for layout elements and text regions. It also cleans up a few places in CI that repeated the definition of env variables or were missing the installation of testing dependencies in the cache.
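For readers unfamiliar with the term, here is a minimal sketch of what a "vectorized" layout container can look like: one array holds every bounding box, so operations run over the whole page at once instead of looping over per-element objects. The `LayoutArrays` class, its fields, and the area threshold are hypothetical and for illustration only; this is not the `unstructured-inference` API.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class LayoutArrays:
    """Hypothetical struct-of-arrays container (not the unstructured-inference API)."""

    coords: np.ndarray  # shape (n, 4): x1, y1, x2, y2 for each element
    texts: np.ndarray   # shape (n,), dtype=object: text of each element

    def areas(self) -> np.ndarray:
        # Compute all box areas in one array operation.
        widths = self.coords[:, 2] - self.coords[:, 0]
        heights = self.coords[:, 3] - self.coords[:, 1]
        return widths * heights

    def filter(self, mask: np.ndarray) -> "LayoutArrays":
        # Apply the same boolean mask to every field so rows stay aligned.
        return LayoutArrays(self.coords[mask], self.texts[mask])


layout = LayoutArrays(
    coords=np.array(
        [[0, 0, 10, 10], [5, 5, 6, 6], [0, 20, 100, 40]], dtype=float
    ),
    texts=np.array(["Title", ".", "Body text"], dtype=object),
)

# Drop tiny regions with a single masked selection (threshold is arbitrary).
large = layout.filter(layout.areas() >= 50.0)
```

The design choice shown here is keeping one row per element across parallel arrays, which makes filtering and geometric checks a single masked operation rather than a Python loop.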

A few document ingest results are changed: