Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0

fix: run `libreoffice` once during wolfi image build #3183

Status: Closed (MthwRobinson closed this 3 months ago)

MthwRobinson commented 3 months ago

Summary

Closes #3105. Runs `libreoffice` once during the wolfi image build to avoid the "cold start" problem, where `libreoffice` fails on its first run. Re-enables a test that was turned off because of this problem. Thanks @micmarty-deepsense for the solution!
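To illustrate the warm-up idea, here is a minimal Python sketch of running the office binary once and discarding the result. It is not the change made in this PR, which runs during the image build. The binary name `soffice`, the throwaway conversion to PDF, and the helper names `build_warmup_command` and `warm_up` are assumptions for illustration only.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_warmup_command(binary: str, workdir: Path) -> list[str]:
    """Create a tiny input file and return a headless conversion command for it.

    Assumption: a throwaway text-to-PDF conversion is enough to trigger
    first-run initialization. The PR itself may use a different command.
    """
    sample = workdir / "warmup.txt"
    sample.write_text("warm-up\n")
    return [
        binary,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(workdir),
        str(sample),
    ]


def warm_up(binary: str = "soffice", timeout: float = 120.0) -> bool:
    """Run the office binary once and ignore its output.

    Returns True if the process exits with code 0, and False if the binary
    is missing, fails to start, times out, or exits with a non-zero code.
    """
    if shutil.which(binary) is None:
        return False
    with tempfile.TemporaryDirectory() as tmp:
        cmd = build_warmup_command(binary, Path(tmp))
        try:
            result = subprocess.run(
                cmd, capture_output=True, timeout=timeout, check=False
            )
        except (subprocess.TimeoutExpired, OSError):
            return False
        return result.returncode == 0
```

A failed warm-up is reported as `False` rather than raised, since the point of the first run is to absorb the cold-start failure so that later runs succeed.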

Testing

The re-enabled test should run as part of `test_dockerfile`.

MthwRobinson commented 3 months ago

Closing in favor of fixing upstream in https://github.com/Unstructured-IO/base-images/pull/22