Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
https://www.unstructured.io/
Apache License 2.0
7.8k stars 626 forks

build: bump amd64 image to python 3.12 #3083

Closed MthwRobinson closed 1 month ago

MthwRobinson commented 2 months ago

Summary

Closes #3051. Updates the AMD64 Docker image to use Python 3.12 instead of Python 3.11. This helps ensure that the AMD64 images based on Chainguard continue to build, because the Chainguard latest images are updated frequently and could drop Python 3.11 support at any time.

This PR swaps the base image from wolfi-base to python:latest-dev, because pycocotools could not be installed on wolfi-base with Python 3.12, though it installed fine with Python 3.11. As a follow-up, we could explore slimming down the image by building on wolfi-base with a set of system dependencies similar to those in python:latest-dev.

As part of this PR, we mount the test directories into the container at run time instead of copying them into the image in the Dockerfile.
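To illustrate the difference between mounting and copying, here is a minimal sketch of how a test run with bind-mounted directories might be assembled. It is not the command used in this PR: the image tag `unstructured:dev`, the container root `/app`, the directory name `test_unstructured`, and the helper `build_test_command` are all assumptions made for illustration. Only the `docker run --rm -v host:container` syntax is standard Docker.

```python
import shlex
from pathlib import Path


def build_test_command(image, test_dirs, container_root="/app"):
    """Build a `docker run` command that bind-mounts test directories.

    Hypothetical helper: the image tag, the paths, and the use of pytest
    are assumptions for illustration, not taken from the PR itself.
    """
    cmd = ["docker", "run", "--rm"]
    targets = []
    for test_dir in test_dirs:
        host_path = Path(test_dir).resolve()
        target = f"{container_root}/{host_path.name}"
        # -v host:container makes the directory visible at run time,
        # so the tests never become part of the image layers.
        cmd += ["-v", f"{host_path}:{target}"]
        targets.append(target)
    # Everything after the image name is the command run in the container.
    cmd += [image, "python", "-m", "pytest", *targets]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # on a machine without Docker installed.
    print(shlex.join(build_test_command("unstructured:dev", ["test_unstructured"])))
```

With this approach the image stays smaller, and edits to the tests take effect without rebuilding the image, at the cost of requiring the test directories to be present on the host at run time.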

Testing

MthwRobinson commented 1 month ago

Going to close this for now, but we can pick it back up later if wolfi-base drops Python 3.11 support.