databricks / containers

Sample base images for Databricks Container Services
Apache License 2.0

Purge the pip cache after installing torch #143

Closed. mdagost closed this 1 year ago

mdagost commented 1 year ago

The PyTorch GPU container is quite large, and building on top of it is causing timeouts on Databricks (see #142). Purging the pip cache after the torch install seems to save about 2 GB of image size. It looks like this is already done in the venv image, but not after the torch installs. This PR fixes that.
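
For readers unfamiliar with the technique, here is a minimal sketch of what purging the pip cache in a Dockerfile can look like. It is not the diff from this PR: the base image and the bare `pip install torch` line are assumptions chosen for illustration, and the repository's actual Dockerfiles use their own base images, interpreter paths, and pinned versions. The only pieces taken as given are the standard `pip cache purge` command (available in pip 20.1 and later) and the fact that Docker stores each `RUN` instruction as a separate layer.

```dockerfile
# Hypothetical base image, for illustration only (not the Databricks base image).
FROM python:3.10-slim

# Install and purge in the SAME RUN instruction. Each RUN produces its own
# layer, so purging in a later RUN would not shrink the image: the cached
# wheels would still be stored in the earlier layer.
RUN pip install torch \
    && pip cache purge
```

A related option is `pip install --no-cache-dir torch`, which avoids writing the cache at all. Either way, the point is that the downloaded wheels never end up in a committed layer.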

panchalhp-db commented 1 year ago

@mdagost the Docker images on Docker Hub have been updated with this change: https://hub.docker.com/layers/databricksruntime/gpu-pytorch/cuda11.8/images/sha256-4fd825414b78dc352602f427537cae086126ad1cf47edcf793ef196d2b958ff2 and the image size is down to 4.87 GB. Thank you again for discovering the issue and for helping fix it!