ncbi / docker

Why does the container have google-cloud-sdk in it? #23

Closed (intendo closed 3 months ago)

intendo commented 2 years ago

I can't see why the google-cloud-sdk is in the container. It is over half the size of the image. The google-cloud-sdk is over 880 Megabytes.

azat-badretdin commented 2 years ago

Thank you for your report, Darren! Would you mind illustrating that? Thanks!

intendo commented 2 years ago

Add the following line to the main Dockerfile to remove 880 Megabytes of disk space:

RUN rm -rf /usr/share/google-cloud-sdk /usr/lib/google-cloud-sdk
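
To illustrate the size claim, the space used by the two directories can be measured from a shell inside the container. The following is a minimal sketch: the helper name `total_size_mb` is hypothetical, the paths are the ones from the command above, and it assumes a `du` that supports `-sm` (as GNU coreutils does).

```shell
#!/bin/sh
# total_size_mb: hypothetical helper that sums the disk usage (in MiB)
# of the given paths, skipping any that do not exist.
total_size_mb() {
  total=0
  for path in "$@"; do
    [ -e "$path" ] || continue
    # du -sm prints "<size>\t<path>"; keep only the size column
    size=$(du -sm "$path" | cut -f1)
    total=$((total + size))
  done
  echo "$total"
}

# Paths taken from the rm -rf command in this comment
total_size_mb /usr/share/google-cloud-sdk /usr/lib/google-cloud-sdk
```

One caveat worth keeping in mind: Docker layers are additive, so a separate `RUN rm -rf` line hides the files in a new layer without shrinking the earlier layer that installed them. For the image size to drop, the removal generally has to happen in the same `RUN` instruction as the installation, or the image has to be squashed. `docker history` shows the size contributed by each layer.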

azat-badretdin commented 2 years ago

Thank you, Darren.

We will investigate why we have this in our image.

christiam commented 3 months ago

> I can't see why the google-cloud-sdk is in the container.

gsutil is needed when downloading BLAST databases from GCP if Requester Pays is enabled on a bucket. In BLAST+ 2.16.0 we made efforts to reduce the size of the ncbi/blast Docker image; this led to about a 50% reduction in size: https://hub.docker.com/r/ncbi/blast/tags
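
As a rough sketch of why gsutil is involved: for a Requester Pays bucket, the request must name a project to bill, which gsutil accepts through its top-level `-u` option. The helper below only prints the command it would run. The function name, project ID, and bucket path are hypothetical placeholders, not values taken from the ncbi/blast image.

```shell
#!/bin/sh
# requester_pays_cp_cmd: hypothetical helper that prints (does not run)
# a gsutil copy command billed to the given project.
# $1 = billing project ID, $2 = source gs:// URL, $3 = local destination
requester_pays_cp_cmd() {
  if [ "$#" -ne 3 ]; then
    echo "usage: requester_pays_cp_cmd PROJECT_ID SOURCE_URL DEST" >&2
    return 1
  fi
  # -u names the project billed for the request
  # -m enables parallel transfers
  echo "gsutil -u $1 -m cp -r $2 $3"
}

# Placeholder values for illustration only
requester_pays_cp_cmd my-billing-project gs://example-bucket/example-db ./db
```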

> The google-cloud-sdk is over 880 Megabytes.

This is a known issue: https://github.com/GoogleCloudPlatform/gsutil/issues/1732