DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Local .terraform directories consume excessive disk space #6339

Open hannes-ucsc opened 4 weeks ago

hannes-ucsc commented 4 weeks ago

To reproduce, run the following command from your local Azul repository, within the deployments directory:

$ find . -maxdepth 5 -type f -name 'terraform-provider*' -exec du -h {} \;
 54M    ./prod/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-google_v2.20.3_x4
 22M    ./prod/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./prod/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
272M    ./prod/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-aws_v4.3.0_x5
 70M    ./prod/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./prod/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./prod/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
272M    ./prod/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-aws_v4.3.0_x5
 70M    ./anvilbox/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./anvilbox/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./anvilbox/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
299M    ./anvilbox/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-aws_v4.30.0_x5
 14M    ./anvildev.gitlab/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-external_v2.2.0_x5
 70M    ./anvildev.gitlab/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./anvildev.gitlab/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./anvildev.gitlab/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
299M    ./anvildev.gitlab/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-aws_v4.30.0_x5
 70M    ./dev.shared/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./dev.shared/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./dev.shared/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
272M    ./dev.shared/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-aws_v4.3.0_x5
 14M    ./prod.shared/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-external_v2.2.0_x5
 70M    ./prod.shared/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./prod.shared/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./prod.shared/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
299M    ./prod.shared/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-aws_v4.30.0_x5
 70M    ./anvildev/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./anvildev/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./anvildev/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
299M    ./anvildev/.terraform.platform-anvil-dev/plugins/darwin_amd64/terraform-provider-aws_v4.30.0_x5
 14M    ./sandbox/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-external_v2.2.0_x5
 70M    ./sandbox/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./sandbox/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./sandbox/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
299M    ./sandbox/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-aws_v4.30.0_x5
 70M    ./dev/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./dev/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./dev/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
272M    ./dev/.terraform.platform-sc/plugins/darwin_amd64/terraform-provider-aws_v4.3.0_x5
 14M    ./prod.gitlab/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-external_v2.2.0_x5
 70M    ./prod.gitlab/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-google_v3.90.1_x5
 22M    ./prod.gitlab/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4
 21M    ./prod.gitlab/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-template_v2.2.0_x4
299M    ./prod.gitlab/.terraform.platform-hca-prod/plugins/darwin_amd64/terraform-provider-aws_v4.30.0_x5

This list adds up to 4052MB.

In some instances where a deployment has used different provider versions, its .terraform directories contain a copy of each of those versions, including versions it may only have used in the past.
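To put a number on the duplication, the listing above can be grouped by provider file name. The following is only a sketch: the function name provider_duplicates is hypothetical, and it assumes that identically named provider binaries are in fact identical copies. For each provider file name it prints the number of copies, their combined size in KiB as reported by du, and the name.

```shell
# Summarize Terraform provider binaries found below a directory.
# Output: one line per file name, "<copies> <total KiB> <name>",
# sorted so that the most duplicated providers come first.
provider_duplicates() {
    find "${1:-.}" -maxdepth 5 -type f -name 'terraform-provider*' \
        -exec du -k {} + |
        awk -F'\t' '{
            # du prints "<KiB><TAB><path>"; the file name is the last path component
            n = split($2, parts, "/")
            name = parts[n]
            copies[name]++
            kib[name] += $1
        }
        END {
            for (name in copies) print copies[name], kib[name], name
        }' |
        sort -k1,1nr -k3,3
}
```

Run from the deployments directory as `provider_duplicates .`; any line whose first column is greater than 1 is a provider version that exists in more than one place.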

achave11-ucsc commented 4 weeks ago

Assignee to provide reproduction, ideally including evidence for the amount of duplication in each directory.

achave11-ucsc commented 3 weeks ago

Assignee to consider next steps.

hannes-ucsc commented 11 hours ago

We should set TF_PLUGIN_CACHE_DIR to ${project_root}/.terraform/cache/plugins. Note that the directory has to exist for Terraform to use it, so the commit must include a regular, empty file at ${project_root}/.terraform/cache/plugins/.gitkeep. Running make clean from the project root should not delete the contents of ${project_root}/.terraform/cache/plugins. The PR author should be able to achieve this simply by mentioning that directory in the top-level .gitignore file, in the right place.
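As a rough sketch of the first two requirements (the environment variable and the pre-existing directory), something like the following could work. The function name setup_tf_plugin_cache and the way the project root is passed in are assumptions made for illustration, not how Azul's environment scripts are actually organized. The .gitignore entry is left out, since its correct placement depends on the existing file.

```shell
# Sketch only: point Terraform's plugin cache at a directory inside the clone.
# The function name and its argument are hypothetical.
setup_tf_plugin_cache() {
    project_root=${1:?usage: setup_tf_plugin_cache PROJECT_ROOT}
    cache_dir="$project_root/.terraform/cache/plugins"
    # Terraform does not create the cache directory itself, so it must exist.
    mkdir -p "$cache_dir"
    # Empty placeholder file so that Git can track the otherwise empty directory
    touch "$cache_dir/.gitkeep"
    export TF_PLUGIN_CACHE_DIR="$cache_dir"
}
```

With the variable exported, a subsequent terraform init in any deployment of that clone should reuse provider binaries from the shared cache directory instead of downloading a separate copy for each deployment.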

For developers with more than one clone of Azul, this solution would still waste space by keeping redundant copies of plugins, but I like the idea of complete separation between Azul clones. In a perfect world, all clones could share the TF plugins, but bugs in TF's plugin cache handling could interfere with, say, the ability to concurrently run make deploy from two different clones. Keeping the cache within the project root is consistent with how we manage Python virtual envs, which are also kept there, though at this time we won't provide a Makefile target to manage the TF plugin cache.