Closed arththebird closed 4 months ago
I'm wondering why the CI pipeline didn't run on this PR? I'm reasonably sure the fix is fine but I'd feel a lot better if the CI jobs had run.
Hi @bobh66, we have an issue with the multiple runner label selection from yesterday in reusable workflows. It should be fixed after the merge of https://github.com/upbound/official-providers-ci/pull/209
@arththebird thanks a ton for this contribution, just amazing! 🥇
@arththebird thanks a ton for this contribution, just amazing! 🥇
My pleasure, thank you for the fast review and merge 🙌
Not sure how the release cycle for this provider is setup, I'm wondering when can we expect this to be released officially?
@arththebird I had an internal chat with @turkenf, we target the release for this Thursday 👍
Installed the latest release. provider-terraform is still deleting the workspace and re-initing it on every reconciliation. I don't have any directories under the "/tf" inside the provider pod with the modification date older than 1 hour.
@project-administrator do the provider logs indicate that terraform init is being run every reconciliation? Is it possible that go-getter is updating the timestamp of the directory on every reconciliation loop?
@bobh66 checked that and verified that checksums never match:
2024-07-11T18:11:14.815Z DEBUG provider-terraform Checksums don't match so run terraform init: {"request": "demo-dev", "old": "ccbd114f2f61a213c57f3fc023222b3d", "new": "4812c08f3f95a371c9c2e5a6cdbaf9e1"}
So now it's the checksum calculation taking smth into account that is not there.
I wonder if it might be either the crossplane-provider-config.tf
which is generated by the Crossplane's providerconfigs.tf.upbound.io resource, or few files which are downloaded by the terraform run?..
I can see that GenerateChecksum uses the following command:
/usr/bin/find . -path ./.git -prune -o -path ./.terraform -prune -o -type f -exec /usr/bin/md5sum {} +
With our config it always picks up more files than needed:
19e9bd65283e39a75bd15d532eeb4a3a ./.git-credentials
c115cb3d433d8fc4304272d851cc4694 ./crossplane-provider-config.tf
d1b0e16cfb7f8fa5004dc6494c0bbc73 ./.terraform/terraform.tfstate
60715e2ecfcf0d2885edd52f0a506cf5 ./.terraform/environment
9a11bb6c5701d7300a3c6ed8c33dc6dd ./.terraform.lock.hcl
0507597169bd025f95803b0d1713f943 ./rds-combined-ca-bundle.pem
84a4bf3e807ebcd712ed17a7a9fd576b ./rds-combined-ca-bundle.crt
None of these are present in the remote git repo...
I don't have any directories under the "/tf" inside the provider pod with the modification date older than 1 hour.
I see the same thing, but I think @bobh66 theory about go-getter updating the timestamp makes sense because if I check the .terraform
directory timestamp (which is not in the remote git repo), it doesn't get updated. My understanding is that deleting this folder is what was causing the checksum diff on every loop in the previous version.
These files should be ignored when checksum is calculated. But they are being taken into consideration and this breaks the checksum calculation:
19e9bd65283e39a75bd15d532eeb4a3a ./.git-credentials
c115cb3d433d8fc4304272d851cc4694 ./crossplane-provider-config.tf
d1b0e16cfb7f8fa5004dc6494c0bbc73 ./.terraform/terraform.tfstate
60715e2ecfcf0d2885edd52f0a506cf5 ./.terraform/environment
9a11bb6c5701d7300a3c6ed8c33dc6dd ./.terraform.lock.hcl
Also, does it make sense to calculate the checksum of the URL address only for the remote repository module source? Why do we need to fetch files each time to calculate the checksum?
I mean, if the URL has not changed, then it must be the same TF workspace source, therefore there is no need to run terraform init
if it has been run once before.
Also, does it make sense to calculate the checksum of the URL address only for the remote repository module source? Why do we need to fetch files each time to calculate the checksum? I mean, if the URL has not changed, then it must be the same TF workspace source, therefore there is no need to run
terraform init
if it has been run once before.
We need to pull the content of the repo (using git fetch/pull) to determine if the content of the repo has changed. We can't assume that the repo is static.
We need to pull the content of the repo (using git fetch/pull) to determine if the content of the repo has changed. We can't assume that the repo is static.
Do you think it's a good idea to introduce an optional parameter to control how the checksum is calculated? (repo URL vs repo contents). For example, we're always using the remote repo git URL with a tag. So there can't be any change unless the tag is updated, and that would be a different URL (as the tag is a part of the URL)
@project-administrator it's an interesting point - with the existing implementation if the remote URL changes it will now git pull
the new repo into the existing repo directory, having removed only the .git
directory. We would need to store the current remote URL somewhere (it is in .git/config
) and compare it with the current spec.forProvider.module
. If they are different then the entire directory should be removed so a clean git pull
can occur.
Since we don't require the use of a ref
it's entirely possible for people to use a repo that has changes being pushed into it. We have no way of knowing that happens so we need to run the equivalent of git fetch
and git pull
on every reconciliation.
I wonder if a source of RemoteOnce
or something similar would be appropriate - it says to pull the remote content one time and never pull it again (assuming that it exists in the container). The existing Remote
value is the equivalent of RemoteAlways
- it's basically a pullPolicy
for the remote repo.
Though that still doesn't help with the remote URL changing - that we would still need to check for explicitly.
@ytsarev - any thoughts?
@bobh66 I like the idea. We can create something like remoteSourcePullPolicy
symmetric to the well-known https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy
Thanks @ytsarev - IfNotPresent
and Always
seem to make the most sense. I'm not sure that Never
applies to our use case. Are there any other possible scenarios? I'm wondering what happens if the module
changes when source
is Remote
but the pull policy is IfNotPresent
- I assume we automatically assume the new module
content is not present and pull it?
Also should we save the current Remote module
value in status
just to make it easier to compare with the current spec.module
so we know if it changed without needing to read .git/config
? It seems like that might be a more robust solution.
Description of your changes
Stop removing the Workspace directory on every reconciliation. This is no longer needed as the bug in
go-getter
was fixed in v1.7.5 andgo-getter
was upgraded to latest by Renovate in this PR.Not removing the workspace directory also results in
terraform init
not being run every time when there's no change because the checksum is the same. This effectively allow usingmax-reconcile-rate > 0
while using the plugin cache (issues can still happen if a workspace is applying while another one is being init, but it happens much less frequently than before when there's no changes).Fixes #275 Fixes #230
I have:
make reviewable
to ensure this PR is ready for review.How has this code been tested
make local-deploy
.terraform
directory in the workspace directory and notice that it doesn't get deleted on every reconciliation