Open viniciusdc opened 2 months ago
I think we could add a consistent version in tf_objects.py instead of in the terraform files directly. That would enforce the same version in all of the stages.
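To make the suggestion concrete, here is a minimal sketch of what a single source of truth for provider versions could look like. It does not reflect how tf_objects.py is actually structured: `PROVIDER_VERSIONS` and `required_providers` are hypothetical names, and the version numbers are placeholders. The only thing it relies on is that Terraform's JSON configuration syntax accepts a `terraform.required_providers` block.

```python
import json

# Hypothetical single source of truth. The version numbers are placeholders,
# not the versions Nebari actually pins.
PROVIDER_VERSIONS = {
    "aws": {"source": "hashicorp/aws", "version": "5.33.0"},
    "kubernetes": {"source": "hashicorp/kubernetes", "version": "2.35.1"},
    "helm": {"source": "hashicorp/helm", "version": "2.1.2"},
}


def required_providers(*names):
    """Build a Terraform JSON `required_providers` block for the given providers."""
    unknown = [name for name in names if name not in PROVIDER_VERSIONS]
    if unknown:
        # Fail early so a stage cannot silently fall back to an unpinned provider.
        raise KeyError(f"no pinned version for provider(s): {unknown}")
    return {
        "terraform": {
            # Copy each entry so callers cannot mutate the shared table.
            "required_providers": {name: dict(PROVIDER_VERSIONS[name]) for name in names}
        }
    }


# Every stage that asks for "aws" gets the same pinned version.
stage_block = required_providers("aws", "kubernetes")
print(json.dumps(stage_block, indent=2))
```

Each stage would then request the providers it needs by name, so changing a version in one place changes it for every stage.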
I'm also curious what issues you've seen from this. It seems like it shouldn't cause a problem to use different provider versions in different stages.
I haven't noticed any issue directly, but keeping this inconsistency opens the door to bugs that would be difficult to track down. For example, one version of the provider might handle certain API requests in a particular order while a newer version does not (the same goes for error messages), or the versions might have different internal requirements such as region, zones, etc.
It seems that because the stages are isolated from each other (isolated Terraform modules), differing provider versions should be okay. That said, I think we should try to keep the versions consistent between the stages, but I'm not sure I would support enforcing it (e.g. for plugins), at least not until we see an issue.
@smokestacklightnin will be picking up this issue
@smokestacklightnin, here's a bit more context to help bring you up to speed on this issue.
Nebari has several Terraform stages, which are run sequentially because some require the output of others as input. For each stage, we have one or more versions.tf files where Terraform provider versions are defined. For example: https://github.com/nebari-dev/nebari/blob/9b1310b33e89c2c11c3b39128ec792ca80342486/src/_nebari/stages/infrastructure/template/aws/versions.tf
It also seems we are setting providers in other files, for example: https://github.com/nebari-dev/nebari/blob/9b1310b33e89c2c11c3b39128ec792ca80342486/src/_nebari/stages/terraform_state/template/aws/main.tf#L23-L31
We need to ensure consistency across all stages by using the same provider versions. For now, I suggest we avoid updating to the latest available versions and instead stick to the most up-to-date versions among the ones we’re currently using.
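As a rough way to find "the most up-to-date versions among the ones we're currently using", something like the sketch below could scan the contents of the stage files. It is only an illustration: the regex is a crude stand-in for a real HCL parser, constraint operators such as `~>` are ignored when comparing, and the file names and version numbers in the example are made up.

```python
import re
from collections import defaultdict

# Matches `name = { ... version = "x.y.z" ... }` entries such as those inside
# required_providers. Only handles simple, non-nested entries.
PROVIDER_RE = re.compile(r'([\w-]+)\s*=\s*\{[^{}]*?\bversion\s*=\s*"([^"]+)"')


def version_key(constraint):
    """Turn '5.33.0' or '~> 5.33' into a comparable tuple of ints (operators ignored)."""
    return tuple(int(part) for part in re.findall(r"\d+", constraint))


def collect_versions(files):
    """Map provider name -> set of version strings found across all file contents."""
    found = defaultdict(set)
    for text in files.values():
        for name, version in PROVIDER_RE.findall(text):
            found[name].add(version)
    return dict(found)


def newest_in_use(files):
    """Pick, per provider, the highest version that already appears somewhere."""
    return {
        name: max(versions, key=version_key)
        for name, versions in collect_versions(files).items()
    }


# Made-up file contents for illustration only.
example_files = {
    "stage_a/versions.tf": '''
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.9.0"
    }
  }
  required_version = ">= 1.0"
}
''',
    "stage_b/main.tf": '''
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.33.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.20.0"
    }
  }
}
''',
}

print(newest_in_use(example_files))
```

Versions are compared as integer tuples rather than strings, because a plain string comparison would rank "5.9.0" above "5.33.0".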
Describe the bug
We must be more consistent in Terraform provider versions across different deployment stages. This discrepancy can lead to unpredictable behavior and potential issues during deployment. For example, on a recent AWS deployment, I noticed the following in deployment logs from Terraform:
Stage 01 -- Terraform State:
Stage 02:
Stage 03:
We do set the version for the most important infrastructure resources: https://github.com/nebari-dev/nebari/blob/a65ff53df9c7cdfa4bf1b99b9099f7d5efa1240d/src/_nebari/stages/infrastructure/template/aws/versions.tf#L1-L9
However, the other stages use terraform.Provider to instantiate the providers across the deployment: https://github.com/nebari-dev/nebari/blob/a65ff53df9c7cdfa4bf1b99b9099f7d5efa1240d/src/_nebari/stages/terraform_state/__init__.py#L181-L186
We should make this consistent. Interestingly, after stage 3 the versions are consistent across all calls; I guess that comes from the backend already being set up.
Expected behavior
At a minimum, the cloud provider versions used in every stage should match the versions pinned in the corresponding infrastructure modules.
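One way to check this expectation without hard-enforcing it would be a small drift report that could run in a test or in CI. The sketch below assumes the per-stage provider versions have already been gathered into a plain dictionary; `find_drift`, the stage names, and the version numbers are all hypothetical.

```python
def find_drift(stage_versions):
    """Return {provider: {version: [stages]}} for providers pinned at more than one version."""
    by_provider = {}
    for stage, providers in stage_versions.items():
        for name, version in providers.items():
            by_provider.setdefault(name, {}).setdefault(version, []).append(stage)
    # Keep only providers that appear with two or more distinct versions.
    return {name: versions for name, versions in by_provider.items() if len(versions) > 1}


# Made-up data for illustration only.
stage_versions = {
    "01-terraform-state": {"aws": "5.12.0"},
    "02-infrastructure": {"aws": "5.33.0", "kubernetes": "2.20.0"},
    "03-kubernetes-initialize": {"kubernetes": "2.20.0"},
}

drift = find_drift(stage_versions)
for provider, versions in drift.items():
    print(f"{provider} is pinned inconsistently: {versions}")
```

Returning a report instead of raising leaves the decision to warn or fail with the caller, which fits the concern above about not enforcing this for plugins yet.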
OS and architecture in which you are running Nebari
Linux
How to Reproduce the problem?
Any cloud provider deployment might lead to the same problem.
Command output
No response
Versions and dependencies used.
No response
Compute environment
AWS
Integrations
No response
Anything else?
No response