crossplane-contrib / provider-upjet-aws

Official AWS Provider for Crossplane by Upbound.
https://marketplace.upbound.io/providers/upbound/provider-aws
Apache License 2.0

[Bug]: The old family Provider and ProviderRevision left when manually installed #1452

Open pierluigilenoci opened 1 month ago

pierluigilenoci commented 1 month ago

Is there an existing issue for this?

Affected Resource(s)

No resources affected

Resource MRs required to reproduce the bug

No resources needed

Steps to Reproduce

To reproduce the problem:

  1. Install the providers WITHOUT family provider
  2. Once the automatic family provider appears, try to install the family provider manually
  3. Find a way to make the automatic family Provider and ProviderRevision manifests disappear

What happened?

I expected the automatically installed version to disappear once the family provider was installed manually, but both the old family Provider and its ProviderRevision remain.

Of course, everything works on a clean cluster. I am looking for a way to switch from automatic to manual family provider without reinstalling Crossplane in a cluster where there are already MRs used in production.

This is a follow-up bug concerning #1088

More details in the Slack discussion: https://crossplane.slack.com/archives/C05E0UE46S2/p1722852504359609

Relevant Error Output Snippet

kubectl get providers.pkg.crossplane.io
NAME                          INSTALLED   HEALTHY   PACKAGE                                                          AGE
provider-aws-cloudwatchlogs   True        True      xpkg.upbound.io/upbound/provider-aws-cloudwatchlogs:v1.6.0       283d
provider-aws-dynamodb         True        True      xpkg.upbound.io/upbound/provider-aws-dynamodb:v1.6.0             109d
provider-aws-ec2              True        True      xpkg.upbound.io/upbound/provider-aws-ec2:v1.6.0                  389d
provider-aws-elasticache      True        True      xpkg.upbound.io/upbound/provider-aws-elasticache:v1.6.0          389d
provider-aws-iam              True        True      xpkg.upbound.io/upbound/provider-aws-iam:v1.6.0                  389d
provider-aws-mq               True        True      xpkg.upbound.io/upbound/provider-aws-mq:v1.6.0                   389d
provider-aws-rds              True        True      xpkg.upbound.io/upbound/provider-aws-rds:v1.6.0                  389d
provider-aws-s3               True        True      xpkg.upbound.io/upbound/provider-aws-s3:v1.6.0                   389d
provider-family-aws           True        True      xpkg.upbound.io/upbound/provider-family-aws:v1.6.0               5d1h
provider-kubernetes           True        True      xpkg.upbound.io/crossplane-contrib/provider-kubernetes:v0.14.0   301d
provider-sql                  True        True      xpkg.upbound.io/crossplane-contrib/provider-sql:v0.9.0           287d
provider-terraform            True        True      xpkg.upbound.io/upbound/provider-terraform:v0.16.0               389d
upbound-provider-family-aws   True        False     xpkg.upbound.io/upbound/provider-family-aws:v1.10.0              3d20h


Crossplane Version

v1.16.0-up.1

Provider Version

1.6.0

Kubernetes Version

v1.27.15

Kubernetes Distribution

EKS

Additional Info

This is a follow-up bug concerning #1088

haarchri commented 1 month ago

Did you try the following? It is based on https://docs.upbound.io/providers/migration/#migrating-from-monolithic-to-family-official-providers, which comes down to the same procedure.

Please try it in a test cluster first ;)

Set Revision Activation Policy to Manual:

apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: manual-provider-family-aws
spec:
  package: xpkg.upbound.io/upbound/provider-family-aws:v1.10.0
  revisionActivationPolicy: Manual

Verify Provider Installation and Health Status:

Confirm that the "manual family provider" is INSTALLED: False and HEALTHY: True. Run the following command to check the status:

kubectl get providers
NAME                         INSTALLED   HEALTHY   PACKAGE                                               AGE
manual-provider-family-aws   False       True      xpkg.upbound.io/upbound/provider-family-aws:v1.10.0

Delete the Automatic Family Provider: kubectl delete provider.pkg upbound-provider-family-aws

After removing the automatic provider, update the revisionActivationPolicy for the manual-provider-family-aws from Manual to Automatic. This change will allow the provider to automatically manage its resources as needed.
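As a sketch of that last step, this is the same manifest as in the first step with only the activation policy changed. The name and package are taken from the manifest above; nothing else is assumed.

```yaml
# Same Provider as in the first step; only the activation policy differs.
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: manual-provider-family-aws
spec:
  package: xpkg.upbound.io/upbound/provider-family-aws:v1.10.0
  # Switch back to Automatic once the automatically created provider is deleted.
  revisionActivationPolicy: Automatic
```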

darioef commented 1 month ago

Same problem here.

I tried @haarchri's suggestion, but the manual provider never reaches the HEALTHY: True state because it reports that the automatic one still exists.

status:
  conditions:
    - lastTransitionTime: '2024-08-09T12:18:03Z'
      message: >-
        cannot resolve package dependencies: cannot initialize dependency graph
        from the packages in the lock: node
        xpkg.upbound.io/upbound/provider-family-aws already exists
      reason: UnknownPackageRevisionHealth
      status: Unknown
      type: Healthy

I need to manually install the provider-family-aws because I realized that I'm running it on version v0.38.0, while the other providers (S3, Route53, etc.) are on version v1.0.0. Honestly, I don't know what happened, but it seems that even if I update the version of the Upbound AWS Providers image, provider-family-aws still remains on the old version and continues to install automatically with that version.

haarchri commented 1 month ago

Can you remove your Lock resource (remove its finalizer)? And can you send the Provider and ProviderRevision?

darioef commented 1 month ago

You're the man. It worked!

So, the steps are the same as https://github.com/crossplane-contrib/provider-upjet-aws/issues/1452#issuecomment-2275922781 but you need to remove the Lock resource after you install the manual-provider-family-aws.

Thanks for your help.

turkenh commented 1 month ago

The complication here originated from deploying the same provider package twice (one already existing as a dependency and another installed manually). In the issue description, both provider-family-aws and upbound-provider-family-aws try to deploy xpkg.upbound.io/upbound/provider-family-aws under the hood and conflict with each other.

Ideally, if you have the family provider already deployed as a dependency and you want to change something, e.g. configure a DeploymentRuntimeConfig, the way to go is to edit or patch the already existing provider instead of deploying a separate Provider object with a different name. In the scenario here, the provider named upbound-provider-family-aws could have been modified with spec.runtimeConfigRef in the first place.
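A minimal sketch of that approach, assuming a DeploymentRuntimeConfig already exists under the hypothetical name custom-runtime-config. The provider name and package tag are the ones shown in the provider listing in the issue description.

```yaml
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  # Name of the provider that was created automatically as a dependency.
  name: upbound-provider-family-aws
spec:
  package: xpkg.upbound.io/upbound/provider-family-aws:v1.10.0
  runtimeConfigRef:
    # Hypothetical name; point this at your own DeploymentRuntimeConfig.
    name: custom-runtime-config
```

Because the object keeps its existing name, applying it modifies the provider in place rather than adding a second Provider for the same package.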

darioef commented 1 month ago

Thanks for your explanation, in fact, another reason why I want to manually install the provider-family-aws is that when it's installed via a Provider dependency (e.g., the S3 provider), it doesn't respect the configured DeploymentRuntimeConfig and instead applies the default one. I need to use my RuntimeConfig because I've set tolerations and other deployment settings there.
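For illustration, a DeploymentRuntimeConfig carrying tolerations might look roughly like the sketch below. The resource name and the toleration values are hypothetical placeholders, not settings taken from this thread.

```yaml
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: custom-runtime-config   # hypothetical name
spec:
  deploymentTemplate:
    spec:
      selector: {}
      template:
        spec:
          containers:
            # Container name Crossplane uses for the provider runtime.
            - name: package-runtime
          tolerations:
            # Hypothetical toleration; replace with the taints on your nodes.
            - key: dedicated
              operator: Equal
              value: crossplane
              effect: NoSchedule
```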

I think this comment talks about the same problem: https://github.com/crossplane-contrib/provider-upjet-aws/issues/1088#issuecomment-2211198483

pierluigilenoci commented 3 weeks ago

@haarchri, thank you a lot.

I managed to clean the clusters, but not without some difficulty, because the suggestion did not work 100%.

The complete list of operations needed was:

Often, this was enough. Sometimes, I had to do it twice to clean the cluster.

@turkenh, this is not a complication but a plausible scenario.

Anyone who initially installs the AWS Family provider but then realizes, for whatever reason, that they need to assign a DeploymentRuntimeConfig to the automatically created provider ends up in precisely this situation.

Manually editing the provider is not a plausible solution in a fully GitOps approach, so a more integrated solution makes profound sense.

Furthermore, users should be able to choose their provider's name without impositions or hard-coded names.

This bug is far from being solved because what we have is just a workaround, not a fix for the problem.