kubernetes-sigs / cluster-api-provider-azure

Cluster API implementation for Microsoft Azure
https://capz.sigs.k8s.io/
Apache License 2.0

Add some sanity check for controller credentials #1238

Closed · invidian closed this issue 6 months ago

invidian commented 3 years ago

/kind feature

Describe the solution you'd like

Right now, if one deploys CAPZ using Tilt and forgets to specify some credentials via environment variables, the CAPZ controller runs as usual until you try deploying a cluster, which then fails.

It would be nice to perform some kind of sanity check on the credentials and either reject cluster operations or crash if the credentials are missing or wrong. This would provide a better experience for people deploying and developing CAPZ.
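A minimal pre-flight check along these lines could run before `tilt up`; this is only a sketch, and the variable names are assumed from this discussion rather than taken from the actual Tilt configuration:

```shell
# Sketch: fail fast when a credential env var is unset or empty.
# The variable names are assumptions, not verified against the
# actual CAPZ Tilt setup.
check_azure_creds() {
  for var in AZURE_SUBSCRIPTION_ID AZURE_TENANT_ID AZURE_CLIENT_ID AZURE_CLIENT_SECRET; do
    # Indirect expansion via eval keeps this POSIX sh compatible.
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "error: $var is not set" >&2
      return 1
    fi
  done
}
```

Running `check_azure_creds || exit 1` at the top of a bootstrap script would surface the missing variable immediately instead of at cluster creation time.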


devigned commented 3 years ago

The CAPZ controller does not require credentials. In fact, the controller credentials will be deprecated in a future version.

Each cluster should have its own identity reference. Please take a look at https://capz.sigs.k8s.io/topics/multitenancy.html. wdyt?

invidian commented 3 years ago

Each cluster should have its own identity reference. Please take a look at https://capz.sigs.k8s.io/topics/multitenancy.html. wdyt?

IMO this makes it even more important to do sanity checks, if it's potentially up to the user to provide the credentials. Is there anything like this in place?

devigned commented 3 years ago

What would you think about validating the cluster by checking the env for creds and, if they're missing, requiring the cluster to have an identity ref?

@nader-ziada wdyt?

nader-ziada commented 3 years ago

The identityRef is part of the workload cluster definition, so that check would happen at creation time of that cluster, which would not help with the problem described in this issue.

The problem with checking the credentials at the time of creating the management cluster is that it would require an extra call to Azure to validate their values, an extra step that some would argue is not needed. On the other hand, just checking that some arbitrary values exist would not necessarily be valuable.

devigned commented 3 years ago

The identityRef is part of the workload cluster definition, so that check would happen at creation time of that cluster, which would not help with the problem described in this issue.

I was just saying we check for a valid set of env vars, e.g. that AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET are not "" when a workload cluster is being created. If there isn't a valid set of env vars and no identityRef, reject the cluster.

I agree we would have a tough time figuring out if creds are valid, but we have an easy time figuring out if they are present. If we know they are not present, then we know that a workload cluster without an identityRef will not be able to be deployed.

I think that will be a common error case until identityRef is the only way.

nader-ziada commented 3 years ago

I think that will be a common error case until identityRef is the only way.

yes, makes sense

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

invidian commented 3 years ago

/remove-lifecycle stale

fiunchinho commented 3 years ago

wouldn't this function be validating the credentials, as requested?

invidian commented 3 years ago

wouldn't this function be validating the credentials, as requested?

That's a decent short-term solution, though I think something built into the controller itself would be nice as well.

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

invidian commented 3 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

invidian commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

shysank commented 2 years ago

/remove-lifecycle stale

@invidian we already validate the existence of the AZURE_CLIENT_ID and AZURE_CLIENT_SECRET env vars. Additionally, CAPZ expects AZURE_CLIENT_SECRET to be set as a Kubernetes Secret as well. Do you think checking that would improve the experience?

invidian commented 2 years ago

@shysank is it validated by the controller itself or only via Tilt/clusterctl?

shysank commented 2 years ago

Validation is (will be) done on the client (Tilt); the controller will return an error if it can't find the secret.

invidian commented 2 years ago

Hmm, based on my recent development work, I still think the situation could be improved. Right now, if one passes wrong credentials, for example ones with no access to the subscription at all, and tries to create a cluster, the cluster cannot even be removed. This is quite annoying, as one has to remove the finalizers by hand to get rid of the objects.
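For reference, the usual escape hatch for that wedged state is clearing the finalizers by hand; a sketch, where the resource name and namespace are hypothetical placeholders:

```shell
# Sketch: clear finalizers on a stuck AzureCluster so deletion can
# proceed. "my-cluster" and "default" are placeholder values.
patch='{"metadata":{"finalizers":null}}'
kubectl patch azurecluster my-cluster -n default --type merge -p "$patch"
```

Emptying `metadata.finalizers` lets the API server delete the object without waiting for the controller's cleanup, so any Azure resources already created are left behind and must be removed separately.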

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

jackfrancis commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

invidian commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

invidian commented 1 year ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

invidian commented 1 year ago

/remove-lifecycle stale

k8s-triage-robot commented 9 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 8 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

nawazkh commented 8 months ago

/remove-lifecycle rotten

k8s-triage-robot commented 6 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 6 months ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/1238#issuecomment-2094302714):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
>
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
>
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.