Closed: eli-kasa closed this issue 1 year ago
Hi @eli-kasa. One of the most common errors is that the `Secret` is not labeled:

```shell
kubectl label secret mongodb-atlas-operator-api-key atlas.mongodb.com/type=credentials -n mongodb-atlas-system
```

(https://www.mongodb.com/docs/atlas/reference/atlas-operator/ak8so-quick-start/#create-a-secret-with-your-api-keys-and-organization-id)

Another reason could be that you installed the operator scoped only to the mongodb-atlas-system namespace. Please check the `WATCH_NAMESPACE` env variable in the operator `Deployment` resource.
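For reference, both of those can be checked with standard kubectl commands (the resource names below match the quick-start defaults; adjust for your install):

```shell
# Show the labels on the API-key Secret; the operator only picks it up
# when atlas.mongodb.com/type=credentials is present.
kubectl get secret mongodb-atlas-operator-api-key \
  -n mongodb-atlas-system \
  -o jsonpath='{.metadata.labels}'

# Print the env block of the operator Deployment to see whether
# WATCH_NAMESPACE is set at all.
kubectl get deployment mongodb-atlas-operator \
  -n mongodb-atlas-system \
  -o jsonpath='{.spec.template.spec.containers[0].env}'
```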
@igor-karpukhin Thanks, but yes, the operator's credential Secret is properly labeled, as are all of the password ref Secrets.
The operator Deployment has these env vars:

```yaml
env:
  - name: OPERATOR_POD_NAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: metadata.name
  - name: OPERATOR_NAMESPACE
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: metadata.namespace
```
`WATCH_NAMESPACE` is not defined. I'm also having trouble finding documentation on this parameter, either here on GitHub or on the official MongoDB documentation site... I can find a similar variable for the Enterprise Kubernetes Operator... does that documentation apply to the Atlas Operator as well? And if I set `WATCH_NAMESPACE` to `*`, will that address this issue? Not setting it already appears to have the behavior of watching everything, at least on create events...
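For experimenting with this, `kubectl set env` is a quick way to add or remove the variable on the running Deployment. Whether the Atlas Operator actually honors a `*` value is exactly the open question above; this only shows the mechanics:

```shell
# Set WATCH_NAMESPACE on the operator Deployment (triggers a rollout).
# Whether "*" is interpreted as "watch everything" is an assumption here.
kubectl set env deployment/mongodb-atlas-operator \
  -n mongodb-atlas-system \
  WATCH_NAMESPACE='*'

# Remove the variable again (trailing "-" unsets it).
kubectl set env deployment/mongodb-atlas-operator \
  -n mongodb-atlas-system \
  WATCH_NAMESPACE-
```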
For hopefully more clarity, as stated in the issue: everything creates fine IF you create the `AtlasDatabaseUser` after the `AtlasDeployment`, regardless of namespace. Since the operator is not scoped to a single namespace by `WATCH_NAMESPACE`, and has a cluster role + binding that grants what appears to be needed, it will create the corresponding `{project}-{cluster}-{user}` secret with connection strings for the deployment(s) in the same project, in the associated namespaces. What doesn't happen: if there is an existing `AtlasDatabaseUser` defined in a namespace other than where the operator is deployed, the operator will not reconcile/create those secrets for users in the other namespaces (unless you create the users AFTER the deployment).
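A quick way to see which connection-string secrets the operator has generated across namespaces is to list and filter by name, assuming the `{project}-{cluster}-{user}` naming described above (the `project-` prefix below is a placeholder for your actual project name):

```shell
# List secrets in all namespaces whose names match the operator's
# {project}-{cluster}-{user} pattern; "project-" is a placeholder.
kubectl get secrets --all-namespaces | grep 'project-'
```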
Also, I'm using Helm to deploy. Here is the operator's Deployment manifest:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: '1'
    meta.helm.sh/release-name: atlas-operator
    meta.helm.sh/release-namespace: mongodb-atlas-system
  creationTimestamp: '2023-02-23T06:15:14Z'
  generation: 1
  labels:
    app.kubernetes.io/instance: atlas-operator
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: mongodb-atlas-operator
    app.kubernetes.io/version: 1.6.1
    helm.sh/chart: mongodb-atlas-operator-1.6.1
  name: mongodb-atlas-operator
  namespace: mongodb-atlas-system
  resourceVersion: '8506866'
  uid: {a-guid}
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: atlas-operator
      app.kubernetes.io/name: mongodb-atlas-operator
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: atlas-operator
        app.kubernetes.io/name: mongodb-atlas-operator
    spec:
      containers:
        - args:
            - --atlas-domain=https://cloud.mongodb.com/
            - --health-probe-bind-address=:8081
            - --metrics-bind-address=:8080
            - --leader-elect
          command:
            - /manager
          env:
            - name: OPERATOR_POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: OPERATOR_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          image: mongodb/mongodb-atlas-kubernetes-operator:1.6.1
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8081
              scheme: HTTP
            initialDelaySeconds: 15
            periodSeconds: 20
            successThreshold: 1
            timeoutSeconds: 1
          name: manager
          ports:
            - containerPort: 80
              name: http
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /readyz
              port: 8081
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            limits:
              cpu: 500m
              memory: 256Mi
            requests:
              cpu: 100m
              memory: 50Mi
          securityContext:
            allowPrivilegeEscalation: false
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        runAsNonRoot: true
        runAsUser: 2000
      serviceAccount: mongodb-atlas-operator
      serviceAccountName: mongodb-atlas-operator
      terminationGracePeriodSeconds: 10
status:
  availableReplicas: 1
  conditions:
    - lastTransitionTime: '2023-02-23T06:15:14Z'
      lastUpdateTime: '2023-02-23T06:15:35Z'
      message: ReplicaSet "mongodb-atlas-operator-694759b8fc" has successfully progressed.
      reason: NewReplicaSetAvailable
      status: 'True'
      type: Progressing
    - lastTransitionTime: '2023-03-15T17:02:40Z'
      lastUpdateTime: '2023-03-15T17:02:40Z'
      message: Deployment has minimum availability.
      reason: MinimumReplicasAvailable
      status: 'True'
      type: Available
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
```
As a workaround I've resorted to deleting and re-creating the `AtlasDatabaseUser`, which then does generate the secrets (for both deployments) as expected, in the `default` namespace...
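That workaround can be done with two commands; the resource name `user1` and the manifest path `user1.yaml` below are placeholders for the actual user resource:

```shell
# Delete the stuck AtlasDatabaseUser, then re-apply its manifest so the
# operator reconciles it from scratch and regenerates the secrets.
kubectl delete atlasdatabaseuser user1 -n default
kubectl apply -f user1.yaml -n default
```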
Hi @eli-kasa,
Thank you for your feedback and extensive report. We were able to reproduce the bug and we are working on a fix, which should come in the 1.8.0 release.
**What did you do to encounter the bug?**

1. `AtlasProject` (namespace: mongodb-atlas-system)
2. `AtlasDeployment` "dep1" (namespace: mongodb-atlas-system)
3. `Secret` password Ref and `AtlasDatabaseUser` user1 (namespace: default)
4. `Secret` project-dep1-user1 exists (namespace: default)
5. `Secret` password Ref and `AtlasDatabaseUser` user2 (namespace: mongodb-atlas-system)
6. `Secret` project-dep2-user2 exists (namespace: mongodb-atlas-system)
7. `AtlasDeployment` "dep2" (namespace: mongodb-atlas-system)
8. `Secret` project-dep2-user2 exists (namespace: mongodb-atlas-system)
9. `Secret` project-dep2-user1 (namespace: default) does not exist

**What did you expect?**

A `Secret` project-dep2-user1 (namespace: default) to be created.

**What happened instead?**

When the new AtlasDeployment was created, the existing AtlasDatabaseUser in the same namespace had a credentials + connection string Secret created, but AtlasDatabaseUser resources in other namespaces did not.
**Operator Information**

**Kubernetes Cluster Information**

**Additional context**

Of note, the operator doesn't seem to log any information about the `Secret` reconciliation that occurred for `AtlasDatabaseUser` user2 in the same namespace.
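For checking what the operator did log around that reconciliation, the standard log query works; the `user2` filter below matches the example user from the repro steps:

```shell
# Tail recent operator logs and filter for entries mentioning the
# database user; adjust the pattern to your actual resource names.
kubectl logs deployment/mongodb-atlas-operator \
  -n mongodb-atlas-system --since=1h | grep -iE 'user2|databaseuser'
```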