opensearch-project / opensearch-k8s-operator

OpenSearch Kubernetes Operator
Apache License 2.0

Configuration options for securityconfig-update pod (e.g. nodeSelector) #559

Open danielhoult opened 1 year ago

danielhoult commented 1 year ago

Hi Team,

When trying to deploy OpenSearch to a k8s cluster with multiple node pools, the default securityconfig-update pod is unable to start when it is scheduled on a Windows node pool. The nodeSelector option is available for the bootstrap, dashboard, and nodes sections, but not for the securityconfig-update pod.

To work around the issue I updated the CRDs:

`opensearch-operator/api/v1/opensearch_types.go`:

```go
// new config section for the securityconfig
type SecurityUpdateConfig struct {
    NodeSelector map[string]string `json:"nodeSelector,omitempty"`
}

// add to ClusterSpec
type ClusterSpec struct {
    ...
    // new SecurityUpdate spec to support nodeSelector
    SecurityUpdate SecurityUpdateConfig `json:"securityUpdate,omitempty"`
    ...
}
```
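To illustrate what the new field would look like from the user's side, here is a minimal, self-contained sketch of how the JSON tags map a `securityUpdate.nodeSelector` section onto the struct. The cut-down `ClusterSpec` and the `parseSpec` helper are stand-ins for illustration only and are not part of the operator; `kubernetes.io/os: linux` is just one plausible selector for keeping the pod off Windows nodes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SecurityUpdateConfig mirrors the struct proposed above.
type SecurityUpdateConfig struct {
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
}

// ClusterSpec is a cut-down stand-in containing only the new field;
// the real ClusterSpec has many more fields.
type ClusterSpec struct {
	SecurityUpdate SecurityUpdateConfig `json:"securityUpdate,omitempty"`
}

// parseSpec is a hypothetical helper that decodes a JSON-encoded spec fragment.
func parseSpec(raw string) (ClusterSpec, error) {
	var spec ClusterSpec
	err := json.Unmarshal([]byte(raw), &spec)
	return spec, err
}

func main() {
	spec, err := parseSpec(`{"securityUpdate":{"nodeSelector":{"kubernetes.io/os":"linux"}}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(spec.SecurityUpdate.NodeSelector["kubernetes.io/os"])

	// With no securityUpdate section the map stays nil, which places
	// no node selector constraint on the pod.
	empty, err := parseSpec(`{}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(empty.SecurityUpdate.NodeSelector == nil)
}
```

Because the field is optional, existing cluster definitions without a `securityUpdate` section would decode to a nil selector and behave as before.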

Edit `NewSecurityconfigUpdateJob` in `opensearch-operator/pkg/builders/cluster.go` to add `NodeSelector` to the Job definition:

```go
...
return batchv1.Job{
    ObjectMeta: metav1.ObjectMeta{Name: jobName, Namespace: namespace, Annotations: annotations},
    Spec: batchv1.JobSpec{
        BackoffLimit: &backoffLimit,
        Template: corev1.PodTemplateSpec{
            ObjectMeta: metav1.ObjectMeta{Name: jobName},
            Spec: corev1.PodSpec{
                TerminationGracePeriodSeconds: &terminationGracePeriodSeconds,
                Containers: []corev1.Container{{
                    Name:            "updater",
                    Image:           image.GetImage(),
                    ImagePullPolicy: image.GetImagePullPolicy(),
                    Command:         []string{"/bin/bash", "-c"},
                    Args:            []string{cmdArg},
                    VolumeMounts:    volumeMounts,
                    SecurityContext: securityContext,
                }},
                ServiceAccountName: instance.Spec.General.ServiceAccount,
                Volumes:            volumes,
                RestartPolicy:      corev1.RestartPolicyNever,
                // new: schedule the job pod using the selector from the new spec section
                NodeSelector:       instance.Spec.SecurityUpdate.NodeSelector,
                ImagePullSecrets:   image.ImagePullSecrets,
                SecurityContext:    podSecurityContext,
            },
        },
    },
}
```

I contemplated trying to use the NodeSelector from one of the other config sections, but it didn't feel right.

I was then able to build the code and Docker image, push the image to my private container registry, and have the cluster use this image when deploying the OpenSearch cluster. I do not have experience with Go, so it would take me more time to understand how to write a test and submit a PR, but I wanted to share the solution that I have so far.

Thanks,
Daniel

swoehrl-mw commented 1 year ago

Hi @danielhoult. Thanks for reporting your solution. If you ever feel up to it, PRs are always welcome. Otherwise this issue will have to wait until someone else picks it up.

kinseii commented 1 year ago

I also need nodeSelector and tolerations to be assignable to the securityconfig jobs.

ArDark-Flant commented 1 year ago

Same here, it would be nice to have this. Then we could use it in clusters where there are no nodes without taints.
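For reference, the tolerations request from the comments could be sketched as an extension of the config struct proposed in the issue. This is only an assumption about how it might look: the `Tolerations` field, the local `Toleration` and `PodSpec` stand-in types (used here so the example compiles without `k8s.io/api`), the `applyScheduling` helper, and the `dedicated=opensearch` taint are all hypothetical and not taken from the operator.

```go
package main

import "fmt"

// Toleration is a hypothetical stand-in with a subset of the fields of
// corev1.Toleration, so that this sketch has no external dependencies.
type Toleration struct {
	Key      string `json:"key,omitempty"`
	Operator string `json:"operator,omitempty"`
	Value    string `json:"value,omitempty"`
	Effect   string `json:"effect,omitempty"`
}

// SecurityUpdateConfig extends the struct from the issue with an
// assumed Tolerations field.
type SecurityUpdateConfig struct {
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
	Tolerations  []Toleration      `json:"tolerations,omitempty"`
}

// PodSpec is a stand-in holding only the two scheduling fields.
type PodSpec struct {
	NodeSelector map[string]string
	Tolerations  []Toleration
}

// applyScheduling copies the scheduling settings onto the pod spec.
func applyScheduling(cfg SecurityUpdateConfig, pod *PodSpec) {
	pod.NodeSelector = cfg.NodeSelector
	pod.Tolerations = cfg.Tolerations
}

func main() {
	cfg := SecurityUpdateConfig{
		NodeSelector: map[string]string{"kubernetes.io/os": "linux"},
		// Example taint key and value; purely illustrative.
		Tolerations: []Toleration{
			{Key: "dedicated", Operator: "Equal", Value: "opensearch", Effect: "NoSchedule"},
		},
	}
	var pod PodSpec
	applyScheduling(cfg, &pod)
	fmt.Println(len(pod.Tolerations), pod.NodeSelector["kubernetes.io/os"])
}
```

In the operator itself the same idea would presumably use `[]corev1.Toleration` and set the `Tolerations` field of the Job's `corev1.PodSpec` alongside `NodeSelector`.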