Open mgerba opened 2 years ago
Can you check this is not related to #218? If not we will label this as a bug.
Update with new bug and solution.
The use case is to run a secured cluster with OIDC authentication. #218 enables secure clustering, but adding OIDC causes errors because of the changes to the authorizers.xml file.
The first issue is the numbering scheme. NiFi requires unique property names: "Initial User Identity 1", "Initial User Identity 2", etc.
The existing statefulset.yaml file creates an authorizers.xml file with duplicate names, which crashes the NiFi nodes when more than one node is added to the cluster.
authorizers.xml output section:
<userGroupProvider>
<identifier>file-user-group-provider</identifier>
<class>org.apache.nifi.authorization.FileUserGroupProvider</class>
<property name="Users File">./conf/users.xml</property>
<property name="Legacy Authorized Users File"/>
<property name="Initial User Identity 1">nifi@example.com</property>
<property name="Initial User Identity 0">OU=NIFI, CN=nifi-1.nifi-headless.default.svc.cluster.local</property>
<property name="Initial User Identity 1">OU=NIFI, CN=nifi-2.nifi-headless.default.svc.cluster.local</property>
<property name="Initial User Identity 2">OU=NIFI, CN=nifi-3.nifi-headless.default.svc.cluster.local</property>
</userGroupProvider>
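The duplicate name in the output above can be confirmed mechanically. The following is a minimal Python sketch, not part of the chart: the helper name `duplicate_property_names` is hypothetical, and the sample is a trimmed copy of the output wrapped in an `authorizers` root element so that it parses on its own.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the output above, wrapped in a root element so it parses alone.
SAMPLE = """<authorizers>
  <userGroupProvider>
    <identifier>file-user-group-provider</identifier>
    <property name="Users File">./conf/users.xml</property>
    <property name="Initial User Identity 1">nifi@example.com</property>
    <property name="Initial User Identity 0">OU=NIFI, CN=nifi-1.nifi-headless.default.svc.cluster.local</property>
    <property name="Initial User Identity 1">OU=NIFI, CN=nifi-2.nifi-headless.default.svc.cluster.local</property>
    <property name="Initial User Identity 2">OU=NIFI, CN=nifi-3.nifi-headless.default.svc.cluster.local</property>
  </userGroupProvider>
</authorizers>"""


def duplicate_property_names(xml_text):
    """Return property names that occur more than once within a single provider."""
    root = ET.fromstring(xml_text)
    duplicates = []
    for provider in root:
        # Count how often each property name appears in this provider element.
        counts = Counter(p.get("name") for p in provider.findall("property"))
        duplicates.extend(name for name, n in counts.items() if n > 1)
    return sorted(duplicates)


print(duplicate_property_names(SAMPLE))
```

Running a check like this against the rendered file before starting NiFi would flag the clash between the OIDC admin entry and the second node entry.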
To solve this I made a change to statefulset.yaml on line 187, adding 2 to the index:
-- --value "Initial User Identity {{ . }}" \
++ --value "Initial User Identity {{ add 2 . }}" \
{{- if .Values.certManager.enabled }}
xmlstarlet ed --inplace --delete "authorizers/accessPolicyProvider/property[@name='Node Identity 1']" "${NIFI_HOME}/conf/authorizers.xml"
{{ range untilStep 0 (int .Values.replicaCount) 1 }}
xmlstarlet ed --inplace \
--subnode "authorizers/accessPolicyProvider" --type 'elem' -n 'property' \
--value "OU=NIFI, CN={{ template "apache-nifi.fullname" $ }}-{{ . }}.{{ template "apache-nifi.fullname" $ }}-headless.{{ $.Release.Namespace }}.svc.{{ $.Values.certManager.clusterDomain }}" \
--insert "authorizers/accessPolicyProvider/property[not(@name)]" --type attr -n name \
--value "Node Identity {{ . }}" \
"${NIFI_HOME}/conf/authorizers.xml"
xmlstarlet ed --inplace \
--subnode "authorizers/userGroupProvider" --type 'elem' -n 'property' \
--value "OU=NIFI, CN={{ template "apache-nifi.fullname" $ }}-{{ . }}.{{ template "apache-nifi.fullname" $ }}-headless.{{ $.Release.Namespace }}.svc.{{ $.Values.certManager.clusterDomain }}" \
--insert "authorizers/userGroupProvider/property[not(@name)]" --type attr -n name \
--value "Initial User Identity {{ add 2 . }}" \
"${NIFI_HOME}/conf/authorizers.xml"
{{/* range untilStep 0 (int .Values.replicaCount ) 1 */}}{{ end }}
This changed the authorizers.xml to:
<userGroupProvider>
<identifier>file-user-group-provider</identifier>
<class>org.apache.nifi.authorization.FileUserGroupProvider</class>
<property name="Users File">./conf/users.xml</property>
<property name="Legacy Authorized Users File"/>
<property name="Initial User Identity 1">nifi@example.com</property>
<property name="Initial User Identity 2">OU=NIFI, CN=nifi-1.nifi-headless.default.svc.cluster.local</property>
<property name="Initial User Identity 3">OU=NIFI, CN=nifi-2.nifi-headless.default.svc.cluster.local</property>
<property name="Initial User Identity 4">OU=NIFI, CN=nifi-3.nifi-headless.default.svc.cluster.local</property>
</userGroupProvider>
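The effect of the offset can be sketched in a few lines of Python. This is an illustration of the numbering only, under the assumption that the OIDC admin always occupies "Initial User Identity 1"; the function name `initial_user_identities` and the placeholder node strings are hypothetical.

```python
def initial_user_identities(oidc_admin, node_identities, offset=2):
    """Build the property-name -> identity mapping.

    The OIDC admin is placed at index 1; node i is placed at index i + offset.
    A dict is used so that a name collision silently overwrites an entry,
    which makes a bad offset visible as a shorter mapping.
    """
    props = {"Initial User Identity 1": oidc_admin}
    for i, identity in enumerate(node_identities):
        props[f"Initial User Identity {i + offset}"] = identity
    return props


nodes = ["node-0", "node-1", "node-2"]  # placeholders for the node DNs

fixed = initial_user_identities("nifi@example.com", nodes)
broken = initial_user_identities("nifi@example.com", nodes, offset=0)

print(len(fixed))   # four distinct names: admin plus three nodes
print(len(broken))  # one name collides, so only three entries remain
```

With `offset=0` the second node reuses the admin's property name, which mirrors the duplicate seen in the first output.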
This caused a different issue: the OIDC login showed an untrusted proxy error.
The error on the NiFi UI:
Untrusted proxy CN=nifi-0.nifi-headless.default.svc.cluster.local, OU=NIFI
The error in the NiFi user log:
NiFiAuthenticationFilter Rejecting access to web api: Untrusted proxy CN=nifi-0.nifi-headless.default.svc.cluster.local, OU=NIFI
This happens because the order of CN and OU in authorizers.xml is reversed relative to the identity the node presents. To remediate that I modified the statefulset.yaml file again (lines 179 and 185):
-- --value "OU=NIFI, CN={{ template "apache-nifi.fullname" $ }}-{{ . }}.{{ template "apache-nifi.fullname" $ }}-headless.{{ $.Release.Namespace }}.svc.{{ $.Values.certManager.clusterDomain }}" \
++ --value "CN={{ template "apache-nifi.fullname" $ }}-{{ . }}.{{ template "apache-nifi.fullname" $ }}-headless.{{ $.Release.Namespace }}.svc.{{ $.Values.certManager.clusterDomain }}, OU=NIFI" \
{{- if .Values.certManager.enabled }}
xmlstarlet ed --inplace --delete "authorizers/accessPolicyProvider/property[@name='Node Identity 1']" "${NIFI_HOME}/conf/authorizers.xml"
{{ range untilStep 0 (int .Values.replicaCount) 1 }}
xmlstarlet ed --inplace \
--subnode "authorizers/accessPolicyProvider" --type 'elem' -n 'property' \
--value "CN={{ template "apache-nifi.fullname" $ }}-{{ . }}.{{ template "apache-nifi.fullname" $ }}-headless.{{ $.Release.Namespace }}.svc.{{ $.Values.certManager.clusterDomain }}, OU=NIFI" \
--insert "authorizers/accessPolicyProvider/property[not(@name)]" --type attr -n name \
--value "Node Identity {{ . }}" \
"${NIFI_HOME}/conf/authorizers.xml"
xmlstarlet ed --inplace \
--subnode "authorizers/userGroupProvider" --type 'elem' -n 'property' \
--value "CN={{ template "apache-nifi.fullname" $ }}-{{ . }}.{{ template "apache-nifi.fullname" $ }}-headless.{{ $.Release.Namespace }}.svc.{{ $.Values.certManager.clusterDomain }}, OU=NIFI" \
--insert "authorizers/userGroupProvider/property[not(@name)]" --type attr -n name \
--value "Initial User Identity {{ add 2 . }}" \
"${NIFI_HOME}/conf/authorizers.xml"
{{/* range untilStep 0 (int .Values.replicaCount ) 1 */}}{{ end }}
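Why the order matters can be shown with a small Python sketch. It assumes the configured identity is compared to the presented one as a plain string, which is consistent with the "Untrusted proxy" error reported here but is not taken from the NiFi source; `reverse_rdn_order` is a hypothetical helper and its naive split does not handle escaped commas inside values.

```python
def reverse_rdn_order(dn):
    """Reverse the comma-separated components of a simple DN string."""
    parts = [part.strip() for part in dn.split(",")]
    return ", ".join(reversed(parts))


# Identity as shown in the error message.
presented = "CN=nifi-0.nifi-headless.default.svc.cluster.local, OU=NIFI"
# Identity as written by the original template.
configured = "OU=NIFI, CN=nifi-0.nifi-headless.default.svc.cluster.local"

print(presented == configured)                     # strings differ
print(reverse_rdn_order(configured) == presented)  # match after reordering
```

The two strings name the same components but are not equal character for character, so only the reordered value matches.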
The new authorizers.xml output is:
-->
<userGroupProvider>
<identifier>file-user-group-provider</identifier>
<class>org.apache.nifi.authorization.FileUserGroupProvider</class>
<property name="Users File">./conf/users.xml</property>
<property name="Legacy Authorized Users File"/>
<property name="Initial User Identity 1">nifi@example.com</property>
<property name="Initial User Identity 2">CN=nifi-0.nifi-headless.default.svc.cluster.local, OU=NIFI</property>
<property name="Initial User Identity 3">CN=nifi-1.nifi-headless.default.svc.cluster.local, OU=NIFI</property>
<property name="Initial User Identity 4">CN=nifi-2.nifi-headless.default.svc.cluster.local, OU=NIFI</property>
</userGroupProvider>
<!--
With the two changes to the statefulset.yaml file and the values file below, I was able to deploy a 3-node NiFi cluster secured with Keycloak authentication, using ingress for each service and cert-manager for the NiFi certificates, and successfully log in. I hope this helps.
Values.yaml config:
---
# Number of nifi nodes
replicaCount: 3
## Set default image, imageTag, and imagePullPolicy.
## ref: https://hub.docker.com/r/apache/nifi/
##
image:
repository: apache/nifi
tag: "1.14.0"
pullPolicy: "IfNotPresent"
## Optionally specify an imagePullSecret.
## Secret must be manually created in the namespace.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
##
# pullSecret: myRegistrKeySecretName
securityContext:
runAsUser: 1000
fsGroup: 1000
## @param useHostNetwork - boolean - optional
## Bind ports on the hostNetwork. Useful for CNI networking where hostPort might
## not be supported. The ports need to be available on all hosts. It can be
## used for custom metrics instead of a service endpoint.
##
## WARNING: Make sure that hosts using this are properly firewalled otherwise
## metrics and traces are accepted from any host able to connect to this host.
#
sts:
# Parallel podManagementPolicy for faster bootstrap and teardown. Default is OrderedReady.
podManagementPolicy: Parallel
AntiAffinity: soft
useHostNetwork: null
hostPort: null
pod:
annotations:
security.alpha.kubernetes.io/sysctls: net.ipv4.ip_local_port_range=10000 65000
#prometheus.io/scrape: "true"
serviceAccount:
create: false
#name: nifi
annotations: {}
hostAliases: []
# - ip: "1.2.3.4"
# hostnames:
# - example.com
# - example
## Useful if using any custom secrets
## Pass in some secrets to use (if required)
# secrets:
# - name: myNifiSecret
# keys:
# - key1
# - key2
# mountPath: /opt/nifi/secret
## Useful if using any custom configmaps
## Pass in some configmaps to use (if required)
# configmaps:
# - name: myNifiConf
# keys:
# - myconf.conf
# mountPath: /opt/nifi/custom-config
properties:
# use externalSecure for when inbound SSL is provided by nginx-ingress or other external mechanism
sensitiveKey: changeMechangeMe # Must be at least 12 characters long
algorithm: NIFI_PBKDF2_AES_GCM_256
externalSecure: true
isNode: true
httpsPort: 8443
webProxyHost: <target.url.com> # <clusterIP>:<NodePort> (If Nifi service is NodePort or LoadBalancer)
clusterPort: 6007
provenanceStorage: "8 GB"
siteToSite:
port: 10000
# use properties.safetyValve to pass explicit 'key: value' pairs that overwrite other configuration
safetyValve:
#nifi.variable.registry.properties: "${NIFI_HOME}/example1.properties, ${NIFI_HOME}/example2.properties"
nifi.web.http.network.interface.default: eth0
# listen to loopback interface so "kubectl port-forward ..." works
nifi.web.http.network.interface.lo: lo
## Include additional processors
# customLibPath: "/opt/configuration_resources/custom_lib"
## Include additional libraries in the Nifi containers by using the postStart handler
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/
# postStart: /opt/nifi/psql; wget -P /opt/nifi/psql https://jdbc.postgresql.org/download/postgresql-42.2.6.jar
# Nifi User Authentication
auth:
admin: CN=admin, OU=NIFI
SSL:
keystorePasswd: changeMe
truststorePasswd: changeMe
# Automatically disabled if OIDC or LDAP enabled
singleUser:
username: username
password: changemechangeme # Must have at least 12 characters
ldap:
enabled: false
host: #ldap://<hostname>:<port>
searchBase: #CN=Users,DC=ldap,DC=example,DC=be
admin: #cn=admin,dc=ldap,dc=example,dc=be
pass: #ChangeMe
searchFilter: (objectClass=*)
userIdentityAttribute: cn
authStrategy: SIMPLE # How the connection to the LDAP server is authenticated. Possible values are ANONYMOUS, SIMPLE, LDAPS, or START_TLS.
identityStrategy: USE_DN
authExpiration: 12 hours
oidc:
enabled: true
discoveryUrl: http://<keycloak.target.url>/auth/realms/nifi/.well-known/openid-configuration
clientId: nifi
clientSecret: <client-secret>
admin: nifi@example.com
claimIdentifyingUser: email
## Request additional scopes, for example profile
additionalScopes:
openldap:
enabled: false
persistence:
enabled: true
env:
LDAP_ORGANISATION: # name of your organization e.g. "Example"
LDAP_DOMAIN: # your domain e.g. "ldap.example.be"
LDAP_BACKEND: "hdb"
LDAP_TLS: "true"
LDAP_TLS_ENFORCE: "false"
LDAP_REMOVE_CONFIG_AFTER_SETUP: "false"
adminPassword: #ChangeMe
configPassword: #ChangeMe
customLdifFiles:
1-default-users.ldif: |-
# You can find an example ldif file at https://github.com/cetic/fadi/blob/master/examples/basic/example.ldif
## Expose the nifi service to be accessed from outside the cluster (LoadBalancer service).
## or access it from within the cluster (ClusterIP service). Set the service type and the port to serve it.
## ref: http://kubernetes.io/docs/user-guide/services/
##
# headless service
headless:
type: ClusterIP
annotations:
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
# ui service
service:
type: ClusterIP
httpsPort: 8443
# nodePort: 30236
annotations: {}
# loadBalancerIP:
## Load Balancer sources
## https://kubernetes.io/docs/tasks/access-application-cluster/configure-cloud-provider-firewall/#restrict-access-for-loadbalancer-service
##
# loadBalancerSourceRanges:
# - 10.10.10.0/24
## OIDC authentication requires "sticky" session on the LoadBalancer for JWT to work properly...but AWS doesn't like it on creation
# sessionAffinity: ClientIP
# sessionAffinityConfig:
# clientIP:
# timeoutSeconds: 10800
# Enables additional port/ports to nifi service for internal processors
processors:
enabled: false
ports:
- name: processor01
port: 7001
targetPort: 7001
#nodePort: 30701
- name: processor02
port: 7002
targetPort: 7002
#nodePort: 30702
## Configure Ingress based on the documentation here: https://kubernetes.io/docs/concepts/services-networking/ingress/
##
ingress:
enabled: true
annotations:
ingress.kubernetes.io/ssl-redirect: "true"
kubernetes.io/tls-acme: "true"
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/affinity: cookie
nginx.ingress.kubernetes.io/upstream-vhost: "localhost:8443"
nginx.ingress.kubernetes.io/proxy-redirect-from: "https://localhost:8443"
nginx.ingress.kubernetes.io/proxy-redirect-to: "https://<target.url.com>"
nginx.ingress.kubernetes.io/backend-protocol: HTTPS
tls:
- hosts:
- <target.url.com>
secretName: nifi-secret
hosts:
- <target.url.com>
path: /
# If you want to change the default path, see this issue https://github.com/cetic/helm-nifi/issues/22
# Amount of memory to give the NiFi java heap
jvmMemory: 2g
# Separate image for tailing each log separately and checking zookeeper connectivity
sidecar:
image: busybox
tag: "1.32.0"
imagePullPolicy: "IfNotPresent"
## Enable persistence using Persistent Volume Claims
## ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
##
persistence:
enabled: false
# When creating persistent storage, the NiFi helm chart can either reference an already-defined
# storage class by name, such as "standard" or can define a custom storage class by specifying
# customStorageClass: true and providing the "storageClass", "storageProvisioner" and "storageType".
# For example, to use SSD storage on Google Compute Engine see values-gcp.yaml
#
# To use a storage class that already exists on the Kubernetes cluster, we can simply reference it by name.
# For example:
# storageClass: standard
#
# The default storage class is used if this variable is not set.
accessModes: [ReadWriteOnce]
## Storage Capacities for persistent volumes
configStorage:
size: 100Mi
authconfStorage:
size: 100Mi
# Storage capacity for the 'data' directory, which is used to hold things such as the flow.xml.gz, configuration, state, etc.
dataStorage:
size: 1Gi
# Storage capacity for the FlowFile repository
flowfileRepoStorage:
size: 10Gi
# Storage capacity for the Content repository
contentRepoStorage:
size: 10Gi
# Storage capacity for the Provenance repository. When changing this, one should also change the properties.provenanceStorage value above, also.
provenanceRepoStorage:
size: 10Gi
# Storage capacity for nifi logs
logStorage:
size: 5Gi
## Configure resource requests and limits
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
##
resources: {}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
logresources:
requests:
cpu: 10m
memory: 10Mi
limits:
cpu: 50m
memory: 50Mi
## Enables setting your own affinity. Mutually exclusive with sts.AntiAffinity
## You need to set the value of sts.AntiAffinity other than "soft" and "hard"
affinity: {}
nodeSelector: {}
tolerations: []
initContainers: {}
# foo-init: # <- will be used as container name
# image: "busybox:1.30.1"
# imagePullPolicy: "IfNotPresent"
# command: ['sh', '-c', 'echo this is an initContainer']
# volumeMounts:
# - mountPath: /tmp/foo
# name: foo
extraVolumeMounts: []
extraVolumes: []
## Extra containers
extraContainers: []
terminationGracePeriodSeconds: 30
## Extra environment variables that will be pass onto deployment pods
env: []
## Extra environment variables from secrets and config maps
envFrom: []
# envFrom:
# - configMapRef:
# name: config-name
# - secretRef:
# name: mysecret
## Openshift support
## Use the following variables in order to enable Route and Security Context Constraint creation
openshift:
scc:
enabled: false
route:
enabled: false
#host: www.test.com
#path: /nifi
# ca server details
# Setting this true would create a nifi-toolkit based ca server
# The ca server will be used to generate self-signed certificates required setting up secured cluster
ca:
## If true, enable the nifi-toolkit certificate authority
enabled: false
persistence:
enabled: true
server: ""
service:
port: 9090
token: sixteenCharacters
admin:
cn: admin
serviceAccount:
create: false
#name: nifi-ca
openshift:
scc:
enabled: false
# cert-manager support
# Setting this true will have cert-manager create a private CA for the cluster
# as well as the certificates for each cluster node.
# Note that https://github.com/spoditor/spoditor is required!
certManager:
enabled: true
clusterDomain: cluster.local
keystorePasswd: changeme
truststorePasswd: changeme
additionalDnsNames:
- nifi.kube.cssp.io
refreshSeconds: 300
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 100m
memory: 128Mi
# ------------------------------------------------------------------------------
# Zookeeper:
# ------------------------------------------------------------------------------
zookeeper:
## If true, install the Zookeeper chart
## ref: https://github.com/bitnami/charts/blob/master/bitnami/zookeeper/values.yaml
enabled: true
## If the Zookeeper Chart is disabled a URL and port are required to connect
url: ""
port: 2181
replicaCount: 3
# ------------------------------------------------------------------------------
# Nifi registry:
# ------------------------------------------------------------------------------
registry:
## If true, install the Nifi registry
enabled: true
url: ""
port: 80
## Add values for the nifi-registry here
## ref: https://github.com/dysnix/charts/blob/master/nifi-registry/values.yaml
flowProvider:
postgres:
enabled: false
# Configure metrics
metrics:
prometheus:
# Enable Prometheus metrics
enabled: false
# Port used to expose Prometheus metrics
port: 9092
serviceMonitor:
# Enable deployment of Prometheus Operator ServiceMonitor resource
enabled: false
# namespace: monitoring
# Additional labels for the ServiceMonitor
labels: {}
@wknickless could you please check this before the PR #218 review?
@mgerba wrote: "With the two changes to the statefulset.yaml file and the values file below, I was able to deploy a NiFi 3 node cluster secured with Keycloak authentication using ingress for each service and cert manager for the nifi certificates and successfully log in."
@zakaria2905 that's exactly what the #218 GitHub workflow test named "OIDC (cluster, Ingress, cert-manager local issuer)" confirms as working. We also have #218 running in our dev cluster (3 nodes), test cluster (8 nodes), and prod cluster (5 nodes).
I have been reading the current comments and I am not finding an answer to my problem. When the CA is set to enabled, the config has no references to pull in the certificates from the CA by default. Is there a reason this doesn't exist? I appreciate the framework you have created. The NiFi nodes continue to generate their own certificates and keystores/truststores.
2022/01/11 16:11:38 INFO [main] org.apache.nifi.toolkit.tls.standalone.TlsToolkitStandaloneCommandLine: Using /opt/nifi/nifi-current/conf/nifi.properties as template.
2022/01/11 16:11:39 INFO [main] org.apache.nifi.toolkit.tls.standalone.TlsToolkitStandalone: Running standalone certificate generation with output directory /opt/nifi/nifi-current/conf
2022/01/11 16:11:39 INFO [main] org.apache.nifi.toolkit.tls.standalone.TlsToolkitStandalone: Generated new CA certificate /opt/nifi/nifi-current/conf/nifi-cert.pem and key /opt/nifi/nifi-current/conf/nifi-key.key
2022/01/11 16:11:39 INFO [main] org.apache.nifi.toolkit.tls.standalone.TlsToolkitStandalone: Writing new ssl configuration to /opt/nifi/nifi-current/conf/nifi-2-nifi-0.nifi-2-nifi-headless.default.svc.cluster.local
2022/01/11 16:11:40 INFO [main] org.apache.nifi.toolkit.tls.standalone.TlsToolkitStandalone: Successfully generated TLS configuration for nifi-2-nifi-0.nifi-2-nifi-headless.default.svc.cluster.local 1 in /opt/nifi/nifi-current/conf/nifi-2-nifi-0.nifi-2-nifi-headless.default.svc.cluster.local
2022/01/11 16:11:40 INFO [main] org.apache.nifi.toolkit.tls.standalone.TlsToolkitStandalone: Generating new client certificate /opt/nifi/nifi-current/conf/CN=admin_OU=NIFI.p12
2022/01/11 16:11:40 INFO [main] org.apache.nifi.toolkit.tls.standalone.TlsToolkitStandalone: Successfully generated client certificate /opt/nifi/nifi-current/conf/CN=admin_OU=NIFI.p12
2022/01/11 16:11:40 INFO [main] org.apache.nifi.toolkit.tls.standalone.TlsToolkitStandalone: tls-toolkit standalone completed successfully
The nodes attempt to cluster but receive a certificate trust error:
2022-01-11 16:13:30,421 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message due to: javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
There are no references to the enabled CA pod for pulling the certificates that it generates. Is there any interest in building that into the statefulset.yaml, or in adding a configurable option to pull the cluster certificates into the truststore/keystore using openssl?
If I am misunderstanding and there are options in the default values file that I am missing for enabling secured cluster trust, please let me know.
My basic values.yaml: