Orange-OpenSource / nifikop

The NiFiKop NiFi Kubernetes operator makes it easy to run Apache NiFi on Kubernetes. Apache NiFi is a free, open-source solution that supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
https://orange-opensource.github.io/nifikop/
Apache License 2.0

Plain secure cluster setup not working #26

Closed Arttii closed 4 years ago

Arttii commented 4 years ago

Type of question

Are you asking about community best practices, how to implement a specific feature, or about general context and help around nifikop ?

Question

What did you do?

I deployed NiFiKop and cert-manager, then applied the following NifiCluster resource:

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: sslnifi
  namespace: usecase
spec:
  service:
    headlessEnabled: true
  zkAddresse: "zookeeper.usecase:2181"
  zkPath: "/ssllnifi"
  clusterImage: "apache/nifi:1.11.4"
  clusterSecure: true
  siteToSiteSecure: true
  oneNifiNodePerNode: false
  initialAdminUser: nifi-admin@reply.de
  propagateLabels: true
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  readOnlyConfig:
    # NifiProperties configuration that will be applied to the node.
    nifiProperties:
      webProxyHosts:
        - some-url:8443
      # Additional nifi.properties configuration that will override the one produced based
      # on template and configurations.
      overrideConfigs: |
        nifi.security.user.oidc.discovery.url=https://keycloak.url/auth/realms/dapc/.well-known/openid-configuration
        nifi.security.user.oidc.client.id=nifi
        nifi.security.user.oidc.client.secret=token

        nifi.security.identity.mapping.pattern.dn=CN=([^,]*)(?:, (?:O|OU)=.*)?
        nifi.security.identity.mapping.value.dn=$1
        nifi.security.identity.mapping.transform.dn=NONE
  nodeConfigGroups:
    default_group:
      isNode: true
      storageConfigs:
        - mountPath: "/opt/nifi/nifi-current/logs"
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "ceph-block-storage"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/data"
          name: data
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "ceph-block-storage"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/flowfile_repository"
          name: flowfile-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "ceph-block-storage"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/nifi-current/conf"
          name: conf
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "ceph-block-storage"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/content_repository"
          name: content-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "ceph-block-storage"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/provenance_repository"
          name: provenance-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "ceph-block-storage"
            resources:
              requests:
                storage: 10Gi
      serviceAccountName: "default"
      resourcesRequirements:
        limits:
          cpu: "1"
          memory: 3Gi
        requests:
          cpu: "0.1"
          memory: 0.5Gi
  nodes:
    - id: 0
      nodeConfigGroup: "default_group"
    - id: 1
      nodeConfigGroup: "default_group"
    - id: 2
      nodeConfigGroup: "default_group"
  listenersConfig:
    internalListeners:
      - type: "https"
        name: "https"
        containerPort: 8443
      - type: "cluster"
        name: "cluster"
        containerPort: 6007
      - type: "s2s"
        name: "s2s"
        containerPort: 10000
    sslSecrets:
      tlsSecretName: "test-nifikop"
      create: true

What did you expect to see? A secure cluster starting and working

What did you see instead? Under which circumstances? The NiFi nodes start, but nothing is reachable on port 8443. When I run kubectl -n usecase port-forward svc/sslnifi-headless 8443, I get: An error occurred during a connection to localhost:8443. PR_END_OF_FILE_ERROR

What primarily confuses me and makes me think something is going wrong, is the following log from the operator:

2020-08-19T09:23:38.748733801Z {"level":"info","ts":1597829018.7486365,"logger":"controller_nificlustertask","msg":"nifi cluster communication error: could not connect to nifi nodes: sslnifi-headless.usecase.svc.cluster.local:8443: Non 200 response from nifi node: 403 Forbidden"}

Environment

erdrix commented 4 years ago

What primarily confuses me and makes me think something is going wrong, is the following log from the operator:

2020-08-19T09:23:38.748733801Z {"level":"info","ts":1597829018.7486365,"logger":"controller_nificlustertask","msg":"nificluster communication error: could not connect to nifi nodes: sslnifi-headless.usecase.svc.cluster.local:8443: Non 200 response from nifi node: 403 Forbidden"}

This error is expected, because you have not yet declared the operator user as being allowed to query the NiFi controller's API: https://orange-opensource.github.io/nifikop/docs/3_tasks/2_security/1_ssl#operator-access-policies

For the port-forwarding, it looks like it needs more configuration, but I still haven't found the right one :/

The first thing, which I'm pretty sure about: you need to add this to Spec.ReadOnlyConfig.NifiProperties.overrideConfigs:

spec:
  ...
  readOnlyConfig:
    # NifiProperties configuration that will be applied to the node.
    nifiProperties:
      ...
      # Additional nifi.properties configuration that will override the one produced based
      # on template and configurations.
      overrideConfigs: |
        ...
        nifi.web.https.network.interface.eth0=eth0
        nifi.web.https.network.interface.lo=lo
        ...

But on my side that doesn't seem to be enough ...

Arttii commented 4 years ago

Thanks for the tip. Setting up a proper ingress and following your points led to a cluster that starts, but something seems to be crashing inside NiFi. Is there no easier way to view the logs than exec-ing into the container?

erdrix commented 4 years ago

Not yet. There is an issue, #3, to add the ability to define custom sidecars in the NiFi node pods, which would let you define a sidecar that reads the logs with a simple tail. I will set it to high priority and try to do it ASAP (or if you want to do it, feel free :D)
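
As a rough illustration of the sidecar idea, here is what such a container entry could look like using only standard Kubernetes container fields. This is a sketch under assumptions, not the operator's API: how NiFiKop would expose sidecars is not specified in this thread, the container name and image are hypothetical, and it assumes the pod volume backing /opt/nifi/nifi-current/logs is named "logs" like the storageConfigs entry above.

```yaml
# Hypothetical sidecar container entry (plain Kubernetes container fields).
# Assumes a pod volume named "logs" that holds NiFi's log directory.
- name: app-log                      # hypothetical container name
  image: busybox:1.32                # hypothetical image; any image with tail works
  # Stream nifi-app.log from the start and keep following across log rotation.
  command: ["tail", "-n", "+1", "-F", "/var/log/nifi-app.log"]
  volumeMounts:
    - name: logs                     # assumed volume name, taken from storageConfigs
      mountPath: /var/log
      readOnly: true
```

With a container like this in the pod, the NiFi application log could be read with kubectl logs on that container instead of exec-ing into the NiFi container.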

Arttii commented 4 years ago

It's quite confusing: two nodes seem to come up, but then I get "Purposed state does not match the stored state. Unable to continue login process.", and I cannot configure the user for the operator because I cannot get access to the UI. The NifiCluster resource has the following state:

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  annotations:
  creationTimestamp: '2020-08-19T13:55:26Z'
  generation: 1
  labels:
    app: nifi
    chart: nifi-0.2.0
    heritage: Helm
    kapp.k14s.io/app: '1597845323866920206'
    kapp.k14s.io/association: v1.c1ff3c2944280d1f5a56bb66f373db0b
    release: nifi
  name: nifi
  namespace: usecase
  resourceVersion: '9126287'
  selfLink: /apis/nifi.orange.com/v1alpha1/namespaces/usecase/nificlusters/nifi
  uid: 6c392eaa-d25d-4b79-a9ad-7d68e3c0a7b3
spec:
  clusterImage: 'apache/nifi:1.11.4'
  clusterSecure: true
  initialAdminUser: nifi-admin
  listenersConfig:
    internalListeners:
      - containerPort: 8443
        name: https
        type: https
      - containerPort: 6007
        name: cluster
        type: cluster
      - containerPort: 10000
        name: s2s
        type: s2s
    sslSecrets:
      create: true
      tlsSecretName: test-nifikop
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  nodeConfigGroups:
    default_group:
      isNode: true
      resourcesRequirements:
        limits:
          cpu: '1'
          memory: 4Gi
        requests:
          cpu: '0.1'
          memory: 0.5Gi
      serviceAccountName: default
      storageConfigs:
        - mountPath: /opt/nifi/nifi-current/logs
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
            storageClassName: ceph-block-storage
        - mountPath: /opt/nifi/data
          name: data
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
            storageClassName: ceph-block-storage
        - mountPath: /opt/nifi/flowfile_repository
          name: flowfile-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
            storageClassName: ceph-block-storage
        - mountPath: /opt/nifi/nifi-current/conf
          name: conf
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
            storageClassName: ceph-block-storage
        - mountPath: /opt/nifi/content_repository
          name: content-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
            storageClassName: ceph-block-storage
        - mountPath: /opt/nifi/provenance_repository
          name: provenance-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
            storageClassName: ceph-block-storage
  nodes:
    - id: 0
      nodeConfigGroup: default_group
    - id: 1
      nodeConfigGroup: default_group
  oneNifiNodePerNode: false
  propagateLabels: true
  readOnlyConfig:
    bootstrapProperties:
      nifiJvmMemory: 512m
      overrideConfigs: |
        # java.arg.4=-Djava.net.preferIPv4Stack=true
    nifiProperties:
      overrideConfigs: >
        nifi.security.user.oidc.discovery.url=https://keycloak.domain.de/auth/realms/dapc/.well-known/openid-configuration

        nifi.security.user.oidc.client.id=nifi

        nifi.security.user.oidc.client.secret=token

        nifi.web.https.network.interface.eth0=eth0

        nifi.web.https.network.interface.lo=lo

        nifi.security.identity.mapping.pattern.dn=CN=([^,]*)(?:, (?:O|OU)=.*)?

        nifi.security.identity.mapping.value.dn=$1

        nifi.security.identity.mapping.transform.dn=NONE
      webProxyHosts:
        - 'nifi.domain.de:443'
  service:
    headlessEnabled: true
  siteToSiteSecure: true
  zkAddresse: 'zookeeper.base.svc.cluster.local:2181'
  zkPath: /nifi-usecase
status:
  nodesState:
    '0':
      configurationState: ConfigOutOfSync
      gracefulActionState:
        TaskStarted: 'Wed, 19 Aug 2020 13:55:26 GMT'
        actionState: GracefulUpscaleRunning
        actionStep: CONNECTING
        errorMessage: ''
      initClusterNode: true
    '1':
      configurationState: ConfigInSync
      gracefulActionState:
        actionState: GracefulUpscaleRequired
        errorMessage: ''
      initClusterNode: true
  rollingUpgradeStatus:
    errorCount: 0
    lastSuccess: ''
  state: ClusterRollingUpgrading

OK, login seems to work if I point the ingress to a single node.

erdrix commented 4 years ago

You encounter this error because you are using an ingress on top of the service. When you do that, load balancing across the pods is handled by the ingress instead of the service.

During login, the whole process must take place on the same node (there is a full explanation here: https://medium.com/swlh/operationalising-nifi-on-kubernetes-1a8e0ae16a6c). To achieve this, you need to enable session affinity on your ingress!

If you are using Traefik, you just have to add this to your NifiCluster:

spec:
  ...
  service:
    ...
    annotations:
      traefik.ingress.kubernetes.io/affinity: "true"
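
For readers on a different ingress controller, the same session affinity can be sketched for ingress-nginx with cookie-based stickiness. Everything here is an assumption rather than something confirmed in this thread: the Ingress name, the TLS secret name, the ingress class, and the backend service name nifi-headless (guessed from the sslnifi-headless naming seen earlier) are illustrative. The host and port come from the NifiCluster above.

```yaml
# Sketch only: sticky sessions with ingress-nginx (assumed controller).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nifi                         # hypothetical name
  namespace: usecase
  annotations:
    # Pin each browser session to one NiFi pod via a cookie.
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "nifi-route"
    # The NiFi nodes serve HTTPS on 8443, so talk TLS to the backend.
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx            # assumed ingress class
  tls:
    - hosts:
        - nifi.domain.de
      secretName: nifi-ingress-tls   # hypothetical TLS secret
  rules:
    - host: nifi.domain.de
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nifi-headless  # assumed service name
                port:
                  number: 8443
```

The design point is the same as with the Traefik annotation: every request of a login flow has to reach the node that started it, so the ingress has to keep each client on one pod.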

Arttii commented 4 years ago

Thanks very much for your help, I got it working as desired. I will take a look at the sidecar PR.