konpyutaika / nifikop

The NiFiKop NiFi Kubernetes operator makes it easy to run Apache NiFi on Kubernetes. Apache NiFi is a free, open-source solution that supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
https://konpyutaika.github.io/nifikop/
Apache License 2.0

Simplenifi cluster is running but inaccessible #40

Closed juldrixx closed 2 years ago

juldrixx commented 2 years ago

From nifikop created by tmarkunin: Orange-OpenSource/nifikop#69

Bug Report

What did you do? I've installed simple nifi cluster following https://orange-opensource.github.io/nifikop/docs/2_setup/1_getting_started

What did you expect to see? Running nifi cluster with 2 nodes accessible through web UI

NAME                           READY   STATUS        RESTARTS   AGE
pod/nifikop-586867994d-lkmgc   1/1     Running       0          6h56m
pod/nifikop-586867994d-pvnmn   0/1     Terminating   0          25h
pod/simplenifi-1-nodew5925     1/1     Running       0          6h52m
pod/simplenifi-2-nodegt8rh     1/1     Running       0          22h
pod/zookeeper-0                1/1     Running       1          6h52m
pod/zookeeper-1                1/1     Running       1          6h52m
pod/zookeeper-2                1/1     Running       1          6h52m

What did you see instead? Under which circumstances? The UI is not accessible through the svc service/simplenifi-all-node. Moreover, I failed to curl http://localhost:8080 from inside a container:

$ curl http://localhost:8080/nifi
curl: (7) Failed to connect to localhost port 8080: Connection refused

Environment

Kubernetes version: 1.18

juldrixx commented 2 years ago

Can confirm the described bug with nifikop:v0.6.0-release.

Additionally, this is the log output for nifi:

Could not resolve host: simplenifi-headless.nifi-lab.svc.cluster.local
Expire in 4 ms for 1 (transfer 0x557c03967f50)
Closing connection 0
curl: (6) Could not resolve host: simplenifi-headless.nifi-lab.svc.cluster.local

Port-forward to 8080 results in connection refused as well. simplenifi-headless service exists.
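
For anyone hitting the same wall, a quick way to check whether the headless service actually has pod endpoints behind it (namespace nifi-lab taken from the curl output above; adjust to yours):

kubectl get svc simplenifi-headless -n nifi-lab
kubectl get endpoints simplenifi-headless -n nifi-lab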

juldrixx commented 2 years ago

Same issue here, unable to access the NiFi 8080 port using port-forward in release-0.6.3.

nifi namespace pods:

❯ kubectl get po -n nifi
NAME                             READY   STATUS    RESTARTS   AGE
datos-0-nodeqbnwp                1/1     Running   0          22m
datos-2-nodess74q                1/1     Running   0          22m
green-nifikop-7877b7dc94-5btjr   1/1     Running   0          14h
green-zookeeper-0                1/1     Running   0          14h
green-zookeeper-1                1/1     Running   0          14h
green-zookeeper-2                1/1     Running   0          14h

Services:

❯ kubectl get svc -n nifi
NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                   AGE
datos-clusterip            ClusterIP   10.100.213.62   <none>        8080/TCP                                                  22m
datos-headless             ClusterIP   None            <none>        8080/TCP,6007/TCP,10000/TCP,9020/TCP,10001/TCP,9090/TCP   22m
green-zookeeper            ClusterIP   10.100.17.181   <none>        2181/TCP,2888/TCP,3888/TCP                                4d20h
green-zookeeper-headless   ClusterIP   None            <none>        2181/TCP,2888/TCP,3888/TCP                                4d20h
green-zookeeper-metrics    ClusterIP   10.100.12.218   <none>        9141/TCP                                                  4d20h
nifikop-metrics            ClusterIP   10.100.87.7     <none>        9710/TCP                                                  4d21h

Service port-forward

❯ kubectl port-forward --namespace nifi service/datos-headless 8082:8080
Forwarding from 127.0.0.1:8082 -> 8080
Forwarding from [::1]:8082 -> 8080

❯ curl -IL http://localhost:8080 --insecure
curl: (7) Failed to connect to localhost port 8080: Connection refused

Pod port-forward

❯ kubectl get po -n nifi
NAME                             READY   STATUS    RESTARTS   AGE
datos-0-nodeqbnwp                1/1     Running   0          25m
datos-2-nodess74q                1/1     Running   0          25m
green-nifikop-7877b7dc94-5btjr   1/1     Running   0          14h
green-zookeeper-0                1/1     Running   0          14h
green-zookeeper-1                1/1     Running   0          14h
green-zookeeper-2                1/1     Running   0          14h

❯ kubectl port-forward --namespace nifi pod/datos-0-nodeqbnwp 8082:8080
Forwarding from 127.0.0.1:8082 -> 8080
Forwarding from [::1]:8082 -> 8080
Handling connection for 8082
E0823 15:27:21.527808   89707 portforward.go:400] an error occurred forwarding 8082 -> 8080: error forwarding port 8080 to pod 72a8e9c4fa537c1fff4a15d042af53c24473a7705fdfa0cdb79ac9cdcfb4be6d, uid : exit status 1: 2021/08/23 15:27:21 socat[30859] E connect(5, AF=2 127.0.0.1:8080, 16): Connection refused

juldrixx commented 2 years ago

Hi, there was an issue for clusters created with services in non-headless mode (when the spec.service.headlessEnabled field is set to false). With the latest release, v0.5.2-release, this should be fixed.
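
For reference, the field in question sits under spec.service; a minimal sketch of the relevant part of a NifiCluster (all other fields omitted, names illustrative):

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: simplenifi
spec:
  service:
    # false was the configuration affected by the bug mentioned above
    headlessEnabled: true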

So either you set the service in headless mode or you migrate to the latest version. Once done, just keep in mind that since version v0.5.0-release there is no longer a default external service (NodePort or LoadBalancer) created by the operator; you must declare one explicitly, as we did in the example: https://github.com/Orange-OpenSource/nifikop/blob/master/config/samples/simplenificluster.yaml#L52. So if you want to access your cluster, all you have to do is declare your service using the spec.externalServices field.
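
As a minimal sketch of such a declaration (mirroring the linked sample; the service name is illustrative):

spec:
  externalServices:
    - name: "clusterip"
      spec:
        type: ClusterIP
        portConfigs:
          - port: 8080
            # must match the name of a listener in listenersConfig.internalListeners
            internalListenerName: "http"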

juldrixx commented 2 years ago

@erdrix I think the issue might be something different; I've encountered this as well and it seems NiFi is only listening on the external pod IP. When using e.g. kubectl port-forward, the port is not forwarded properly, as the forward targets the container's localhost.
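
If that is the case, a possible workaround (which the configs later in this thread use) is to tell NiFi which interfaces to bind via readOnlyConfig.nifiProperties.overrideConfigs; a sketch, assuming the pod's interfaces are eth0 and lo:

readOnlyConfig:
  nifiProperties:
    overrideConfigs: |
      # bind the web UI to the pod interface and to loopback so port-forward can reach it
      nifi.web.http.network.interface.default=eth0
      nifi.web.http.network.interface.lo=lo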

juldrixx commented 2 years ago

I'm experiencing this as well; any ideas or suggestions would be greatly appreciated, @erdrix.

Config:

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: simplenifi
spec:
  service:
    headlessEnabled: true
  zkAddress: "zookeeper.default.svc.cluster.local:2181"
  zkPath: "/simplenifi"
  clusterImage: "apache/nifi:1.12.1"
  oneNifiNodePerNode: false
  nodeConfigGroups:
    default_group:
      isNode: true
      storageConfigs:
        - mountPath: "/opt/nifi/nifi-current/logs"
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "examplestorageclass"
            resources:
              requests:
                storage: 10Gi
      serviceAccountName: "default"
      resourcesRequirements:
        limits:
          cpu: "0.5"
          memory: 2Gi
        requests:
          cpu: "0.5"
          memory: 2Gi
  nodes:
    - id: 1
      nodeConfigGroup: "default_group"
    - id: 2
      nodeConfigGroup: "default_group"
  propagateLabels: true
  readOnlyConfig:
    nifiProperties:
      overrideConfigs: |
        nifi.web.http.network.interface.default=eth0
        nifi.web.http.network.interface.lo=lo
      webProxyHosts:
        - clusterip:8080
        - localhost:8080
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  listenersConfig:
    internalListeners:
      - type: "http"
        name: "http"
        containerPort: 8080
      - type: "cluster"
        name: "cluster"
        containerPort: 6007
      - type: "s2s"
        name: "s2s"
        containerPort: 10000
      - type: "prometheus"
        name: "prometheus"
        containerPort: 9090
  externalServices:
    - name: "clusterip"
      spec:
        type: ClusterIP
        portConfigs:
          - port: 8080
            internalListenerName: "http"
      serviceAnnotations:
        toto: tata
    - name: "nodepart"
      spec:
        type: NodePort
        portConfigs:
          - port: 8080
            internalListenerName: "http"
      serviceAnnotations:
        toto: tata

I'm able to port-forward to access the UI on localhost but the error remains:

* Expire in 200 ms for 4 (transfer 0x55fda1c09f50)
* connect to 10.96.58.102 port 8080 failed: Connection refused
* Failed to connect to simplenifi-all-node.nifi.svc.cluster.local port 8080: Connection refused
* Closing connection 0
curl: (7) Failed to connect to simplenifi-all-node.nifi.svc.cluster.local port 8080: Connection refused

Thanks for your suggestion @esteban1983cl. I tried your config and similarly could port-forward successfully, but still got this message in the pod:

* Could not resolve host: simplenifi-headless.nifi.svc.cluster.local
* Expire in 1 ms for 1 (transfer 0x5595b58a6f50)
* Closing connection 0
curl: (6) Could not resolve host: simplenifi-headless.nifi.svc.cluster.local

juldrixx commented 2 years ago

For future googlers:

You must configure the following; I'm attaching a sample cluster and registry client configuration for reference:

apiVersion: nifi.orange.com/v1alpha1
kind: NifiRegistryClient
metadata:
  name: test-registry
  namespace: nifi
spec:
  # Contains the reference to the NifiCluster with the one the registry client is linked.
  clusterRef:
    name: test-cluster
    namespace: nifi
  # The Description of the Registry client.
  description: "Simple nifi registry demo"
  # The URI of the NiFi registry that should be used for pulling the flow.
  uri: "http://test-registry:18080"
---
apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: test-cluster
  namespace: nifi
spec:
  service:
    headlessEnabled: true
  zkAddress: "green-zookeeper:2181"
  zkPath: "/test_cluster"
  clusterImage: "apache/nifi:1.12.1"
  oneNifiNodePerNode: false
  managedAdminUsers:
    - identity: "esteban.avendano@cencosud.cl"
      name: "eavendanoa"
  nodeConfigGroups:
    default_group:
      isNode: true
      provenanceStorage: "10 GB"
      runAsUser: 1000
      imagePullPolicy: IfNotPresent
      storageConfigs:
        # Name of the storage config, used to name PV to reuse into sidecars for example.
        - name: provenance-repository
          # Path where the volume will be mount into the main nifi container inside the pod.
          mountPath: "/opt/nifi/provenance_repository"
          # Kubernetes PVC spec
          # https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-a-persistentvolumeclaim
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "gp2"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/nifi-current/logs"
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "gp2"
            resources:
              requests:
                storage: 10Gi
      serviceAccountName: "nifi"
      resourcesRequirements:
        limits:
          cpu: "1"
          memory: 4Gi
        requests:
          cpu: "1"
          memory: 3Gi
      nodeSelector:
        node.kubernetes.io/role: batch
  nodes:
    - id: 0
      nodeConfigGroup: "default_group"
    - id: 1
      nodeConfigGroup: "default_group"
  propagateLabels: true
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  readOnlyConfig:
    maximumTimerDrivenThreadCount: 30
    logbackConfig:
      replaceConfigMap:
        data: logback.xml
        name: test-cluster-logback-config
        namespace: nifi
    bootstrapProperties:
      nifiJvmMemory: "2g"
    nifiProperties:
      overrideConfigs: |
        nifi.web.http.network.interface.default=eth0
        nifi.web.http.network.interface.lo=lo
      webProxyHosts:
        - test-clusterip:8080
        - localhost:8080
  listenersConfig:
    internalListeners:
      - type: "http"
        name: "test-http"
        containerPort: 8080
      - type: "cluster"
        name: "test-cluster"
        containerPort: 6007
      - type: "s2s"
        name: "test-s2s"
        containerPort: 10000
      - type: "prometheus"
        name: "test-prometh"
        containerPort: 9090
  externalServices:
    - name: "test-clusterip"
      spec:
        type: ClusterIP
        portConfigs:
          - port: 8080
            internalListenerName: "test-http"
      serviceAnnotations:
        toto: tata
  sidecarConfigs:
    - name: app-log
      image: "busybox:1.32.0"
      args: [ tail, -n+1, -F, /var/log/nifi-app.log ]
      resources: &log_resources
        requests:
          cpu: 50m
          memory: 50Mi
        limits:
          cpu: 50m
          memory: 50Mi
      volumeMounts:
        - name: logs
          mountPath: /var/log
    - name: bootstrap-log
      image: "busybox:1.32.0"
      args: [ tail, -n+1, -F, /var/log/nifi-bootstrap.log ]
      resources: *log_resources
      volumeMounts:
        - name: logs
          mountPath: /var/log
    - name: user-log
      image: "busybox:1.32.0"
      args: [ tail, -n+1, -F, /var/log/nifi-user.log ]
      resources: *log_resources
      volumeMounts:
        - name: logs
          mountPath: /var/log

Next steps: add security, LDAP, and ingress. I hope it helps.

juldrixx commented 2 years ago

I have the same problem:

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: sslnifi
  namespace: apache-nifi
spec:
  service:
    headlessEnabled: true
  zkAddress: "nifi-zookeeper:2181"
  zkPath: "/ssllnifi"
  clusterImage: "apache/nifi:1.12.1"
  initContainerImage: "busybox"
  oneNifiNodePerNode: false
  propagateLabels: true
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  readOnlyConfig:
    nifiProperties:
      webProxyHosts:
        - nifi-infra.test:8443
      overrideConfigs: |
        nifi.security.user.oidc.discovery.url=*****
        nifi.security.user.oidc.client.id=****
        nifi.security.user.oidc.client.secret=*****
        nifi.security.identity.mapping.pattern.dn=CN=([^,]*)(?:, (?:O|OU)=.*)?
        nifi.security.identity.mapping.value.dn=$1
        nifi.security.identity.mapping.transform.dn=NONE
  nodeConfigGroups:
    default_group:
      isNode: true
      storageConfigs:
        - mountPath: "/opt/nifi/nifi-current/logs"
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "nifi-sc"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/data"
          name: data
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "nifi-sc"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/flowfile_repository"
          name: flowfile-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "nifi-sc"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/nifi-current/conf"
          name: conf
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "nifi-sc"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/content_repository"
          name: content-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "nifi-sc"
            resources:
              requests:
                storage: 10Gi
        - mountPath: "/opt/nifi/provenance_repository"
          name: provenance-repository
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "nifi-sc"
            resources:
              requests:
                storage: 10Gi
      serviceAccountName: "default"
      nodeSelector: 
        node-purpose: infra
      resourcesRequirements:
        limits:
          cpu: "2"
          memory: 3Gi
        requests:
          cpu: "1"
          memory: 1Gi
  nodes:
    - id: 0
      nodeConfigGroup: "default_group"
    - id: 1
      nodeConfigGroup: "default_group"
    - id: 2
      nodeConfigGroup: "default_group"
  listenersConfig:
    internalListeners:
      - type: "https"
        name: "https"
        containerPort: 8443
      - type: "cluster"
        name: "cluster"
        containerPort: 6007
      - type: "s2s"
        name: "s2s"
        containerPort: 10000
    sslSecrets:
      tlsSecretName: "tls-nifi"
      create: false


NAMESPACE     NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                       AGE
apache-nifi   sslnifi-headless   ClusterIP   None         <none>        8443/TCP,6007/TCP,10000/TCP   49m

juldrixx commented 2 years ago

Thank you @esteban1983cl for your answer. My case is quite similar to yours, and yet I still have the same problem.

Here is my conf:

apiVersion: nifi.orange.com/v1alpha1
kind: NifiCluster
metadata:
  name: simplenifi
spec:
  service:
    headlessEnabled: true
  zkAddress: zookeeper:2181
  zkPath: "/simplenifi"
  clusterImage: "apache/nifi:1.12.1"
  oneNifiNodePerNode: false
  nodeConfigGroups:
    default_group:
      isNode: true
      storageConfigs:
        - mountPath: "/opt/nifi/nifi-current/logs"
          name: logs
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            storageClassName: "hostpath"
            resources:
              requests:
                storage: 10Gi
      serviceAccountName: "default"
      resourcesRequirements:
        limits:
          cpu: "0.5"
          memory: 2Gi
        requests:
          cpu: "0.5"
          memory: 2Gi
  nodes:
    - id: 1
      nodeConfigGroup: "default_group"
    - id: 2
      nodeConfigGroup: "default_group"
  propagateLabels: true
  nifiClusterTaskSpec:
    retryDurationMinutes: 10
  readOnlyConfig:
    maximumTimerDrivenThreadCount: 30
    logbackConfig:
      replaceConfigMap:
        data: logback.xml
        name: test-cluster-logback-config
        namespace: nifi
    bootstrapProperties:
      nifiJvmMemory: "2g"
    nifiProperties:
      overrideConfigs: |
        nifi.web.http.network.interface.default=eth0
        nifi.web.http.network.interface.lo=lo
      webProxyHosts:
        - clusterip:8080
        - localhost:8080
  listenersConfig:
    internalListeners:
      - type: "http"
        name: "http"
        containerPort: 8080
      - type: "cluster"
        name: "cluster"
        containerPort: 6007
      - type: "s2s"
        name: "s2s"
        containerPort: 10000
      - type: "prometheus"
        name: "prometheus"
        containerPort: 9090
  externalServices:
    - name: "clusterip"
      spec:
        type: ClusterIP
        portConfigs:
          - port: 8080
            internalListenerName: "http"
      serviceAnnotations:
        toto: tata
    - name: "loadbalancer"
      spec:
        type: LoadBalancer
        portConfigs:
          - port: 8080
            internalListenerName: "http"
      serviceAnnotations:
        toto: tata
    - name: "nodepart"
      spec:
        type: NodePort
        portConfigs:
          - port: 8080
            internalListenerName: "http"
      serviceAnnotations:
        toto: tata


and when I start the cluster:

kubectl -n nifi get pods
NAME                       READY   STATUS             RESTARTS   AGE
nifikop-79bb46b754-9trpg   1/1     Running            4          3h12m
simplenifi-1-nodetxzjg     1/1     Running            0          25m
simplenifi-2-nodetmvj5     1/1     Running            0          23m
zookeeper-0                1/1     Running            31         3h28m
zookeeper-1                1/1     Running            35         3h28m
zookeeper-2                1/1     Running            33         3h28m

kubectl describe pods -n nifi simplenifi-1-nodetxzjg

*   Trying 10.1.0.57...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x5623b6c9bf50)
* connect to 10.1.0.57 port 8080 failed: Connection refused
* Failed to connect to simplenifi-1-node.simplenifi-headless.nifi.svc.cluster.local port 8080: Connection refused
* Closing connection 0
curl: (7) Failed to connect to simplenifi-1-node.simplenifi-headless.nifi.svc.cluster.local port 8080: Connection refused

I can't find the source of the problem, thank you for your help :)

pashtet04 commented 1 year ago

I am still experiencing the same issue.

Pod logs:

+ kubectl logs nifi-cluster-1-noded76nr
Defaulted container "nifi" out of: nifi, zookeeper (init)
Waiting for host to be reachable
failed to reach nifi-cluster-1-node.nifi-cluster-headless.nifi.svc.cluster.local:8080
Found: , expecting: 172.20.44.117

Nifikop Operator logs:

{"level":"info","time":"2023-06-07T09:23:32.461Z","logger":"controllers.NifiClusterTask","caller":"controllers/controller_common.go:34","msg":"nifi cluster communication error: could not connect to nifi nodes: nifi-cluster-headless.nifi.svc.cluster.local:8080: Get \"http://nifi-cluster-headless.nifi.svc.cluster.local:8080/nifi-api/controller/cluster\": dial tcp: lookup nifi-cluster-headless.nifi.svc.cluster.local on 172.24.10.2:53: no such host"}

Debug container with nslookup:

~ $ nslookup nifi-cluster-ip
Server:         172.24.10.2
Address:        172.24.10.2:53

Name:   nifi-cluster-ip.nifi.svc.cluster.local
Address: 172.24.11.214

** server can't find nifi-cluster-ip.svc.cluster.local: NXDOMAIN
** server can't find nifi-cluster-ip.svc.cluster.local: NXDOMAIN
** server can't find nifi-cluster-ip.cluster.local: NXDOMAIN
** server can't find nifi-cluster-ip.cluster.local: NXDOMAIN
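
For comparison, resolving the exact name the operator fails on (the headless service FQDN and namespace taken from the operator log above) should show whether this is a DNS problem or a missing service:

nslookup nifi-cluster-headless.nifi.svc.cluster.local
kubectl get svc nifi-cluster-headless -n nifi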

After some restarts, the Pod gets stuck in the Init state and I've seen this error in the events: Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[default-token-4s94s data node-config node-tmp]: timed out waiting for the condition. But I suppose this is related to the frequent restarts and the ReadWriteOnce PVC lock.
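
A couple of quick follow-up checks on the volume side (namespace nifi assumed from the commands above) would be to confirm the PVC status and scan recent events for attach/mount errors:

kubectl get pvc -n nifi
kubectl get events -n nifi --sort-by=.lastTimestamp | grep -iE 'volume|mount'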