stackabletech / nifi-operator

A Kubernetes operator for Apache NiFi

Document how to expose ports on NiFi nodes with Kubernetes Services #317

Closed razvan closed 11 months ago

razvan commented 2 years ago

Description

NiFi workflows can be triggered from custom REST endpoints. This article provides an example of how this works.

For clients to be able to access these endpoints from outside of the Kubernetes cluster, they need to be exposed with Service objects. A comment further down in this ticket contains an example of how this works.

We should initially add a short paragraph on this to our documentation to explain the fundamentals and give one or two brief examples.

At a later stage we can then investigate how to provide convenience functionality in our operator around this. I have opened https://github.com/stackabletech/nifi-operator/issues/414 as a follow-up ticket for the investigation/implementation of this.

pipern commented 1 year ago

Is this concern (how do external systems send data to NiFi, such as a NiFi ListenTCP processor) wider than just 'REST'? I don't think it's 'REST triggered workflows' specific?

Is there an answer today on how the NiFi pod is configured to expose TCP/UDP ports for the 'listen' processors which someone might add in to their NiFi data flow design?

soenkeliebau commented 1 year ago

Hi @pipern - this is indeed not REST specific, but covers pretty much every processor in NiFi that might want to expose ports.

Currently this is not implemented. We have two ideas on this that we are looking to code when the time is right:

  1. Allow the user to specify ports that should be exposed in the NiFi cluster definition. So basically a hardcoded list provided by the user which we'll then translate into NodePorts (or similar things)
  2. An active part in the NiFi operator which uses the NiFi REST API to look for processors that should be exposed and dynamically creates NodePorts/ClusterIPs/... for these processors. This could take various forms: allow specifying "all processors of type Rest should be exposed", define a custom property that the user needs to add and which the operator then looks for, or any number of other methods.

I do like the second method, because it is much more integrated and works straight from the NiFi UI, but it is also the much more complex one to build.

Ideally we'd like to integrate whatever approach we take with our Listener operator, so that ListenerClasses can be used to define how ports should be exposed.

pipern commented 1 year ago

Thanks @soenkeliebau. Is there an intermediate solution which works today - does the NiFiCluster CRD have a place to declare additional pod.spec.containers[].ports? Otherwise, is it correct that right now no listeners are possible with Stackable NiFi? (I've glanced through the source code for nifi-operator/crd but don't spot a place to mention additional ContainerPorts)

soenkeliebau commented 1 year ago

Hi @pipern you can simply use normal Kubernetes functionality in the interim. Anything that we could add to the operator would simply be a convenience layer around k8s anyway.

To give a concrete example, I've stood up a three node NiFi on a GKE cluster and created the following flow:

image

with the ListenTCP processor listening on port 8123. Port 8123 lives only inside of the Kubernetes overlay network, so if I spin up a pod inside of Kubernetes I can reach it via the internal IP:

bash-5.1# telnet 10.16.1.10 8123
Connected to 10.16.1.10
uiae
uiae
uiae
uiae
uiae
^C
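A throwaway pod for this kind of in-cluster test could look roughly like the sketch below. The pod name `tcp-test` and the `busybox:1.36` image are assumptions for illustration (any image that ships telnet or nc should do); the session above was run from a different, bash-based image.

```yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: tcp-test              # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: shell
      image: busybox:1.36     # assumption: any image with telnet or nc works
      # Keep the container alive so a shell can be opened in it
      command: ["sleep", "3600"]
```

After applying it, `kubectl exec -it tcp-test -- sh` opens a shell in the pod, from which the NiFi pod IP and port 8123 can be reached as shown above.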

If I want to have this port externally available I can use Service objects, for example a LoadBalancer:

---
apiVersion: v1
kind: Service
metadata:
  name: nifi-lb
spec:
  type: LoadBalancer
  externalTrafficPolicy: Cluster
  selector:
    app.kubernetes.io/instance: simple-nifi
  ports:
    - name: tcp-port
      protocol: TCP
      port: 8080
      targetPort: 8123

This exposes the internal port 8123 externally as 8080. So now on my laptop here in the office I can do the same thing, just with the external address of the load balancer:

(⎈ |gke_engineering-329019_us-central1-c_nifi:default)➜  ~ telnet 35.225.238.62 8080
Trying 35.225.238.62...
Connected to 35.225.238.62.
Escape character is '^]'.
123
321
^]

In NiFi we get seven messages, the five internal and the two external ones:

image

Any functionality that we'd add to the CRD to specify ports, or to the operator to poll for attributes etc. would under the hood then simply go and create objects just like the loadbalancer shown above. So it is all possible today, we'll just make it nicer in the future :)
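On clusters without a cloud load balancer, a NodePort Service is one possible alternative. The following is only a sketch, assuming the same `app.kubernetes.io/instance: simple-nifi` label and ListenTCP port 8123 as in the example above; the Service name `nifi-nodeport` and the node port 30123 are made up for illustration.

```yaml
---
apiVersion: v1
kind: Service
metadata:
  name: nifi-nodeport         # hypothetical name
spec:
  type: NodePort
  selector:
    app.kubernetes.io/instance: simple-nifi   # same label as the LoadBalancer example
  ports:
    - name: tcp-port
      protocol: TCP
      port: 8123              # port of the Service inside the cluster
      targetPort: 8123        # port the ListenTCP processor listens on
      nodePort: 30123         # hypothetical; must lie in the cluster's NodePort range (30000-32767 by default)
```

With this in place the listener should be reachable on port 30123 of any cluster node's address, provided the nodes are reachable from the client and firewalls allow the port.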

pipern commented 1 year ago

Much appreciated @soenkeliebau . Might we rename this issue?

soenkeliebau commented 1 year ago

Sure thing. I would probably convert this to a docs ticket and open a development follow-up ticket. Does that sound good?

soenkeliebau commented 1 year ago

@pipern I have converted this issue to a docs issue and opened https://github.com/stackabletech/nifi-operator/issues/414 as a follow up implementation issue.