cetic / helm-nifi

Helm Chart for Apache NiFi
Apache License 2.0
215 stars 225 forks

[cetic/nifi] Node address of the nodes in the cluster #252

Closed Premfer closed 2 years ago

Premfer commented 2 years ago

Hi,

When we create a NiFi cluster with nodes, the node address of each node is in the headless-service format, e.g. nifi-0.nifi-headless.cuttle-nifi.svc.cluster.local. I am trying to register a node from a different system; this is for the hybrid model we are trying to achieve between on-prem and cloud. While doing a POC for it, I found that the external node tries to send heartbeats to the primary node, but this address cannot be resolved from outside the cluster. Is there a way we can load-balance it?

Thanks, Deva

wknickless commented 2 years ago

@Premfer all your NiFi nodes must keep their Kubernetes cluster internal hostnames and DNS names. To provide external access for users and site-to-site connections you'll use a Kubernetes Ingress. See https://github.com/wknickless/helm-nifi/blob/feature/cert-manager/tests/07-oidc-cluster-values.yaml for an example of how that's configured.

(This example comes from Pull Request #218 which hasn't yet been merged.)
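For context, the ingress side of such a configuration typically looks something like the fragment below. This is a hypothetical sketch following the standard Helm ingress values layout, not the exact contents of the linked example file; the hostname and secret name are placeholders.

```yaml
# Hypothetical values.yaml fragment (placeholder names, not the linked example)
ingress:
  enabled: true
  hosts:
    - nifi.example.com          # externally resolvable name users connect to
  tls:
    - hosts:
        - nifi.example.com
      secretName: nifi-tls      # placeholder TLS secret

# The pods themselves keep their internal headless-service DNS names,
# e.g. nifi-0.nifi-headless.<namespace>.svc.cluster.local, for
# node-to-node cluster traffic.
```

The key point is that the Ingress only fronts user and site-to-site traffic; it does not change the internal node addresses the cluster members use to reach each other.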

Premfer commented 2 years ago

Hi @wknickless Thanks for the reply.

Can you please elaborate on how we change the ingress for a single node? I am accessing the UI through the Istio ingress gateway; can you please let me know if I can use that here.

Also, the node DNS name is sent by ZooKeeper, I guess, and that is what the on-prem node uses when trying to send heartbeats to the primary node here.

Thanks, Deva

wknickless commented 2 years ago

@Premfer NiFi clusters seem to work best when all the NiFi nodes are provisioned identically, with open connectivity between them. We recently had a situation where some of our NiFi pods were running on Kubernetes nodes with older CPUs, and the overall NiFi performance was compromised because some of the NiFi nodes weren't processing data at the same rate. When you configure a NiFi processor you select how many threads of that processor will run on every NiFi node. For older CPUs you might want more threads, but NiFi doesn't allow you to tune the number of NiFi processor threads on a per-NiFi node basis.

That problem is even more pronounced when combining NiFi workflows across on-premise and cloud service provider servers. Were I trying to establish a hybrid NiFi workflow with both on-premise and cloud service provider components, I would set up two NiFi clusters--one on premise and one in the cloud service provider. Then I would follow the recommendations at (e.g.) https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.3.1/nifi-system-properties/content/site-to-site-and-reverse-proxy-examples.html to set up the reverse proxies on both sides to allow site-to-site communication between my on-site and cloud-service-provider NiFi clusters.
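The reverse-proxy side of that Cloudera guide boils down to NiFi's site-to-site routing properties. The fragment below is a hedged sketch of that pattern for the cloud-side cluster, assuming a proxy at the placeholder address nifi.example.com; the route name, hostname, and port are illustrative, and the `when` expression in particular should be taken from the linked examples for your proxy setup.

```
# Hypothetical nifi.properties fragment (placeholder values, adapt from
# the Cloudera site-to-site reverse proxy examples)

# Accept site-to-site connections over HTTPS so they can traverse a proxy
nifi.remote.input.http.enabled=true
nifi.remote.input.secure=true

# Advertise the externally resolvable proxy address to remote clients
# instead of the pod's internal cluster DNS name (route name "external"
# and the expression below are placeholders)
nifi.remote.route.http.external.when=${X-ProxyHost:contains('nifi.example.com')}
nifi.remote.route.http.external.hostname=nifi.example.com
nifi.remote.route.http.external.port=443
nifi.remote.route.http.external.secure=true
```

With a matching setup on the on-prem side, each cluster keeps its internal node names for heartbeats and cluster coordination, while remote site-to-site clients only ever see the proxy address.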