Open forry2 opened 2 years ago
The first NiFi node correctly creates the cluster, but the second node keeps logging messages like these:
2022-07-01 19:31:11,102 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:16,436 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:21,773 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:27,090 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:32,403 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:37,720 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:43,035 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:48,349 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:53,663 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:31:58,980 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:32:04,291 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
2022-07-01 19:32:09,607 WARN [Clustering Tasks Thread-1] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message
202
Hi @forry2, since NiFi 1.14.0 security is turned on by default (see https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.14.0), which means NiFi cluster nodes expect to mutually authenticate using TLS. The chart supports setting up the certificate authority and client certificates using either (A) the built-in NiFi Toolkit or (B) cert-manager (see https://cert-manager.io). Your values.yaml file has neither of those options enabled.
@wknickless thank you so much for your reply. I turned ca.enabled to true, but no better luck, still plenty of "Failed marshalling 'HEARTBEAT' protocol message" messages there on the second node. What am I missing?
ca:
  ## If true, enable the nifi-toolkit certificate authority
  enabled: true
  persistence:
    enabled: true
  server: ""
  service:
    port: 9090
  token: sixteenCharacters
  admin:
    cn: admin
  serviceAccount:
    create: false
  openshift:
    scc:
      enabled: false
@forry2 unfortunately I don't know how that part of the chart works.
Does anybody know how to make the ca part of the chart work? @Subv and @alexnuttinck, I saw you worked on this part of the code.
Looks like either the 2nd instance is not presenting itself to the coordinator with the correct certificate name, or the 2nd instance's public key is not present in the truststore of the coordinator. How does the ca section of the chart work?
Hi, does anybody have a clue? :( The chart is not working as it is :(
It turned out that the NiFi pods were actually generating certificates but not using them. We have patched the bash script that manages this part, but I'd like to get in touch with someone here who worked on it to understand whether we used the chart incorrectly or whether this is actually a bug.
@forry2 thanks for debugging the problem!
It looks like @alexnuttinck and @makeacode added CA support in https://github.com/cetic/helm-nifi/pull/76 back in September 2020. The chart (generally) and the Helm templating are so complicated that I wrote a bunch of tests (see https://github.com/cetic/helm-nifi/tree/master/tests and https://github.com/cetic/helm-nifi/tree/master/.github/workflows) to ensure that additions and changes don't break anything as they are made. Unfortunately the CA support pre-dates that strategy, so we don't have any test coverage for it.
Would you be willing to share the changes you made to get it to work?
Also, would you be up to adapting one or more of the existing tests to cover your use case? For example:
...confirms that a 3-way NiFi cluster actually comes up with mutual connections and authentication.
It seems the cert-request initContainer was removed from the StatefulSet in #169, so the nodes all generate their own self-signed certificates and cannot form a cluster when using ca.enabled=true instead of CertManager.
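To illustrate why self-signed node certificates would keep a cluster from forming, here is a minimal sketch that uses plain openssl outside of the chart. It is not what the chart or the NiFi Toolkit actually runs; the file names and subject names (demo-ca, nifi-0, nifi-1) are made up for the demo. It creates a throwaway CA, issues one node certificate from it, lets a second node self-sign, and then checks both against the CA the way a peer's truststore would.

```shell
#!/bin/sh
# Demo only: all names and files below are hypothetical, not taken from the chart.
set -eu
work=$(mktemp -d)
cd "$work"

# Throwaway CA, standing in for the shared certificate authority.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=demo-ca" -keyout ca.key -out ca.crt 2>/dev/null

# Node 0: key plus CSR, signed by the CA (the path a cert-request step would take).
openssl req -newkey rsa:2048 -nodes -subj "/CN=nifi-0" \
  -keyout node0.key -out node0.csr 2>/dev/null
openssl x509 -req -in node0.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -days 1 -out node0.crt 2>/dev/null

# Node 1: generates its own self-signed certificate and never asks the CA.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=nifi-1" -keyout node1.key -out node1.crt 2>/dev/null

# A peer that trusts only the CA validates each node certificate.
if openssl verify -CAfile ca.crt node0.crt >/dev/null 2>&1; then
  ca_signed=trusted
else
  ca_signed=rejected
fi
if openssl verify -CAfile ca.crt node1.crt >/dev/null 2>&1; then
  self_signed=trusted
else
  self_signed=rejected
fi

echo "CA-signed node cert:   $ca_signed"
echo "self-signed node cert: $self_signed"
```

With a reasonably recent openssl, the CA-signed certificate should validate and the self-signed one should not. That is the same kind of failure Java reports as a PKIX path validation error when a node's certificate does not chain to anything in the coordinator's truststore.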
Does anybody know how to proceed with that part of the chart?
Hi, I tried my best to run a two-instance cluster, but the second NiFi instance does not join the cluster. The app log shows a HEARTBEAT marshalling problem. Please find my values.yaml file (renamed to values.txt) attached to this issue.
Failed marshalling 'CONNECTION_REQUEST' protocol message due to: javax.net.ssl.SSLHandshakeException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
values.txt