newrelic / newrelic-unix-monitor

Monitoring service for Unix (AIX, Linux, HP-UX, MacOS, Solaris) systems
https://docs.newrelic.com/docs/infrastructure/host-integrations/host-integrations-list/unix-monitoring-integration/

AIX Monitoring Suddenly Stopped #67

Open robertgambs opened 4 months ago

robertgambs commented 4 months ago

All of our servers stopped reporting to NR on the same day. Something in our environment changed, but I need some help hunting down the cause. I am new to NR. We installed UnixMonitor about two years ago, and it has worked perfectly until now. Our only clue is the following error in the NR log (plugin.err):

=============================================================================================
javax.net.ssl.SSLHandshakeException: com.ibm.jsse2.util.h: PKIX path building failed: java.security.cert.CertPathBuilderException: PKIXCertPathBuilderImpl could not build a valid CertPath.; internal cause is:
        java.security.cert.CertPathValidatorException: The certificate issued by O=<COMPANY NAME>, C=US is not trusted; internal cause is:
        java.security.cert.CertPathValidatorException: Certificate chaining error
        at com.ibm.jsse2.k.a(k.java:41)
        at com.ibm.jsse2.sc.a(sc.java:531)
        at com.ibm.jsse2.bb.a(bb.java:58)
        at com.ibm.jsse2.bb.a(bb.java:192)
        at com.ibm.jsse2.cb.a(cb.java:117)
        at com.ibm.jsse2.cb.a(cb.java:44)
        at com.ibm.jsse2.bb.t(bb.java:528)
        at com.ibm.jsse2.bb.a(bb.java:185)
        at com.ibm.jsse2.sc.a(sc.java:283)
        at com.ibm.jsse2.sc.h(sc.java:355)
        at com.ibm.jsse2.sc.a(sc.java:834)
        at com.ibm.jsse2.sc.startHandshake(sc.java:415)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:396)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.upgrade(DefaultHttpClientConnectionOperator.java:193)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.upgrade(PoolingHttpClientConnectionManager.java:389)
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:429)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
        at com.newrelic.insights.publish.InsightsClient.post(InsightsClient.java:91)
        at com.newrelic.infra.publish.insights.InsightsRunner$PollAgentsRunnable.run(InsightsRunner.java:212)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:483)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:316)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:190)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1164)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:634)
        at java.lang.Thread.run(Thread.java:798)
Caused by: com.ibm.jsse2.util.h: PKIX path building failed: java.security.cert.CertPathBuilderException: PKIXCertPathBuilderImpl could not build a valid CertPath.; internal cause is:
        java.security.cert.CertPathValidatorException: The certificate issued by O=<COMPANY NAME>, C=US is not trusted; internal cause is:
        java.security.cert.CertPathValidatorException: Certificate chaining error
        at com.ibm.jsse2.util.f.a(f.java:152)
        at com.ibm.jsse2.util.f.b(f.java:90)
        at com.ibm.jsse2.util.e.a(e.java:6)
        at com.ibm.jsse2.ad.a(ad.java:134)
        at com.ibm.jsse2.ad.a(ad.java:95)
        at com.ibm.jsse2.ad.checkServerTrusted(ad.java:135)
        at com.ibm.jsse2.cb.a(cb.java:6)
        ... 27 more
Caused by: java.security.cert.CertPathBuilderException: PKIXCertPathBuilderImpl could not build a valid CertPath.; internal cause is:
        java.security.cert.CertPathValidatorException: The certificate issued by O=<COMPANY NAME>, C=US is not trusted; internal cause is:
        java.security.cert.CertPathValidatorException: Certificate chaining error
        at com.ibm.security.cert.PKIXCertPathBuilderImpl.engineBuild(PKIXCertPathBuilderImpl.java:410)
        at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:256)
        at com.ibm.jsse2.util.f.a(f.java:144)
        ... 33 more
Caused by: java.security.cert.CertPathValidatorException: The certificate issued by O=<COMPANY NAME>, C=US is not trusted; internal cause is:
        java.security.cert.CertPathValidatorException: Certificate chaining error
        at com.ibm.security.cert.BasicChecker.<init>(BasicChecker.java:116)
        at com.ibm.security.cert.PKIXCertPathValidatorImpl.engineValidate(PKIXCertPathValidatorImpl.java:205)
        at com.ibm.security.cert.PKIXCertPathBuilderImpl.myValidator(PKIXCertPathBuilderImpl.java:737)
        at com.ibm.security.cert.PKIXCertPathBuilderImpl.buildCertPath(PKIXCertPathBuilderImpl.java:649)
        at com.ibm.security.cert.PKIXCertPathBuilderImpl.buildCertPath(PKIXCertPathBuilderImpl.java:595)
        at com.ibm.security.cert.PKIXCertPathBuilderImpl.engineBuild(PKIXCertPathBuilderImpl.java:356)
        ... 35 more
Caused by: java.security.cert.CertPathValidatorException: Certificate chaining error
        at com.ibm.security.cert.CertPathUtil.findIssuer(CertPathUtil.java:327)
        at com.ibm.security.cert.BasicChecker.<init>(BasicChecker.java:113)
        ... 40 more
=============================================================================================

I suspect the certificate on my clients is no longer compatible with the server they are trying to connect to. What I don't know is where this certificate is stored or how to update it. It may also be something else altogether. Being unfamiliar with NR, I am not sure how to go about troubleshooting this.

If anyone from the community has any suggestions, I'd appreciate it.

Robert

gsidhwani-nr commented 2 months ago

Which JDK version are you using?

robertgambs commented 2 months ago

Hi. Here is the output from java -version:

java version "1.8.0_411"
Java(TM) SE Runtime Environment (build 8.0.8.26 - pap6480sr8fp26-20240529_01(SR8 FP26))
IBM J9 VM (build 2.9, JRE 1.8.0 AIX ppc64-64-Bit Compressed References 20240521_71397 (JIT enabled, AOT enabled)
OpenJ9 - 2a35f43
OMR - f3321fd
IBM - a05ee94)
JCL - 20240322_01 based on Oracle jdk8u411-b09

Thank you.

gsidhwani-nr commented 2 months ago

Please run an SSL connection test and let me know the result. The problem appears to be with the root CA certificate. Follow the instructions here: https://github.com/gsidhwani-nr/utils-public
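
For readers who want a rough idea of what such an SSL connection test involves, here is a minimal sketch in plain Java 8. It is not the tool from the linked repository. It opens a TLS connection using the JVM's default truststore, reports whether the handshake succeeds, and prints the certificate chain the server (or anything intercepting the connection) presented, so the untrusted issuer named in the stack trace can be identified. The class name `SslConnectionTest`, the helper names, and the default host `insights-collector.newrelic.com` are assumptions made for illustration; pass the host your plugin is actually configured to use. The sketch also connects directly, so it will not reflect any proxy settings the plugin may use.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;
import java.security.KeyStore;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;

public class SslConnectionTest {

    /** Delegates every decision to the JVM's default trust manager, but remembers the chain it was shown. */
    static final class RecordingTrustManager implements X509TrustManager {
        private final X509TrustManager delegate;
        X509Certificate[] lastChain = new X509Certificate[0];

        RecordingTrustManager(X509TrustManager delegate) {
            this.delegate = delegate;
        }

        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException {
            delegate.checkClientTrusted(chain, authType);
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException {
            lastChain = chain; // record first, so the chain is available even if validation fails
            delegate.checkServerTrusted(chain, authType);
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return delegate.getAcceptedIssuers();
        }
    }

    /** Builds the trust manager backed by the JVM's default truststore. */
    static X509TrustManager defaultTrustManager() throws Exception {
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init((KeyStore) null); // null selects the default truststore
        for (TrustManager tm : tmf.getTrustManagers()) {
            if (tm instanceof X509TrustManager) {
                return (X509TrustManager) tm;
            }
        }
        throw new IllegalStateException("No X509TrustManager available");
    }

    /** Follows the "Caused by" chain to the innermost exception and formats it. */
    static String rootCauseMessage(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur.getClass().getName() + ": " + cur.getMessage();
    }

    public static void main(String[] args) throws Exception {
        // Assumed default host; replace with the endpoint from your own plugin configuration.
        String host = args.length > 0 ? args[0] : "insights-collector.newrelic.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 443;

        RecordingTrustManager tm = new RecordingTrustManager(defaultTrustManager());
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, new TrustManager[] { tm }, null);

        System.out.println("java.home = " + System.getProperty("java.home"));
        try (SSLSocket socket = (SSLSocket) ctx.getSocketFactory().createSocket(host, port)) {
            socket.startHandshake();
            System.out.println("Handshake OK with " + host + ":" + port);
        } catch (SSLException e) {
            System.out.println("Handshake FAILED: " + rootCauseMessage(e));
        }

        // Print whatever chain was presented, whether or not it was trusted.
        for (X509Certificate cert : tm.lastChain) {
            System.out.println("Subject: " + cert.getSubjectX500Principal());
            System.out.println("  Issuer: " + cert.getIssuerX500Principal());
            System.out.println("  Valid until: " + cert.getNotAfter());
        }
    }
}
```

Compile and run it with the same JRE the monitor uses, for example `javac SslConnectionTest.java` followed by `java SslConnectionTest <host> 443`. If the handshake fails, the issuer of the last certificate printed is the one to compare against the contents of that JRE's truststore.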