newrelic / micrometer-registry-newrelic

ARCHIVED. TO SEND MICROMETER METRICS TO NEW RELIC, FOLLOW THE DIRECTION IN THE README.md. Micrometer registry implementation that sends data to New Relic as dimensional metrics.
Apache License 2.0

v0.10.0 Error posting metrics to New Relic, cause: javax.net.ssl.SSLHandshakeException #145

Closed brownnei closed 2 years ago

brownnei commented 2 years ago

Since upgrading to v0.10.0 from 0.9.0 we are receiving the following error in our Kubernetes pods/applications logs:

2022-08-10T06:18:42.327498998Z stdout F [2022-08-10T06:18:42.327Z] [Thread-22] [WARN] [com.newrelic.telemetry.transport.BatchDataSender] IOException (message: Error posting metrics to New Relic, cause: javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake) while trying to send data to New Relic. MetricBatch retry recommended

We have confirmed that all New Relic endpoints have been whitelisted.

We tried passing a no-SSL-verify parameter to the agent but got the same result.
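One way to narrow this kind of failure down is to attempt the TLS handshake from inside the affected pod, outside of the registry and telemetry SDK. The following is a minimal sketch, not part of this package. It assumes the US Metric API host `metric-api.newrelic.com` on port 443 (adjust for your region or proxy), and the class name `TlsHandshakeProbe` is made up for illustration.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import java.io.IOException;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class TlsHandshakeProbe {
    // Assumption: US Metric API host. Pass a different host as the first argument if needed.
    static final String DEFAULT_HOST = "metric-api.newrelic.com";

    /** Protocols the JVM's default SSLContext enables for outgoing connections. */
    static String[] enabledProtocols() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters().getProtocols();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    /** Performs a TLS handshake and returns "protocol / cipher suite"; throws if it fails. */
    static String probe(String host, int port) throws IOException {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            socket.setSoTimeout(10_000); // fail rather than hang if the peer stops responding
            socket.startHandshake();
            return socket.getSession().getProtocol() + " / " + socket.getSession().getCipherSuite();
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : DEFAULT_HOST;
        System.out.println("Enabled protocols: " + Arrays.toString(enabledProtocols()));
        try {
            System.out.println("Handshake OK: " + probe(host, 443));
        } catch (IOException e) {
            // An SSLHandshakeException here reproduces the problem without the SDK involved.
            System.out.println("Handshake failed: " + e);
        }
    }
}
```

If the probe fails the same way, the cause is more likely the JVM's TLS setup or something on the network path than the registry version. Running it with the JDK's `-Djavax.net.debug=ssl:handshake` system property prints the handshake messages for closer inspection.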

Does anyone know how the Metric API endpoint URL is configured in this package? Is it correct?

kford-newrelic commented 2 years ago

@brownnei can you confirm: if you revert to 0.9.0, do the errors go away?

tmancill commented 2 years ago

In case this helps... I recently encountered similar behavior, with SSL handshake errors after upgrading from Java 11 to Java 17. For my use case, the error only manifests when using Conscrypt in FIPS mode. If I switch to BouncyCastle in FIPS mode or use the JDK-provided SSL, the error goes away. I don't see any difference in behavior between 0.9.0 and 0.10.0.

With Conscrypt on Java 17, the issue is:

INFO  [2022-08-26 18:42:05,004] com.newrelic.telemetry.TelemetryClient: [MetricBatch] - Batch sending failed. Backing off 0 MILLISECONDS
WARN  [2022-08-26 18:42:05,006] com.newrelic.telemetry.transport.BatchDataSender: IOException (message: Error posting metrics to New Relic) while trying to send data to New Relic. MetricBatch retry recommended

A colleague pointed out that the behavior with Java 17 could be related to the changes made to Socket in Java 16 for JEP 380.
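When comparing behavior across Conscrypt, BouncyCastle, and the JDK-provided SSL, it helps to confirm which provider the JVM actually selects for TLS. The sketch below uses only the standard `java.security` and `javax.net.ssl` APIs. The class name `ActiveTlsProvider` is made up for illustration, and it assumes the HTTP client in use relies on the JVM's default provider lookup, which may not hold if the client is configured with its own `SSLContext`.

```java
import javax.net.ssl.SSLContext;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

public class ActiveTlsProvider {
    /** Name of the provider that serves SSLContext.getInstance("TLS"). */
    static String tlsProviderName() {
        try {
            return SSLContext.getInstance("TLS").getProvider().getName();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No provider offers a TLS SSLContext", e);
        }
    }

    /** Names of all registered providers offering an SSLContext "TLS" service, in preference order. */
    static List<String> tlsCapableProviders() {
        List<String> names = new ArrayList<>();
        for (Provider provider : Security.getProviders()) {
            if (provider.getService("SSLContext", "TLS") != null) {
                names.add(provider.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("TLS-capable providers: " + tlsCapableProviders());
        System.out.println("Provider selected for TLS: " + tlsProviderName());
    }
}
```

Running this on both Java 11 and Java 17 with the same classpath shows whether the selected provider changed along with the JDK upgrade.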

kford-newrelic commented 2 years ago

@tmancill thanks for sharing this insight

For general awareness, how do you switch between using Conscrypt & BouncyCastle?

thomasveale commented 2 years ago

I'm not a Java expert, but according to some docs from Oracle there are two ways to install a provider (an SPI implementation). In this case, I assume we're targeting the JSSE (Java Secure Socket Extension), right @tmancill?

The first and simplest is to install the JAR file containing the provider classes as an installed or bundled extension.

This means we add the dependency to our project via our build system, which downloads the appropriate package from a published repository such as Maven Central.

Still testing this out on our end and there may be additional steps to enable the provider after it is installed.

The docs above mention editing the static java.security file in your JRE. You can accomplish effectively the same thing at runtime via the Java Security API.

import java.security.Provider;
import java.security.Security;
import java.util.Map;

import org.bouncycastle.jsse.provider.BouncyCastleJsseProvider;

public class UseBouncyCastleJSSE {
    public static void main(String[] args) {
        // Per the docs, never place this before the SUN/Oracle providers (positions 1 and 2).
        // The boolean constructor argument requests FIPS mode.
        Security.insertProviderAt(new BouncyCastleJsseProvider(true), 3);

        // Print every registered provider and its properties to confirm the ordering.
        for (Provider provider : Security.getProviders()) {
            System.out.println("Provider name: " + provider.getName());
            System.out.println("Provider information: " + provider.getInfo());
            System.out.println("Provider version: " + provider.getVersion());
            for (Map.Entry<Object, Object> entry : provider.entrySet()) {
                System.out.println("Property entry: " + entry);
            }
        }
    }
}

Example code shamelessly ripped from this comment
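To switch between providers without recompiling, one option is to pick the provider class by name at startup. This is only a sketch under several assumptions: the system property `jsse.choice` and the class `SwitchJsseProvider` are made up for illustration, `org.conscrypt.OpenSSLProvider` is assumed to be Conscrypt's provider class with a public no-argument constructor, and the no-argument constructors used here do not request FIPS mode the way `new BouncyCastleJsseProvider(true)` does.

```java
import java.security.Provider;
import java.security.Security;

public class SwitchJsseProvider {
    /** Maps a hypothetical choice keyword to a provider class name, or null for the JDK default. */
    static String classNameFor(String choice) {
        switch (choice) {
            case "bouncycastle":
                return "org.bouncycastle.jsse.provider.BouncyCastleJsseProvider";
            case "conscrypt":
                return "org.conscrypt.OpenSSLProvider"; // assumption, see the note above
            default:
                return null; // keep whatever the JDK ships with
        }
    }

    /**
     * Instantiates the provider class reflectively and inserts it at the given
     * 1-based position. Returns false if the class cannot be loaded or created.
     */
    static boolean tryInstall(String className, int position) {
        try {
            Provider provider =
                    (Provider) Class.forName(className).getDeclaredConstructor().newInstance();
            Security.insertProviderAt(provider, position);
            return true;
        } catch (ReflectiveOperationException | ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical switch: -Djsse.choice=bouncycastle | conscrypt | jdk
        String choice = System.getProperty("jsse.choice", "jdk");
        String className = classNameFor(choice);
        if (className != null && !tryInstall(className, 3)) {
            System.out.println("Could not load " + className + "; is the JAR on the classpath?");
        }
        for (Provider provider : Security.getProviders()) {
            System.out.println(provider.getName());
        }
    }
}
```

Because the provider is loaded reflectively, the same build can run with either JAR, or neither, on the classpath, which makes it easier to compare the handshake behavior of each provider in turn.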

workato-integration[bot] commented 2 years ago

https://issues.newrelic.com/browse/NEWRELIC-4093

kford-newrelic commented 2 years ago

@brownnei we're going to assume that with the info above, you were able to resolve this issue. If this isn't correct, we'd like to know whether reverting back to 0.9.0 resolves the problem or not.