newrelic / newrelic-java-agent

The New Relic Java agent
Apache License 2.0
202 stars 144 forks source link

Http request calls failing with agent 8.0.0 #1142

Closed jfourment closed 1 year ago

jfourment commented 1 year ago

Description

Functionalities relying on HTTP calls started failing with the new version of the NewRelic Java agent across multiple apps. Reverting to 7.9.x resolve those issues.

Maybe related to latest change to HttpURLConnection instrumentation (as seen in release notes)

Expected Behavior

Instrumentation to not break Http calls :)

[Troubleshooting]

All failures in the code show one these 2 exceptions:

java.net.ProtocolException: Cannot write output after reading input.
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1338)
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1317)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:264)
Stream is closed; nested exception is java.io.IOException: Stream is closed
    at org.springframework.security.oauth2.client.oidc.authentication.OidcAuthorizationCodeAuthenticationProvider.getResponse(OidcAuthorizationCodeAuthenticationProvider.java:176)
    at org.springframework.security.oauth2.client.oidc.authentication.OidcAuthorizationCodeAuthenticationProvider.authenticate(OidcAuthorizationCodeAuthenticationProvider.java:144)

Your Environment

Failures have been recorded on different environments running either on temurin-jdk8 or correto-jdk8, and a range of Spring version (I can elaborate if needed). Issues only appear with latest version of the NewRelic Agent v8.0.0

Additional context

We have seen ‘Content-Length’ header disappear from certain outgoing HTTP requests from our applications post the update as well, which has resulted in external servers rejecting our requests with HTTP 411 errors. Reverting the version to 7.x.x resolved this.

workato-integration[bot] commented 1 year ago

https://issues.newrelic.com/browse/NEWRELIC-6603

felizi commented 1 year ago

Same problem here with all Feign integrations. The adopted solution was change the default library of Feign to use Apache HC5 (spring.cloud.openfeign.httpclient.hc5.enabled): https://docs.spring.io/spring-cloud-openfeign/docs/current/reference/html/#spring-cloud-feign-overriding-defaults

jasonjkeller commented 1 year ago

@jfourment Any chance you could provide a code snippet showing how exactly you are making use of the HttpURLConnection APIs when the exception occurs?

workato-integration[bot] commented 1 year ago

Jira CommentId: 146922 Commented by jkeller:

Related: https://forum.newrelic.com/s/hubtopic/aAX8W0000008dhfWAA/java-agent-800-braintree-call-failure

jfourment commented 1 year ago

Hi @jasonjkeller

HttpURLConnection is used indirectly, we observed errors with the 2 OAuth libraries we're using.

com.github.scribejava-core 6.9.0

java.net.ProtocolException: Cannot write output after reading input. at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1338) at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1317) at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:264) at com.github.scribejava.core.httpclient.jdk.JDKHttpClient.prepareConnectionForBodyAndGetOutputStream(JDKHttpClient.java:269) at com.github.scribejava.core.httpclient.jdk.JDKHttpClient.addBody(JDKHttpClient.java:195) at com.github.scribejava.core.httpclient.jdk.JDKHttpClient.access$100(JDKHttpClient.java:26) at com.github.scribejava.core.httpclient.jdk.JDKHttpClient$BodyType$1.setBody(JDKHttpClient.java:147) at com.github.scribejava.core.httpclient.jdk.JDKHttpClient.doExecute(JDKHttpClient.java:129) at com.github.scribejava.core.httpclient.jdk.JDKHttpClient.execute(JDKHttpClient.java:95) at com.github.scribejava.core.oauth.OAuthService.execute(OAuthService.java:114) at com.github.scribejava.core.oauth.OAuth20Service.sendAccessTokenRequestSync(OAuth20Service.java:46) at com.github.scribejava.core.oauth.OAuth20Service.refreshAccessToken(OAuth20Service.java:160) ...

spring-security-oauth2-client 5.7.5

org.springframework.security.oauth2.core.OAuth2AuthenticationException: [invalid_token_response] An error occurred while attempting to retrieve the OAuth 2.0 Access Token Response: I/O error on POST request for "https://xxx.xxx.com/oauth2/v1/token": Stream is closed; nested exception is java.io.IOException: Stream is closed at org.springframework.security.oauth2.client.oidc.authentication.OidcAuthorizationCodeAuthenticationProvider.getResponse(OidcAuthorizationCodeAuthenticationProvider.java:176) at org.springframework.security.oauth2.client.oidc.authentication.OidcAuthorizationCodeAuthenticationProvider.authenticate(OidcAuthorizationCodeAuthenticationProvider.java:144) at org.springframework.security.authentication.ProviderManager.authenticate(ProviderManager.java:182) at org.springframework.security.oauth2.client.web.OAuth2LoginAuthenticationFilter.attemptAuthentication(OAuth2LoginAuthenticationFilter.java:192) at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:222) at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:212) at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336) at org.springframework.security.oauth2.client.web.OAuth2AuthorizationRequestRedirectFilter.doFilterInternal(OAuth2AuthorizationRequestRedirectFilter.java:178)

Hope that helps.

claitonmarcilio commented 1 year ago

Hi @jasonjkeller ! Using Feign I am able to simulate the issue with the following code (with the agent properly configured):

@RunWith(JUnit4.class)
public class FeignClientTest {

    interface EchoAPI {

        @RequestLine("POST /post")
        @Headers({
            "Accept: application/json",
            "Content-Type: application/json"
        })
        String echo(String body);
    }

    @Test
    @Trace(dispatcher = true)
    public void testEchoAPI() {
        EchoAPI api = Feign.builder().target(EchoAPI.class, "https://postman-echo.com");
        String response = api.echo("response body will be this");
        assert response.contains("response body will be this");
    }
}
jasonjkeller commented 1 year ago

Thanks for the additional info @jfourment @claitonmarcilio ! I'll dig into it.

jasonjkeller commented 1 year ago

@claitonmarcilio I stuck the example code that you provided into a simple Java app, attached the 8.0.0 Java agent, and it didn't seem to repro the issue. I tried with spring-cloud-starter-openfeign:3.1.3 and spring-cloud-starter-openfeign:4.0.1 as well as Java 8/11/17 and in all cases it worked as expected and resulted in the external calls being recorded properly and the Content-Length header was also present.

Got response: {"args":{},"data":"response body will be this","files":{},"form":{},"headers":{"x-forwarded-proto":"https","x-forwarded-port":"443","host":"postman-echo.com","x-amzn-trace-id":"Root=1-63e59a03-1589ea1a123abc","content-length":"26","accept":"application/json","content-type":"application/json","tracestate":"310705@nr=0-0-998879-37949771-4c2572c616c793b0-123abc-1-1.718596-1675991555833","traceparent":"00-123abc-4c2572c616c793b0-01","newrelic":"eyJkIjp7ImFjIjoiOTk4ODc123abcTc3MSJ9LCJ2IjpbMCwxXX0=","user-agent":"Java/17.0.3"},"json":null,"url":"https://postman-echo.com/post"}
Screen Shot 2023-02-09 at 5 17 29 PM

Is there some other factor in your repro that I'm missing? What version of spring-cloud-starter-openfeign are you using when the issue is repro'd? What JDK vendor/version? Any special config that triggers it? If you're able to just a link to a repo (or a zip file) of a project that can be run that would be super helpful.

Thanks!

claitonmarcilio commented 1 year ago

Hi @jasonjkeller! Sure, I've created this repo where you can find that example. I also configured a GitHub Actions workflow that is testing two versions, you can see the results here.

jasonjkeller commented 1 year ago

Thanks a ton for that @claitonmarcilio ! Much appreciated!

I see now what the difference is in our individual tests that causes it to fail for you but pass for me. In my environment I have the agent configured to use distributed tracing, whereas your environment has the agent configured to use the legacy cross application tracing with DT disabled.

If you enable distributed tracing as follows the issue shouldn't be present. I'll dig into why legacy CAT was broken by the instrumentation changes.

  # ============================ DISTRIBUTED TRACING ===========================
  distributed_tracing:
    enabled: true
    exclude_newrelic_header: false

  # ============================ CAT ==========================================
  cross_application_tracer:
    enabled: false
jfourment commented 1 year ago

Hi, thanks @claitonmarcilio , it looks like you're onto something @jasonjkeller . I can also confirm we have legacy CAT enabled in our applications (and distributed_tracing disabled)

jasonjkeller commented 1 year ago

@claitonmarcilio @jfourment I believe I've got the issue fixed. There's a custom agent jar here if you want to test it out, otherwise we're hoping to get a bug fix release out soon.

Thanks again for all of your help!

workato-integration[bot] commented 1 year ago

Jira CommentId: 155202 Commented by jkeller:

Issues should be fixed by the following PR and released in Java agent 8.0.1: https://github.com/newrelic/newrelic-java-agent/pull/1145

jasonjkeller commented 1 year ago

The Java agent 8.0.1 release is out now: https://docs.newrelic.com/docs/release-notes/agent-release-notes/java-release-notes/java-agent-801/