Closed jasonjkeller closed 5 months ago
Ran a simple test app that just makes a bunch of external calls using the Ning async HTTP client.
implementation 'com.ning:async-http-client:1.9.30'
implementation 'com.newrelic.agent.java:newrelic-api:8.11.0'
package org.example;
import com.newrelic.api.agent.NewRelic;
import com.newrelic.api.agent.Trace;
import com.newrelic.api.agent.Transaction;
import com.ning.http.client.AsyncHttpClient;
import com.ning.http.client.Response;
import java.util.concurrent.Future;
public class Main {
public static void main(String[] args) {
while (true) {
makeNingExternalCall();
}
}
// @Trace(dispatcher = true)
public static void makeNingExternalCall() {
try (AsyncHttpClient asyncHttpClient = new AsyncHttpClient()) {
Future<Response> f = asyncHttpClient.prepareGet("https://example.com/").execute();
Response r = f.get();
Transaction transaction = NewRelic.getAgent().getTransaction();
System.out.println("txn: " + transaction);
} catch (Exception ignored) {
}
}
}
When the external calls were made with no active transaction (using 8.11.0 agent), the old gen heap continuously grew indicating that the memory leak condition was present as hypothesized.
When the external calls were made with an active transaction (using 8.11.0 agent), the old gen heap exhibited a saw tooth pattern indicating that the memory was being reclaimed and that there was no memory leak occurring.
When the external calls were made with no active transaction (using 8.12.0-SNAPSHOT agent with memory leak fix), the old gen heap exhibited a saw tooth pattern indicating that the memory was being reclaimed and that there was no memory leak occurring.
A memory leak was reported by users of the Ning async HTTP client: https://github.com/newrelic/newrelic-java-agent/issues/783
The changes on the ning-fix branch were reported to resolve the memory leak.
The change was to simply get the
statusCode
andreasonMessage
from theresponseStatus
andnull
the reference to theresponseStatus
earlier, before we ever checked if thesegment
wasnull
or not.Given that the changes seem to fix the leak, it seems to indicate that there must be some condition where the
onThrowable
/onCompleted
ning methods are called and thesegment
isnull
.The only place where a value is assigned for the
segment
is in theAsyncHttpProvider#execute
weave class/method. It appears that the only scenarios where thesegment
would benull
is if there is no active transaction or if the URL from thecom.ning.http.client.Request
has ascheme
that is notnull
and does not equalhttp
orhttps
: https://github.com/newrelic/newrelic-java-agent/blob/edf5bea7e5ba0304a8804920e86cde5381c1b87c/instrumentation/ning-async-http-client-1.6.1/src/main/java/com/ning/http/client/AsyncHttpProvider.java#L35-L44The Ning async HTTP client instrumentation does not start its own transaction, so the most likely scenario is that the client is being called by uninstrumented code when the memory leak occurs. This would mean that each
com.ning.http.client.AsyncHandler
instance would have aNewField
weaved into it which would assign a reference to aHttpResponseStatus
every timeonStatusReceived
is called but the reference would never be freed since there is no activesegment
. If this is the only issue then it is effectively solved by the ning-fix changes. It should be easy to repro a scenario where the async http client is called but no transaction is created to see if that manifests a memory leak.That said, it might be worth investigating if there are other
scheme
types that should be supported. This seems unlikely, as we typically just support http/s, but perhaps there's a slightly different scheme identifier for http2.