Open prollin opened 4 years ago
Just a note, unlike the async version, the version which uses .Result
not sleep for 10 milliseconds on each run. Replacing that with Task.Delay
with Thread.Sleep(10)
changes the rate at which allocations are done.
Clearly there is something fishy, but might not be directly related to the handler, since it will just returns the task, nevertheless I'll do some investigation.
@mandel-macaque I did some more investigation and It looks like it is leaking memory allocated in SecCertificateCreateWithData
and SecCertificateCreateWithBytes
, both downstream from System.Mono.AppleTls.AppleTslContext.ProcessHandshake
@prollin any clear indication of a method to remove this or avoid this from happening?
Unfortunately I haven't. It seems to be an issue deeply rooted in iOS. My up to date guess is that the app is getting "charged" for the memory used by the SSL certificate cache managed by the OS. This cache will grow up to a certain size but it is possible that there is 1 cache per session (we do use a few httpclient in our app). Note that there is still quite a bit of speculation in my findings.
@mandel-macaque any chance on your end?
I'm taking a look atm, sorry I was side tracked by other issues. Will try to share some light ASAP.
The following are some technical details of the investigation which will later be added in the PR with the fix.
I have ran the sample application with the following configurations (images attached as well as the profiler files to prove the issue)
There is no memory leak as you can see in the profiler image (all profiling files are added at the end of the comment in a .zip
Same as with the managed implementation there is no leak.
There is a bigger memory leak in the sync implementation than there is in the async one.
Async profiled:
Sync profiled:
Conclusion: Issue confirmed.
The stack traces of the leak objects are the same, which means that clearly some threading was involved in the issue, on top of that @Therzok comment points to an issue with Tasks delays and threads. That comment already made be suspect of line https://github.com/xamarin/xamarin-macios/blob/master/src/Foundation/NSUrlSessionHandler.cs#L690
Once you see that line, and compare the memory that is leaked (it is a memory stream from the HttpContent) clearly points to the fact that the callback for the HACK is the one leaking memory. The lambda (in this case a closure since it captures the variable) is keeping the captured variable alive and the GC is not collecting it :/
The issue resides in the fact that we are using a boolean in the inflight data to identify if the response was set, and that is wrong. The tcs already handles this correctly and the Try* methods are safe. This allows the removal of the lock, which was the one making the TrySetResult block. There is no need to set the boolean since it can be removed and we can use the tcs methods to assert if the value was set.
The following are the results of profiling both implementations, as you can see, with the fix both the memory in the sync and the async versions of the app grow the same.
Sync profiled:
Async profiled
The fix is proposed but since @prollin has done such an amazing job I though it would be nice to add a nice explanation of what was going on :)
The following are the profiling sessions showing that the fix does what I claim
Broken:
Fix: Fix.zip
Other handlers: other.zip
PS: Thanks to @Therzok the Task.Delay to Thread.Sleep already gave me a good indication where to look.
@mandel-macaque this is an amazing investigation! Thank you.
What is interesting is that I did test with the different handlers initially and could still see memory usage growing (unfortunately I do not have access to Xamarin Profiler). I just re-tested my repro case with CFNetworkHandler and I still see memory going up (in activity monitor instrument) but at a much lower rate than with NSURLSessionHandler. This could just be the SSL cache I mentioned earlier.
The test with CFNetworkHandler was flawed since the server I tested all returned error status (505 or 400). Is CFNetworkHandler fully deprecated now?
@mandel-macaque any idea when this fix will be released? There seem to be no simple workaround to this issue in the meantime since the other handlers do not work.
@prollin as soon as the PR is approved and merged I will do a cherry-pick so that the fix is in d16-7.
The CFNetwork is not deprecated, if it is failing, can you please create an issue, AFAIK it should work. I wont be adding new features to CFNetwork but will fix bugs.
The smaller leak can be due to several things, one is that the HttpClient does keep connections open to reuse them, the other is that if you are using caching, there is a NSData that I cannot dispose, if you are not, it should be disposed (and is an enhancement for NSUrlSessionHandler that I'll be working on https://github.com/xamarin/xamarin-macios/issues/8558).
@prollin one thing, since NSUrlSessionHandler is opensource, you can always copy the fix and add it to your project ASAP.
@rolfbjarne brought to my attention that I uploaded the wrong profile for the broken version, here is the correct one:
@mandel-macaque quick update: since copying the fix directly opened another can of worms, I tried to use NSUrlSession API directly; I was still seeing memory usage growing significantly with a test similar to the one above:
Task.Run(() =>
{
while (true)
{
using (var url = new NSUrl("https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png"))
{
using (var dataTask = this.nsUrlSession.CreateDataTask(url, (data, response, e) => { sem.Release(); }))
{
dataTask.Resume();
sem.WaitOne();
}
}
}
});
Looking at Instrument, the leaked object were all iOS Core Foundation types like CFString
and CFHTTPMessage
prompting me to believe that it could be an autorelease pool issue.
After some research I stumbled on this old thread: https://forums.xamarin.com/discussion/6404/memory-leaks-and-nsautoreleasepool
that seem to indicate that API that cache tasks could leak.
Adding an NSAutoreleasePool
to the test above does indeed solves the issue:
Task.Run(() =>
{
while (true)
{
using (var autoreleasePool = new NSAutoreleasePool())
{
using (var url = new NSUrl("https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png"))
{
using (var dataTask = this.nsUrlSession.CreateDataTask(url, (data, response, e) => { sem.Release(); }))
{
dataTask.Resume();
sem.WaitOne();
}
}
}
}
});
After this I simply added the NSAutoreleasePool in our production code that was causing issues (no other changes) and ran it overnight: memory did grow a bit (probably from the NSUrlSessionHandler leak you fixed + other small issues we have) but never came close to what it was before.
@prollin So we were leaking in two diff places! Ok, I'll deal with that part too.
Calling
HttpClient.GetAsync(url).Result
in a loop causes process memory to grow continuously:Using async/await seem to solve the issue (but unless I missed something obvious, the synchronous version shouldn't leak):
Steps to Reproduce
Expected Behavior
Process memory doesn't increase continuously
Actual Behavior
Process memory increases continuously
Environment
Build Logs
build.log
Example Project (If Possible)
HttpClientMemoryLeak.zip