fsprojects / FSharp.TypeProviders.SDK

The SDK for creating F# type providers
https://fsprojects.github.io/FSharp.TypeProviders.SDK/
MIT License
298 stars · 94 forks

[WIP] Added strong cache for ILModuleReaders #283

Closed TIHan closed 5 years ago

TIHan commented 5 years ago

This adds a strong cache for ILModuleReaders instead of a weak cache. While we do a File.ReadAllBytes and the byte array goes on the LOH, the readers are long-lived, so a strong cache is appropriate. We include a last-write time to invalidate the cache in case a user changes the file, but the files do not change frequently.
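The approach described above can be sketched as follows. This is a minimal illustration, not the SDK's actual implementation: `StrongFileCache` and its `readFile` parameter are hypothetical names standing in for whatever constructs an ILModuleReader from the file bytes.

```fsharp
open System
open System.IO
open System.Collections.Concurrent

/// Sketch of a strong cache keyed by file path, invalidated when the
/// file's last write time changes. Entries are held strongly, so they
/// survive until explicitly cleared.
type StrongFileCache<'T>(readFile: string -> 'T) =
    let table = ConcurrentDictionary<string, DateTime * 'T>()

    member _.Get(path: string) =
        let lastWrite = File.GetLastWriteTimeUtc path
        match table.TryGetValue path with
        | true, (stamp, value) when stamp = lastWrite -> value
        | _ ->
            // Stale or missing entry: re-read the file and replace it.
            let value = readFile path
            table.[path] <- (lastWrite, value)
            value

    /// Drops all entries, releasing the strong references.
    member _.Clear() = table.Clear()
```

A second `Get` for an unchanged file returns the cached value without touching the reader factory again; touching the file's timestamp forces a re-read.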

TIHan commented 5 years ago

@dsyme this isn't quite complete because now this is effectively a memory leak. We need to figure out when we can clear it out and have some sort of life-cycle for it.

dsyme commented 5 years ago

> @dsyme this isn't quite complete because now this is effectively a memory leak. We need to figure out when we can clear it out and have some sort of life-cycle for it.

Yup. But looking good :)

dsyme commented 5 years ago

@TIHan I resolved the merge conflict for you (I hope I didn't make a mistake)

blumu commented 5 years ago

So here is the result of my testing with a private build of FSharp.Data provided by @TIHan based on this fix:

davkean commented 5 years ago

I also looked at the trace - the trace is too long (3 minutes), which meant that we threw out events. Traces need to be less than 30 seconds.

@blumu Can you turn Diagnostic data up to "Full" in the Diagnostics & Feedback settings page? Your machine isn't sending any performance/Watson data, so I can't see any past crashes, UI delays, or typing delays.

davkean commented 5 years ago

Never mind, I found some data through other means; I'm just digging through it now.

davkean commented 5 years ago

@TIHan We don't need a trace to see GCs; if you look at the vs/perf/clrpausemsecs event in the session, it shows GC sizes/times. We send an event for any GC over 500 ms. I don't yet have results for the above session, but I looked at past sessions, and GC time, in particular Gen 2, is almost entirely the cause of your delays. The Gen 2 + Large Object Heap are both huge, resulting in GC after GC after GC, each causing large delays of upwards of 2 seconds.

When the results for the above session are in, in a day or so, @TIHan should be able to see whether it's still GC that is causing the issue.

dsyme commented 5 years ago

@TIHan We discussed using MemoryCache to allow resources to be reclaimed via a sliding window.

Unfortunately System.Runtime.Caching is not in .NET 4.5 nor .NET Standard 2.0, and TP design time (TPDTC) components are nearly always now compiled as one or both of those, to allow them to deploy into most active tooling.

So I will just accept this PR. Additionally, I have added an off-by-default resource reclaimer that clears the reader table every minute while type providers are in active use. To enable it, set the environment variable

FSHARP_TPREADER_AUTOCLEAR

to any non-empty value. It is likely that the fixes we already have will be enough to reduce memory usage sufficiently.
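The opt-in reclaimer described above might look roughly like this. This is a hedged sketch, not the SDK's code: `startAutoClear` and its `clear` parameter are illustrative names; only the environment variable name and the one-minute period come from the discussion.

```fsharp
open System
open System.Threading

/// Sketch of an opt-in background reclaimer. 'clear' stands in for
/// whatever empties the reader table; clearing is safe because the next
/// access simply re-reads the file and repopulates the cache.
let startAutoClear (clear: unit -> unit) =
    let enabled =
        Environment.GetEnvironmentVariable "FSHARP_TPREADER_AUTOCLEAR"
        |> String.IsNullOrEmpty
        |> not
    if enabled then
        // Fire every minute; the caller keeps the timer alive and
        // disposes it on shutdown.
        Some (new Timer((fun _ -> clear ()), null, 60_000, 60_000))
    else
        None
```

With the variable unset the function is a no-op, matching the off-by-default behaviour described above.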

davkean commented 5 years ago

A cache without an invalidation policy is a memory leak. Visual Studio spans solution loads, project loads, and branch switches, all of which could result in this cache growing unbounded.

cartermp commented 5 years ago

The TPSDK should no longer support net45 if we cannot have a proper solution here.

davkean commented 5 years ago

I don't think a sliding window is a great approach either; it should be tied to the lifetime of something: project data, or the TPs themselves.

blumu commented 5 years ago

I tested the fix of FSharp.Data (3.0.1) provided by @dsyme based on this version of FSTPSDK and I can confirm that the problem went away. See https://github.com/fsharp/FSharp.Data/issues/1249 for details

@davkean I am not sure if it's relevant, but for us the issue repros systematically right after launching VS and calling "Find all references" for the first time, which suggests that caching would help even within the scope of a single build. In other words, even invalidating the cache on every build would still resolve the perf issue for us. I can see how this would break incremental builds, though, so perhaps there is a better way to scope the cache, as you suggested.

dsyme commented 5 years ago

> A cache without an invalidation policy is a memory leak. Visual Studio spans solution loads, project loads, and branch switches, all of which could result in this cache growing unbounded.

Yes, we should clear here. For example, we could implement an expiration policy by starting a background async to clear the cache under some policy. Switching to System.Runtime.Caching.MemoryCache is not feasible, as TP components realistically need to target netstandard2.0, and it's not available there AFAICS.

> I don't think a sliding window is a great approach either; it should be tied to the lifetime of something: project data, or the TPs themselves.

The problem is that we need to artificially extend the lifetime of these objects so they are shared between TP instances. We don't have any larger handle/scope through which to share them.
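The sliding-window idea can be sketched without System.Runtime.Caching by stamping each entry with its last access time and sweeping idle entries periodically. This is an illustrative sketch only; `SlidingCache`, `GetOrAdd`, and `Sweep` are hypothetical names, not SDK API.

```fsharp
open System
open System.Collections.Concurrent

/// Sketch of a sliding-window cache: each entry remembers its last
/// access time, and Sweep evicts entries idle longer than the window.
type SlidingCache<'K, 'V when 'K: equality>(window: TimeSpan) =
    let table = ConcurrentDictionary<'K, DateTime ref * 'V>()

    member _.GetOrAdd(key: 'K, create: 'K -> 'V) =
        let stamp, value =
            table.GetOrAdd(key, fun k -> (ref DateTime.UtcNow, create k))
        stamp.Value <- DateTime.UtcNow    // slide the window on every access
        value

    /// Call periodically (e.g. from a background async or timer)
    /// to drop entries that have gone idle.
    member _.Sweep() =
        let now = DateTime.UtcNow
        for KeyValue(key, (stamp, _)) in table do
            if now - stamp.Value > window then
                table.TryRemove key |> ignore
```

This keeps the readers shared between TP instances while they are in active use, which is exactly the lifetime problem noted above: there is no larger scope to tie them to, so idle time is the only available eviction signal.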

dsyme commented 5 years ago

@davkean I'll arrange for a better fix here. BTW I notice FSharp.Data itself has a bunch of caches too. It would be so good if we could truly load/unload components in isolation...

dsyme commented 5 years ago

@davkean Please check out #295 to see if you think it meets your requirements/expectations as a tactical fix