microsoft / ManagedEsent

MIT License

Esent & VirtualAlloc Calls - Memory Leak? #46

Closed robertmuehsig closed 3 years ago

robertmuehsig commented 3 years ago

We use the ManagedEsent library, and therefore Esent, in our desktop application. It serves as a cache for all kinds of data, e.g. user data such as names, binary data such as images, and larger binary data such as Word documents. The library has served us well for the last couple of years. Recently, two customers noted that some clients demand more and more memory. Both customers use Citrix, and our first thought was that something must be wrong with Citrix. Unfortunately, the issue can't be easily reproduced: it is not always the same user, and we couldn't find a pattern.

After hours of investigation we finally got a Windows Performance .etl file, and it seems that Esent claims more and more memory: image

Line #, Process, Commit Stack, Commit Time (s), Decommit Time (s), Address, Count, Size (MB)
2, , [Root]/ntdll.dll!RtlUserThreadStart/kernel32.dll!BaseThreadInitThunk/ntdll.dll!TppWorkerThread/ntdll.dll!TppTimerpExecuteCallback/ntdll.dll!RtlpTpTimerCallback/esent.dll!COSTimerQueueEntry::Completion_/esent.dll!BFIMaintCacheSizeITask/esent.dll!ErrBFICacheGrow/esent.dll!ErrBFICacheISetSize/esent.dll!ErrBFICacheISetDataSize/esent.dll!OSSYNC::FPageCommit/KernelBase.dll!VirtualAlloc/hmpalert.dll!<PDB not found>/ntdll.dll!NtAllocateVirtualMemory/ntoskrnl.exe!KiSystemServiceCopyEnd/ntoskrnl.exe!NtAllocateVirtualMemory/ntoskrnl.exe!MiAllocateVirtualMemory, 0.885232700, 1.683456200, 0x0000028B9C2D0000, 1, 512.000
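When a trace export contains many rows like the one above, it can help to total the committed size for stacks that pass through esent.dll. The following is a minimal Python sketch, assuming the rows are exported as text with exactly the eight comma-separated columns shown in the header. The helper names `parse_row` and `esent_commit_mb` are hypothetical, and `SAMPLE` is the row above with its stack abbreviated to two frames.

```python
# Sketch only: assumes the eight-column text layout shown in the header row.
# SAMPLE is the row from the trace with the commit stack abbreviated.
SAMPLE = ("2, , [Root]/esent.dll!ErrBFICacheGrow/KernelBase.dll!VirtualAlloc, "
          "0.885232700, 1.683456200, 0x0000028B9C2D0000, 1, 512.000")


def parse_row(row):
    """Split one exported row into named fields."""
    # The last five columns are plain values, so peel them off from the right;
    # what remains on the left is the line number, process, and commit stack.
    head, commit_t, decommit_t, address, count, size_mb = row.rsplit(", ", 5)
    line_no, process, stack = head.split(", ", 2)
    return {
        "line": int(line_no),
        "process": process.strip(),
        "commit_stack": stack.split("/"),
        "commit_time_s": float(commit_t),
        "decommit_time_s": float(decommit_t),
        "address": int(address, 16),
        "count": int(count),
        "size_mb": float(size_mb),
    }


def esent_commit_mb(rows):
    """Sum the Size (MB) column for rows whose stack contains an esent.dll frame."""
    total = 0.0
    for row in rows:
        parsed = parse_row(row)
        if any(frame.startswith("esent.dll!") for frame in parsed["commit_stack"]):
            total += parsed["size_mb"]
    return total


if __name__ == "__main__":
    print(esent_commit_mb([SAMPLE]))
```

Comparing the commit and decommit times per row also shows whether an allocation was released again during the trace, which helps tell steady growth apart from repeated grow-and-shrink cycles.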

The CPU is also quite busy with those calls:

image

Be aware that this is a server with "lots" of memory, but a single client that claims all of it is problematic.

Question: Am I reading the .etl file correctly, i.e. is Esent the root cause of the memory allocation? What do "BFIMaintCacheSizeITask" and "ErrBFICacheISetSize" do? I couldn't find anything about them; maybe someone from the ManagedEsent team knows more. I found a CacheSize property in the docs, but I'm not 100% sure whether it would solve the problem or how to apply it: I can't find it in the InstanceParameters property of Microsoft.Isam.Esent.Interop.Instance.

machish commented 3 years ago

The BF functions have to do with the buffer cache. Yes, you'll probably want to set CacheSizeMax. https://docs.microsoft.com/en-us/windows/win32/extensible-storage-engine/systemparameters.cachesizemax-property

Basically: The database cache starts at zero, and grows very aggressively up to CacheSizeMin. Once it's there, it can grow up to CacheSizeMax if there's available memory on the system. It should stay between CacheSizeMin and CacheSizeMax. It is supposed to back off when another process needs it more.

The cache-sizing algorithm uses LRU-K. If the database is small, then there's no need for it to grow very large. If the database IS large, but you only ever access the same few rows of the same few tables, then it will not grow very large, either.

So your process is likely accessing large portions of the database, and it's likely being done on a machine with lots of available memory. Just because it's using a lot of memory does not mean that it's a leak. It should shrink if other programs start to use memory. But to be safe, setting the min/max cache sizes can make it more predictable. Are you seeing negative performance on the system, or was it just noticed that it had a lot of memory allocated?
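The sizing behaviour described above can be read as a simple clamp. The Python sketch below is a toy model of that description only, not ESE's actual algorithm; the function name `target_cache_pages` and all numbers are illustrative assumptions.

```python
# Toy model of the description above, NOT the real ESE cache-sizing code.
# All sizes are expressed in database pages; every value here is illustrative.

def target_cache_pages(wanted_pages, available_pages, cache_min, cache_max):
    """Return the cache size this toy model would settle on."""
    # Demand is limited by what the machine can currently spare...
    desired = min(wanted_pages, available_pages)
    # ...and the result is then held between the configured bounds.
    return max(cache_min, min(desired, cache_max))


if __name__ == "__main__":
    # Large demand on a machine with plenty of free memory: the max is the cap.
    print(target_cache_pages(10_000_000, 20_000_000, 1_000, 65_536))
    # Small demand: the cache still grows to the configured minimum.
    print(target_cache_pages(100, 20_000_000, 1_000, 65_536))
```

In this reading, a process that touches a lot of data on a machine with lots of free memory keeps growing until it reaches the maximum, which is why an explicit maximum acts as the hard limit.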

You can use the Database counters in perfmon to see the current Database Cache Size (MB): image

Hope that helps.

robertmuehsig commented 3 years ago

> Are you seeing negative performance on the system, or was it just noticed that it had a lot of memory allocated?

The IT admins were not very happy that a process claimed ~20-40 GB of memory ("commit size", not active working set), and those servers were unstable at that point. I'm not really sure why or how a process gets this big, because the data inside our Esent database is quite small (>500mb). The CacheSizeMax property seems like a good fit; at least it would help if we could set a hard limit. How do I apply the CacheSizeMax property via ManagedEsent? It is not part of InstanceParameters.

Thanks for your help!

machish commented 3 years ago

It's SystemParameters.CacheSize: https://github.com/microsoft/ManagedEsent/blob/master/EsentInterop/SystemParameters.cs
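ESE documents its cache-size parameters in database pages rather than bytes, so a memory budget has to be converted before it is assigned to a property such as SystemParameters.CacheSizeMax. Below is a minimal sketch of that arithmetic, written in Python purely for illustration. The helper `mb_to_pages` is hypothetical, and the 8192-byte page size is an assumption; the real page size should be taken from the application's own ESE configuration.

```python
# Sketch of the unit conversion only; nothing here calls into ESE.
ASSUMED_PAGE_SIZE_BYTES = 8192  # assumption: check the page size your instance uses


def mb_to_pages(limit_mb, page_size_bytes=ASSUMED_PAGE_SIZE_BYTES):
    """Convert a cache budget in MB into a whole number of database pages."""
    if limit_mb < 0 or page_size_bytes <= 0:
        raise ValueError("limit_mb must be >= 0 and page_size_bytes must be > 0")
    # Round down so the resulting cap never exceeds the requested budget.
    return (limit_mb * 1024 * 1024) // page_size_bytes


if __name__ == "__main__":
    # A 512 MB budget expressed in pages, for two candidate page sizes.
    print(mb_to_pages(512))
    print(mb_to_pages(512, page_size_bytes=4096))
```

The resulting page count is the kind of value that would be assigned to the maximum cache size setting. Rounding down keeps the cap at or below the intended budget.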

@MsftBrettShirley @michaelthorp is it a known issue that the database cache might grow well beyond the database size?

@robertmuehsig which OS version is it? WPA should be able to tell you.

robertmuehsig commented 3 years ago

@machish Thanks, I hadn't noticed the static SystemParameters class. The OS was Windows Server 2019.

robertmuehsig commented 3 years ago

Just to let you know: since we set the SystemParameters.CacheSize property we have not had any issues, so I would say this "fixed" our problem. I'm really not sure why Esent was consuming so much memory (commit size >40 GB) while the system was already "starving", but it only occurred on "larger" terminal server/Citrix installations with other strange configurations.