blish-hud / Blish-HUD

A Guild Wars 2 overlay with extreme extensibility through compiled modules.
https://blishhud.com
MIT License

Gw2Sharp in-memory cache method causes high memory usage over time #445

Open · dlamkins opened this issue 2 years ago

dlamkins commented 2 years ago

Currently we cache all API responses in memory. These entries are regularly expired as appropriate, but larger responses - especially those requested frequently - can be held in memory for very long periods, and certain modules cause memory usage to increase rapidly over time.

I think this should be solved in a couple of ways:

Module caching expectations:

Modules should be expected to manage their own, likely processed, version of the API results. The results are seldom needed as-is or in full, and if a module expects to use that info again, it should maintain its own internal cache of it so that it can manage its own expirations.

Any module that needs regular updates is unlikely to benefit from in-memory caching, because it will be checking for updates on the same interval at which the entries expire. For example, if a module requests Trading Post prices for an item, there is no point in checking more often than the API is willing to update those values - which matches the response cache time.

As a result, generally speaking, the only time in-memory response caching is very helpful is when multiple modules request the same endpoints - which is infrequent.

Render service de-duplication:

Any texture pulled from the render service is kept in memory more than once: it is stored in the raw byte form provided by the API as well as a Texture2D once converted for rendering. Further, any other modules that use the same texture will each hold their own converted Texture2D instance, resulting in a lot of wasted duplicate texture data.

One option to help resolve this is to keep only a single weak reference to the texture data post-conversion; when the texture is requested again, the existing instance can be handed out. This keeps only a single instance active for us to deal with and no redundant byte data to worry about.
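Roughly, the weak-reference idea looks like this (a minimal Python sketch, not the actual Blish HUD/MonoGame code; `Texture` here is a stand-in for a converted Texture2D, and all names are illustrative):

```python
import weakref

class Texture:
    """Stand-in for a converted Texture2D (illustrative only)."""
    def __init__(self, asset_id, raw_bytes):
        self.asset_id = asset_id
        self.size = len(raw_bytes)  # pretend this is the converted texture

class TextureCache:
    def __init__(self):
        # Weak values: once no module holds a strong reference to the
        # converted texture, the entry disappears from the cache and the
        # texture can be collected - no duplicate copies linger.
        self._textures = weakref.WeakValueDictionary()

    def get_or_convert(self, asset_id, raw_bytes):
        tex = self._textures.get(asset_id)
        if tex is None:
            # Conversion happens once; the raw bytes need not be retained.
            tex = Texture(asset_id, raw_bytes)
            self._textures[asset_id] = tex
        return tex
```

Every module that requests the same asset gets the same live instance, and the cache itself never keeps a texture alive on its own.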

Disk based caching:

As it stands, the ArchiveCacheMethod is the only on-disk method of caching (allowing results to persist between Blish HUD sessions), and it has a few issues.

I propose a non-archive cache method that stores data gzip-compressed in directories based on the endpoint path (which conveniently appears to map well to CacheItem categories and IDs). This would be especially helpful for textures pulled from the render service, as it would allow them to load quickly as local textures, and it resolves several of the archive method's issues.

This can be used in combination with a modified in-memory caching method as mentioned below.

There are some issues with this approach as well.

Many of these issues have a smaller impact if we override response cache times and use the selective caching method to ensure short-lived responses aren't kept on disk.
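As a rough illustration of the directory-per-endpoint layout (a hypothetical Python sketch; the paths, the category-to-directory mapping, and all names are my own, not Gw2Sharp's):

```python
import gzip
from pathlib import Path

class DirectoryCache:
    """Sketch of a non-archive disk cache: one gzip-compressed file per
    entry, placed in a directory derived from the cache category."""
    def __init__(self, root):
        self.root = Path(root)

    def _path(self, category, item_id):
        # e.g. <root>/render.file/abc123.gz - layout is illustrative only.
        return self.root / category.replace("/", ".") / f"{item_id}.gz"

    def put(self, category, item_id, data: bytes):
        path = self._path(category, item_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(gzip.compress(data))

    def get(self, category, item_id):
        path = self._path(category, item_id)
        if not path.exists():
            return None
        return gzip.decompress(path.read_bytes())
```

Because each entry is its own file, entries can be expired or replaced individually without rewriting an archive, and textures can be read straight off disk on the next session.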

Limited size in-memory caching method

A short-lived in-memory caching method can be used to better service simultaneous (or near-simultaneous) requests made by modules to the same endpoint. A limit on the total size can be put in place, or entries can be aggressively purged. Purge order can be based on size, last request time, and how soon the entry was going to expire anyway.
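A minimal sketch of such a size-capped cache (Python for illustration; the cap, the LRU purge order, and all names are assumptions on my part - a real implementation could weight purge order by size and remaining TTL as described above):

```python
import time
from collections import OrderedDict

class BoundedMemoryCache:
    """Short-lived in-memory cache with a total-size cap. Oldest (least
    recently used) entries are purged once the cap is exceeded; expired
    entries are dropped on access."""
    def __init__(self, max_bytes=8 * 1024 * 1024):
        self.max_bytes = max_bytes
        self.total = 0
        self._entries = OrderedDict()  # key -> (data, expires_at)

    def put(self, key, data: bytes, ttl: float):
        if key in self._entries:
            self.total -= len(self._entries.pop(key)[0])
        self._entries[key] = (data, time.monotonic() + ttl)
        self.total += len(data)
        # Aggressively purge least-recently-used entries over the cap.
        while self.total > self.max_bytes:
            _key, (old, _exp) = self._entries.popitem(last=False)
            self.total -= len(old)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        data, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._entries[key]
            self.total -= len(data)
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return data
```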

Selective caching method

Some endpoints would benefit little from a given caching method, or simply don't make sense to keep in it. For example, short-lived results are better kept in memory than on disk, as they'll expire soon after being written anyway.
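The selective part could be as simple as routing by the response's allowed lifetime (a sketch; the 5-minute threshold is an illustrative guess, not a tuned value):

```python
def choose_cache(ttl_seconds, memory_cache, disk_cache):
    """Route an entry to a cache method based on how long it may live."""
    # Short-lived responses expire before a disk write would pay off,
    # so keep them in memory; persist only long-lived responses.
    if ttl_seconds <= 300:
        return memory_cache
    return disk_cache
```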

Overriding response cache times

The proposal here is a middleware designed to override response cache times. Map, continent, and other endpoints are often incredibly large, account for the majority of loading time for some modules, and can be expected to remain unchanged the vast majority of the time.
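In sketch form, such an override layer might just take the larger of the server-provided max-age and a per-endpoint floor (the endpoint names and the one-week value below are illustrative assumptions, not real configuration):

```python
# Endpoints that are huge but rarely change; values are illustrative.
MAX_AGE_OVERRIDES = {
    "v2/continents": 7 * 24 * 3600,
    "v2/maps": 7 * 24 * 3600,
}

def effective_max_age(endpoint: str, header_max_age: int) -> int:
    """Sketch of a cache-time override middleware: honor the server's
    Cache-Control max-age unless we explicitly trust the endpoint to be
    stable for longer."""
    return max(header_max_age, MAX_AGE_OVERRIDES.get(endpoint, 0))
```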

Other thoughts

Unfortunately, this becomes difficult for requests that attempt to query all results: if anything is later added to that endpoint, it will not be included. We can limit the caching override to just the single entries that Gw2Sharp splits out, but those are unlikely (need to confirm) to be queried the next time the module loads, because the original request was made with all, and we don't know whether what we have locally is still all without requesting the endpoint's IDs again. That isn't entirely self-defeating (ID endpoints typically return much faster than the full endpoints), but it does hurt response times substantially compared to a purely local load. We also can't use ManyAsync to plug that hole, because we don't know which IDs fall outside of what we have cached until the ID endpoint is queried for the full list.

This is a long ramble to mostly say that our current implementation causes high memory usage while still producing many cache misses, so we may be better off not caching for now. I wanted to present my thoughts in case others had ideas or wanted to discuss this further.

Archomeda commented 2 years ago

If you come up with a new caching solution, or have improvements to the current ones, feel free to PR it to Gw2Sharp 😄 The interfaces are designed with extensibility in mind, so you should have a lot of flexibility.

manlaan commented 2 years ago

Another place this seems to become an issue is with some of the daily achievements - most notably those listed under https://api.guildwars2.com/v2/achievements/groups/18DB115A-8637-4290-A636-821362A3C4A8

The problem is that the category information gets cached; for example, https://api.guildwars2.com/v2/achievements/categories/252 is cached for an hour per its HTTP headers.

Under most circumstances this is not an issue, except at reset, when the information in the category does change. If a module is using this data and expecting new data at reset, the update can be delayed by up to an hour (a user who logs in just before reset is stuck with old data for an hour after reset).
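One way to handle this would be to cap the cache lifetime of reset-sensitive endpoints at the next daily reset (assuming reset at 00:00 UTC); a sketch, with the function name and approach being my own illustration rather than anything Gw2Sharp provides:

```python
from datetime import datetime, timedelta, timezone

def capped_ttl(header_ttl_seconds: int, now=None) -> int:
    """For reset-sensitive endpoints (e.g. daily achievement categories),
    cap the cache lifetime at the next daily reset (assumed 00:00 UTC)
    so stale data isn't served for up to an hour after reset."""
    now = now or datetime.now(timezone.utc)
    next_reset = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    until_reset = int((next_reset - now).total_seconds())
    return min(header_ttl_seconds, until_reset)
```

A user logging in 30 minutes before reset would then get a 30-minute cache entry instead of the header's full hour, so fresh data is fetched right after reset.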