Discussion: caching optimization

bastianh commented 10 years ago

What do you think about optimizing the caching ?

First thing that comes to mind is caching the parsed result instead of caching the raw xml. Second I thought about an optional locking when doing the web request so when two threads or processes want to fetch the same data and it's not in cache that one request is locked until the first is finished.

stevelle commented 10 years ago

Those are each separate features. best not intermingle the implementations. The first is clearly more important as not every library consumer will multi-process.

metatoaster commented 10 years ago

No, global caching of parsed result objects was quite problematic as it resulted in loss of information, see issue #130 for what was done. You have to consider all the edge cases, and there may be users that use both the low level access then access the higher level access - which object/results to you cache? Hence I think caching the raw objects at the API level is completely the wrong way to approach this.

However if this is deemed important enough maybe someone can create a separate cache class that implement this function with all the tests required.

Better yet, create a cache wrapper class for the API wrapper classes so that all results returned by the access methods will be cached, similar to memoization. This is probably much more feasible and can be implemented completely separately from all existing wrapper classes and since all results are now wrapped around the APIResult class, any compatible methods that return that object type will be able to take advantage of this class.

Second one is completely different beast altogether. You probably want to extend the API object to record the current active API requests and block other threads from starting the call until the previous call have exited, then the other threads will get to run (subject to locking rules) which would then go through the cache if the data was successfully downloaded by the first call, so no, do not do it at the cache level and so this is a completely different item.

bastianh commented 10 years ago

I was talking about caching the APIResult objects... what information would be lost that is otherwise accessible? The timestamp mentioned in #130 is already part of that object.

Since we already cache the lowest level (the http request) and have the cache key which would be suiteable to create the locks it would be very easy to implement locking with the current cache layer. It would be easy to create default non locking cache classes and advanced classes with locking for cache backends that support it like redis. (http://redis.io/topics/distlock)

metatoaster commented 10 years ago

Do note that in the current implementation of API.get, APIResult and APIError are two types of objects that can be created (one is returned and the other is raised as exception) from the cached string, if any. Whatever changes will need to harmonize this somehow.

Also, if you are really concerned about caching optimization you would cache the object that the code would immediately using, hence you want to wrap the API access objects (such as evelink.corp.Corp) using an object that would intercept all calls to it and return any cache hits matching the method being called and the parameters that were used, as searching through that XML tree to get the dict result set you need is still a cost.

ayust commented 10 years ago

Note that caching the result objects would necessitate clearing the cache on every EVELink version upgrade (or else potentially providing inconsistent result objects). Caching at the XML layer means even old cached results will still return the newest object format, as long as the underlying EVE API hasn't changed.

bastianh commented 9 years ago

Well .. there are other occasions where a cache wipe or deprecation becomes necessary. With eve releases every 6 weeks and an active guy working on the api that could happen more often :) Changing the cache key when the library get's updated would be a minor problem since it is already there.

ayust commented 9 years ago

I'm not ruling it out - just pointing out another aspect of the caching layer differences.

On Thu, Oct 23, 2014, 05:21 Bastian Hoyer notifications@github.com wrote:

Well .. there are other occasions where a cache wipe or deprecation becomes necessary. With eve releases every 6 weeks and an active guy working on the api that could happen more often :) Changing the cache key when the library get's updated would be a minor problem since it is already there.

— Reply to this email directly or view it on GitHub https://github.com/eve-val/evelink/issues/190#issuecomment-60230579.

bastianh commented 9 years ago

Fine.. I'm not currently working on it I was just asking for opinions...

another thing I was wondering what you think about adding crest api access to this library.. or if this is out of scope... ;)

ayust commented 9 years ago

I'd love to add CREST coverage if we can do it in a clean way that isn't too different from the XML API flow (i.e. set up an API object, pass it to a top level endpoint object, call methods).

Let's break that out into a separate issue, though. :)

On Thu, Oct 23, 2014, 08:38 Bastian Hoyer notifications@github.com wrote:

Fine.. I'm not currently working on it I was just asking for opinions...

another thing I was wondering what you think about adding crest api access to this library.. or if this is out of scope... ;)

— Reply to this email directly or view it on GitHub https://github.com/eve-val/evelink/issues/190#issuecomment-60258666.

eve-val / evelink

Discussion: caching optimization #190