kakao / s2graph

This code base is retained for historical interest only. Please visit the Apache Incubator repository for the latest version:
https://github.com/apache/incubator-s2graph

Provide Result Json LocalCache #207

Open SteamShon opened 8 years ago

SteamShon commented 8 years ago

Currently s2core provides a local cache to reduce I/O requests to storage (HBase). However, even on a cache hit, we still spend a lot of CPU building the result JSON in PostProcess.

In many cases it is beneficial to cache the JSON result for data that does not change dynamically. One example use case is a simple mapping table that is not updated frequently. Currently, even when we hit the local cache, it is costly to build the JSON result over and over.

I am thinking about providing a query option that says "OK, cache the result JSON and avoid rebuilding it over and over".
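
To make the idea concrete, here is a minimal Python sketch of an opt-in result JSON cache. It is not s2graph code: the option name `cacheResultJson`, the class `JsonResultCache`, and the `build_json` callback (standing in for the PostProcess step) are all hypothetical, and the TTL-based expiry is just one possible invalidation policy.

```python
import json
import time


class JsonResultCache:
    """Caches the final JSON string per query, with a fixed time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # cache key -> (expires_at, json_string)

    @staticmethod
    def key_for(query):
        # Canonical serialization so that key order in the query does not matter.
        return json.dumps(query, sort_keys=True, separators=(",", ":"))

    def get_or_build(self, query, build_json):
        # Hypothetical opt-in flag: queries without it are never cached.
        if not query.get("cacheResultJson", False):
            return build_json(query)

        key = self.key_for(query)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: skip rebuilding the JSON entirely

        result = build_json(query)
        self._entries[key] = (now + self._ttl, result)
        return result
```

With this shape, a query for a rarely updated mapping table would set the flag and pay the JSON-building cost once per TTL window, while queries over fast-changing data simply leave the flag off.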

SteamShon commented 8 years ago

Currently there are three points on the query path where I think a cache can help.

The query structure is as follows.

A query consists of multiple steps, and each step consists of multiple queryParams.

Right now, the local cache is only supported at the queryParam level. Even with a cache hit, we still need to aggregate the result with the other queryParams in the same step and filter it.

I am suggesting three levels of cache.

  1. queryParam level cache. This caches the QueryResult that would otherwise be fetched through I/O to storage.
  2. step level cache. This caches the aggregated and filtered result produced by the current step.
  3. query level cache. This caches the final JSON value.

The faster the data set changes, the lower the cache level that should be preferred.
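
The three levels can be sketched roughly as follows. This is a simplified Python model, not the s2core implementation: the query shape (`src`, `steps`, `label`), the `fetch(vertex, label)` callback standing in for storage I/O, and the plain dictionaries used as caches are all assumptions made for illustration, and "aggregate and filter" is reduced to a deduplicated, sorted union.

```python
import json


class LeveledCaches:
    """Three independent caches, one per level of the query path."""

    def __init__(self):
        self.param = {}  # (vertex, label) -> raw fetch result
        self.step = {}   # (input vertices, step definition) -> step result
        self.query = {}  # whole query -> final JSON string


def run_step(vertices, step, fetch, caches):
    step_key = (tuple(vertices), json.dumps(step, sort_keys=True))
    if step_key in caches.step:
        return caches.step[step_key]  # level 2 hit: skip fetch and aggregation

    merged = set()
    for param in step:
        for vertex in vertices:
            param_key = (vertex, param["label"])
            if param_key not in caches.param:
                # Level 1 miss: go to storage.
                caches.param[param_key] = fetch(vertex, param["label"])
            merged.update(caches.param[param_key])

    result = sorted(merged)  # stand-in for aggregation and filtering
    caches.step[step_key] = result
    return result


def run_query(query, fetch, caches):
    query_key = json.dumps(query, sort_keys=True)
    if query_key in caches.query:
        return caches.query[query_key]  # level 3 hit: skip everything

    vertices = sorted(query["src"])
    for step in query["steps"]:
        vertices = run_step(vertices, step, fetch, caches)

    result_json = json.dumps({"results": vertices})
    caches.query[query_key] = result_json
    return result_json
```

A hit at a higher level skips more work but also serves staler data for longer, which is why fast-changing data would rely only on the queryParam level.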

Let me know what others think.

SteamShon commented 8 years ago

I was working on this issue and figured out that we don't need to restrict the result cache implementation to a local cache. A remote cache such as memcached or Redis could also be used for the result cache.
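
One way to keep that choice open is to hide the backend behind a small interface. The Python sketch below is only an illustration: `ResultCache`, `LocalResultCache`, `RemoteResultCache`, and `cached_json` are hypothetical names, and the remote adapter assumes a client object exposing `get(key)` and `set(key, value)` that may return bytes. Expiry and error handling are left out.

```python
from typing import Callable, Optional, Protocol


class ResultCache(Protocol):
    """Minimal interface the query path would depend on."""

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class LocalResultCache:
    """In-process backend backed by a plain dictionary."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class RemoteResultCache:
    """Adapter around an assumed client with get(key) and set(key, value)."""

    def __init__(self, client):
        self._client = client

    def get(self, key):
        raw = self._client.get(key)
        if raw is None:
            return None
        # Some clients return bytes; normalize to str for the caller.
        return raw.decode("utf-8") if isinstance(raw, bytes) else raw

    def put(self, key, value):
        self._client.set(key, value)


def cached_json(cache: ResultCache, key: str, build: Callable[[], str]) -> str:
    """Return the cached JSON for key, building and storing it on a miss."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = build()
    cache.put(key, value)
    return value
```

The query path would call only `cached_json`, so switching between a local and a remote result cache becomes a configuration decision rather than a code change.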