gusutabopb closed this 4 years ago
Merging #33 into master will increase coverage by 0.20%. The diff coverage is 100.00%.
```diff
@@            Coverage Diff             @@
##           master      #33      +/-   ##
==========================================
+ Coverage   96.58%   96.78%   +0.20%
==========================================
  Files           9        9
  Lines         556      529      -27
==========================================
- Hits          537      512      -25
+ Misses         19       17       -2
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| aioinflux/client.py | 94.71% <100.00%> (+0.41%) | :arrow_up: |
| aioinflux/compat.py | 100.00% <100.00%> (ø) | |
Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update 18a5402...a0d0152. Read the comment docs.
Here's a simple example of a caching layer for InfluxDB/aioinflux (which will also be available in the v0.10.0 docs). It works by caching DataFrames as compressed pickle files on disk, and can be easily modified to use your preferred caching strategy: different serialization, compression, cache key generation, etc.

See the function docstrings and code comments below for more details.
```python
from aioinflux import InfluxDBClient

c = InfluxDBClient(output='dataframe')

q = """
SELECT * FROM executions
WHERE product_code='BTC_JPY'
AND time >= '2020-05-22'
AND time < '2020-05-23'
"""

# If this query is repeated, it will keep hitting InfluxDB,
# increasing the load on the instance and using extra bandwidth
df = await c.query(q)
```
```python
import re
import hashlib
import pathlib
from typing import Tuple

import pandas as pd


def _hash_query(q: str) -> str:
    """Normalizes and hashes the query to generate a caching key"""
    q = re.sub(r"\s+", " ", q).strip().lower().encode()
    return hashlib.sha1(q).hexdigest()
```
```python
async def fetch(influxdb: InfluxDBClient, q: str) -> Tuple[pd.DataFrame, bool]:
    """Tries the local cache first, else fetches the data from the database.

    Returns a tuple containing the query results and a boolean indicating
    whether the data came from the local cache (True) or from InfluxDB (False).
    """
    p = pathlib.Path(_hash_query(q))
    if p.exists():
        return pd.read_pickle(p, compression="xz"), True
    df = await influxdb.query(q)
    df.to_pickle(str(p), compression="xz")
    return df, False
```
```python
df, cached = await fetch(c, q)
print(cached)  # False - cache miss

df, cached = await fetch(c, q)
print(cached)  # True - cache hit
```
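As one example of adapting the caching strategy, cache expiry can be added by checking the cache file's modification time. The `is_fresh` helper and `max_age` parameter below are my own additions for illustration, not part of aioinflux:

```python
import time
import pathlib


def is_fresh(p: pathlib.Path, max_age: float) -> bool:
    """Return True if the cache file exists and is newer than max_age seconds."""
    return p.exists() and (time.time() - p.stat().st_mtime) < max_age


# In fetch(), replacing `if p.exists():` with e.g. `if is_fresh(p, max_age=3600):`
# makes cached results older than an hour fall through to InfluxDB again.
```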
Aioinflux used to provide built-in local caching functionality using Redis. However, due to low perceived usage, the vendor lock-in (Redis), and the extra complexity it added to Aioinflux, I have decided to remove it.
Hopefully no one besides my past self used this functionality. In case someone did, or in case someone is interested in caching InfluxDB query results, I have included a simple pickle-based caching layer above. If this removal affects you, please let me know by commenting below.