pachadotdev / tradestatistics-plumber-api

tradestatistics.io API, reads from PostgreSQL and provides tidy CSV and Apache Arrow data
https://api.tradestatistics.io/
Apache License 2.0

caching on server #4

Closed by pachadotdev 4 years ago

pachadotdev commented 4 years ago

@jbkunst @eliocamp here's the API code. I'll email you a file with the most common queries obtained from pg_stats, which is the only source of analytics, since this project does not use trackers or anything.

eliocamp commented 4 years ago

I think you could just memoise the hell out of every function inside the API using a filesystem cache and be done with it. Queries will be cached and saved as they come in. To save space, you could periodically clean up caches that haven't been read recently (by last read date).
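
A minimal sketch of that cleanup step, assuming the memoise cache lives in a local "cache" directory (the directory name and the 30-day cutoff are just examples), could look like this:

prune_cache <- function(dir = "cache", max_age_days = 30) {
  # every file in here is a cached result written by memoise::cache_filesystem()
  files <- list.files(dir, full.names = TRUE)
  info <- file.info(files)
  # drop files whose last access is older than the cutoff
  # (note: some filesystems don't update atime reliably; mtime is a fallback)
  old <- files[difftime(Sys.time(), info$atime, units = "days") > max_age_days]
  unlink(old)
}

This could run from a cron job or be called once when the plumber script starts.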

I'm not exactly sure how to go about implementing and testing all that. What do I need to install and run on my machine?

pachadotdev commented 4 years ago

Hi @eliocamp! To test on your machine you would only need to change the connection parameters here https://github.com/tradestatistics/plumber-api/blob/master/queries.R#L17-L23 and run install.packages("plumber"). I can provide ssh access so you can test locally; something like ssh -L 5432:foobar.com:5432 me@foobar.com works to forward the remote PostgreSQL port to your machine.
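
For reference, a rough sketch of what the local connection setup might look like when going through the tunnel; the driver, database name, and credential environment variables here are assumptions, not the actual values in queries.R:

con_pool <- pool::dbPool(
  drv = RPostgres::Postgres(),
  dbname = "tradestatistics",                    # assumed database name
  host = "localhost",                            # the ssh tunnel forwards localhost:5432 to the server
  port = 5432,
  user = Sys.getenv("TRADESTATISTICS_SQL_USR"),  # assumed variable names
  password = Sys.getenv("TRADESTATISTICS_SQL_PWD")
)

With the tunnel open, the API sees the remote PostgreSQL server as if it were running locally.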

pachadotdev commented 4 years ago

@jbkunst Hi! Do you think this suffices for server-side caching? https://github.com/tradestatistics/plumber-api/blob/master/queries.R#L188-L192

jbkunst commented 4 years ago

I'm not sure why you put that inside one particular function.

I think you can put the following code at the beginning of the script:

dbGetQuery_cached <- memoise::memoise(dbGetQuery, cache = memoise::cache_filesystem("cache"))

And then replace all the calls:

 data <- dbGetQuery(pool, query)

with

 data <- dbGetQuery_cached(pool, query)
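
Putting the two pieces together, a sketch of how the top of the script plus one endpoint could look after this change; the route and table name below are illustrative, not the actual ones in queries.R:

library(plumber)
library(pool)

# disk-backed cache: repeated identical calls to dbGetQuery skip PostgreSQL entirely
dbGetQuery_cached <- memoise::memoise(
  DBI::dbGetQuery,
  cache = memoise::cache_filesystem("cache")
)

#* Illustrative endpoint, assuming `pool` is the existing connection pool
#* @get /years
function() {
  query <- "SELECT DISTINCT year FROM hs07_yrpc ORDER BY year"  # table name is an assumption
  dbGetQuery_cached(pool, query)
}

Since memoise keys the cache on the function arguments, each distinct query string gets its own entry on disk.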

pachadotdev commented 4 years ago

@jbkunst it flies, you were right!!