spark-redshift-community / spark-redshift

Performant Redshift data source for Apache Spark
Apache License 2.0

Feature request: cache queries on S3 with TTL #114

Open parisni opened 1 year ago

parisni commented 1 year ago

Once the data for a query has been unloaded to S3, running the same query again could read from that cached unload instead of unloading again.

A TTL config could trigger a new unload when the cached data is older than the specified delay.

We could hash the query string to distinguish queries at run time, and add a timestamp so that an existing unload can be reused while it is still within the TTL.

This could save a lot of compute, network traffic, and cost.
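
To make the hash-plus-timestamp idea concrete, here is a rough Python sketch. It is not how spark-redshift works today. The function names (`cache_key`, `cache_prefix`, `is_fresh`, `pick_cached_unload`), the S3 layout `<tempdir>/<hash>/<timestamp>/`, and the whitespace normalization are all assumptions made for illustration.

```python
import hashlib
import time
from typing import Iterable, Optional


def cache_key(query: str) -> str:
    """Hash the query text so identical queries map to the same key.

    Assumption: collapsing whitespace is an acceptable normalization. It is a
    simplification and would also collapse whitespace inside string literals.
    """
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cache_prefix(temp_dir: str, query: str, unloaded_at: int) -> str:
    """Build a hypothetical S3 prefix: <tempdir>/<query hash>/<unload timestamp>/."""
    return f"{temp_dir.rstrip('/')}/{cache_key(query)}/{unloaded_at}/"


def is_fresh(unloaded_at: int, ttl_seconds: int, now: Optional[float] = None) -> bool:
    """Return True while the unload is no older than the TTL."""
    now = time.time() if now is None else now
    return (now - unloaded_at) <= ttl_seconds


def pick_cached_unload(
    timestamps: Iterable[int], ttl_seconds: int, now: Optional[float] = None
) -> Optional[int]:
    """Return the newest unload timestamp still within the TTL, or None.

    None means no usable cache exists and a new unload should be triggered.
    """
    fresh = [t for t in timestamps if is_fresh(t, ttl_seconds, now)]
    return max(fresh) if fresh else None
```

At read time, the connector would list the timestamps found under `<tempdir>/<hash>/`, call `pick_cached_unload` with the configured TTL, and either read from the returned prefix or run a fresh UNLOAD when it gets `None`. Expired prefixes would still need to be cleaned up, for example with an S3 lifecycle rule.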