Global resources caching infrastructure with AWS S3 for storage and Google Cloud Spanner for the DB.
A centralized file download cacher that stores resources for distributed readers, using Cloud Spanner for global availability.
It can be deployed as a backend web app on your laptop or on any cloud provider. Send requests through it as a pass-through "proxy" for your web resources.
Fetching resources from outside your CDN can be expensive, e.g. if you are processing billions of image assets for your customers within a network and would prefer centralized resources, or if you are crawling the web at the end of every day.
Because the processing is intensive, subject to contention, and deployed on AWS EC2, resources are stored on AWS S3. However, some workers also run on Google Cloud Platform, and Cloud Spanner is fast and globally available, so it is used as the DB.
When a user requests a URL through cacher, cacher first checks for it locally. If the resource is present, it is served; otherwise it is downloaded while being proxied back.
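The check-then-download flow described above can be sketched roughly as follows. This is a minimal illustration, not cacher's actual code: the `Entry` and `Cache` types, the `Lookup` method, and the `store` hook are all hypothetical names, an in-memory map stands in for the Cloud Spanner table, and `store` stands in for downloading the resource and uploading it to S3.

```go
package main

import (
	"fmt"
	"sync"
)

// Entry is a hypothetical record linking an original URL to its cached copy.
type Entry struct {
	OriginalURL string
	CachedURL   string
}

// Cache is an in-memory stand-in for the lookup table
// (the document uses Cloud Spanner for this role).
type Cache struct {
	mu      sync.Mutex
	entries map[string]Entry
	// store is a hypothetical hook that downloads the resource, uploads it
	// to storage, and returns the URL of the stored copy.
	store func(url string) (string, error)
}

// Lookup returns the entry for url and reports whether it was already cached.
// On a miss it calls store and records the result for later requests.
func (c *Cache) Lookup(url string) (Entry, bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if e, ok := c.entries[url]; ok {
		return e, true, nil // cache hit: serve the stored copy
	}
	cachedURL, err := c.store(url) // cache miss: fetch and store
	if err != nil {
		return Entry{}, false, err
	}
	e := Entry{OriginalURL: url, CachedURL: cachedURL}
	c.entries[url] = e
	return e, false, nil
}

func main() {
	uploads := 0
	c := &Cache{
		entries: make(map[string]Entry),
		// Fake store for the demo: counts uploads and returns a made-up URL.
		store: func(url string) (string, error) {
			uploads++
			return fmt.Sprintf("https://storage.example/object-%d", uploads), nil
		},
	}
	for i := 0; i < 2; i++ {
		e, hit, err := c.Lookup("https://orijtech.com/images/logoCenter.png")
		if err != nil {
			fmt.Println("lookup failed:", err)
			return
		}
		fmt.Printf("hit=%t cached_url=%s\n", hit, e.CachedURL)
	}
}
```

The first lookup misses and calls `store`; the second finds the recorded entry and skips the download. For brevity the sketch holds the lock while storing and does not stream the body back to the caller during the download, both of which a real deployment would likely handle differently.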
$ curl -X POST http://localhost:9444 --data '{"url":"https://orijtech.com/images/logoCenter.png"}'
{
  "original_url":"https://orijtech.com/images/logoCenter.png",
  "cached_url":"https://cacher-app.s3.amazonaws.com/orijtech.com/adeee3db23c8eb5373aa2675fe2f8394",
  "time_at":1520504398
}
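The same request can be made from Go. The sketch below is only an illustration under a few assumptions: the type and function names (`cacheRequest`, `cacheResponse`, `requestCached`, `newStub`) are hypothetical, the `application/json` content type and the check for a 2xx status are guesses about what the service accepts and returns, and `time_at` is treated as an integer because that is how it appears in the sample reply. So that the example runs without a deployed cacher, `main` talks to a local stub that replays the sample reply shown above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// cacheRequest and cacheResponse mirror the JSON in the curl example.
type cacheRequest struct {
	URL string `json:"url"`
}

type cacheResponse struct {
	OriginalURL string `json:"original_url"`
	CachedURL   string `json:"cached_url"`
	TimeAt      int64  `json:"time_at"`
}

// requestCached POSTs url to the cacher at endpoint and decodes the reply.
func requestCached(endpoint, url string) (*cacheResponse, error) {
	body, err := json.Marshal(cacheRequest{URL: url})
	if err != nil {
		return nil, err
	}
	// Assumption: the service accepts a JSON content type.
	res, err := http.Post(endpoint, "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()
	// Assumption: success is signalled with a 2xx status code.
	if res.StatusCode < 200 || res.StatusCode > 299 {
		return nil, fmt.Errorf("cacher returned %s", res.Status)
	}
	out := new(cacheResponse)
	if err := json.NewDecoder(res.Body).Decode(out); err != nil {
		return nil, err
	}
	return out, nil
}

// sampleReply is the response body from the curl example.
const sampleReply = `{"original_url":"https://orijtech.com/images/logoCenter.png",` +
	`"cached_url":"https://cacher-app.s3.amazonaws.com/orijtech.com/adeee3db23c8eb5373aa2675fe2f8394",` +
	`"time_at":1520504398}`

// newStub starts a local test server that replays sampleReply.
// It stands in for a running cacher instance.
func newStub() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, sampleReply)
	}))
}

func main() {
	stub := newStub()
	defer stub.Close()

	// For a real deployment, replace stub.URL with e.g. http://localhost:9444.
	res, err := requestCached(stub.URL, "https://orijtech.com/images/logoCenter.png")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(res.CachedURL)
}
```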