orijtech / cacher

Global resources cacher with AWS S3 for storage and Google Cloud Spanner for the DB
Apache License 2.0

cacher

Global resources caching infrastructure with AWS S3 for storage and Google Cloud Spanner for the DB.

A centralized file download cacher that stores resources for distributed readers using Cloud Spanner for global availability.

It can be deployed as a backend web app on your laptop or any cloud provider. Make requests as a pass-through "proxy" for your web resources.

Problem

Fetching resources from outside your CDN can be expensive, e.g. if you are processing billions of image assets for your customers within a network and would prefer centralized resources, or if you are crawling the web at the end of every day.

Because the processing is intensive, contended, and deployed on AWS EC2, we need to store resources on AWS S3. However, the workers run on Google Cloud Platform, and Cloud Spanner is fast and globally available, so it is used as the DB.

Operation

When a user requests a URL through cacher, the URL is first checked locally. If it is present, the cached copy is served; otherwise the resource is downloaded while being proxied back to the user.
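The check-then-fetch flow can be sketched in Go as below. This is only an illustration, not cacher's actual code: the `Cache` type, `Resolve`, and the `store` callback are hypothetical names, an in-memory map stands in for the Cloud Spanner lookup, and `store` stands in for the download and S3 upload.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical in-memory stand-in for the DB lookup table.
type Cache struct {
	mu      sync.Mutex
	entries map[string]string // original URL -> cached URL
}

// NewCache returns an empty Cache.
func NewCache() *Cache {
	return &Cache{entries: make(map[string]string)}
}

// Resolve returns the cached URL for originalURL. On a miss it calls store
// (a stand-in for "download and upload to storage") and remembers the result.
// The boolean reports whether the entry was already cached.
func (c *Cache) Resolve(originalURL string, store func(string) (string, error)) (string, bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if cached, ok := c.entries[originalURL]; ok {
		return cached, true, nil // hit: serve what is already cached
	}
	cached, err := store(originalURL)
	if err != nil {
		return "", false, err // failures are not cached
	}
	c.entries[originalURL] = cached
	return cached, false, nil
}

func main() {
	c := NewCache()
	calls := 0
	store := func(u string) (string, error) {
		calls++
		return "stored:" + u, nil
	}
	for i := 0; i < 2; i++ {
		cached, hit, err := c.Resolve("https://example.com/a.png", store)
		fmt.Println(cached, hit, err)
	}
	fmt.Println("store calls:", calls)
}
```

Holding the lock while `store` runs keeps the sketch short but serializes all misses; a real deployment would likely want per-URL coordination instead.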

Uses

Sample usage

$ curl -X POST http://localhost:9444 --data '{"url":"https://orijtech.com/images/logoCenter.png"}'
{
  "original_url":"https://orijtech.com/images/logoCenter.png",
  "cached_url":"https://cacher-app.s3.amazonaws.com/orijtech.com/adeee3db23c8eb5373aa2675fe2f8394",
  "time_at":1520504398
}