I have been hosting the backend for Devo on AWS Lambda as separate Lambda functions, each handling incoming requests on demand. Even though this was easy to set up in the beginning, it was suboptimal for multiple reasons:
- Every request re-runs the data fetch operations, even though the results are very similar most of the time. There's not much change on Hacker News in 2 minutes, and even when there is, it is not that important for the casual Devo user.
- Since each operation fetches the data again, the majority of requests take 700ms+, hitting a couple of seconds in some cases.
- The functions were deployed in the us-east-1 region of AWS, even though out of ~1500 active users on Chrome, only 300 are based in the US. I am based in Berlin, and from where I am right now, I have ~120ms latency just to ping the us-east-1 region, compared to ~30ms for datacenters in Europe.
- The function usage scales linearly with the number of users, and I have been paying for this for the last year or so, meaning the more people use Devo, the higher the bills are gonna get. To be fair, the monthly Lambda bill has been under 10 dollars for the last year, but I'd like to avoid that growth if possible.
- This has also caused us to exhaust the rate limits of Product Hunt's API before.
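For reference, one rough way to compare regional latencies like the ones above is to time TCP connects to each endpoint. This is only a sketch: TCP connect time is a proxy for ping, and the hostnames below are illustrative regional endpoints, not anything Devo-specific.

```python
import socket
import time

def connect_time_ms(host, port=443, samples=3):
    """Median TCP connect time to host:port, in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # A completed TCP handshake approximates one network round trip.
        with socket.create_connection((host, port), timeout=5):
            timings.append((time.perf_counter() - start) * 1000)
    return sorted(timings)[len(timings) // 2]

# Example usage (endpoints illustrative, numbers depend on where you are):
#   connect_time_ms("s3.us-east-1.amazonaws.com")
#   connect_time_ms("ams3.digitaloceanspaces.com")
```

Taking the median of a few samples smooths out one-off spikes from DNS resolution or transient congestion.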
Therefore, I have taken some advice from the community: instead of running the data fetch and parse operations on demand via Lambda functions, I now run them as a background job and serve the resulting files as static assets. Since the Devo backend was already private because of the tokens used there, I implemented a new version of the fetch setup that fetches the data from every platform every 10 minutes and updates a static file on S3-compatible storage. For storage I use my private DigitalOcean Spaces account, and the jobs run in the background without exposing anything to the outside world.
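The core of the 10-minute job is just "fetch, serialize, upload". Here is a minimal sketch of the serialize-and-upload half, assuming boto3 (Spaces speaks the S3 API, so the AWS SDK works against it) and an illustrative bucket and key; the real names, the per-platform fetching, and the scheduling trigger are not shown.

```python
import json

def build_payload(items):
    """Serialize fetched items into the static JSON body served to clients."""
    return json.dumps({"items": items}, separators=(",", ":"))

def upload_snapshot(items, bucket="devo-data", key="hackernews.json"):
    """Overwrite the public static file on DigitalOcean Spaces."""
    # boto3 works against Spaces because Spaces is S3-compatible;
    # credentials are read from the environment as usual.
    import boto3  # third-party AWS SDK
    client = boto3.client(
        "s3",
        region_name="ams3",
        endpoint_url="https://ams3.digitaloceanspaces.com",
    )
    client.put_object(
        Bucket=bucket,
        Key=key,
        Body=build_payload(items).encode("utf-8"),
        ACL="public-read",  # the file is served publicly as a static asset
        ContentType="application/json",
    )
```

The every-10-minutes trigger itself is left out; on Lambda that would typically be a scheduled event invoking the function.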
This setup has a couple of benefits:
- The files are now served from the AMS3 region of DigitalOcean Spaces, which has ~25ms average latency for Europe-based users, meaning much faster data delivery for the majority of users.
- The number of data fetch operations is now constant instead of growing linearly with the number of users. This means more predictable pricing, and potentially staying well within the free tier of Lambda.
- The data is no longer exposed through AWS, and the only data transfer is from Lambda to DigitalOcean, which should mean constant transfer rates and much lower pricing as well.
- The results are served as a static file from DigitalOcean Spaces, which means they'll be much, much faster for everyone compared to the previous Lambda-based version.
In the future we can utilize DigitalOcean's CDN for serving the file, giving even better performance to those who are not based in Europe. For now the Spaces CDN doesn't seem to support HTTP/2, and I couldn't find a way of enabling it without changing the URL, so I don't know if it would be beneficial to begin with. We can experiment with this if there are US-based people willing to help me with it.
I have been testing this setup myself, and reloads are now super fast: with every reload, the page looks like the data is served from the local cache. That is a much better experience, so I'll move forward with this.