lmmx / pdf2uplambda

An AWS Lambda function for pdf2up
MIT License

Deterministic image filenames and subsequent S3 caching #2

Open lmmx opened 2 years ago

lmmx commented 2 years ago

Currently the image filenames are taken from the output of pdf2up.conversion.pdf2png:

https://github.com/lmmx/pdf2uplambda/blob/b0854c2d80fad15947b1caba0e8976153274d64a/pdf2up_lambda/lambda_function.py#L42

However, pdf2up always produces the same filenames for a given input, so we can precompute the output paths without waiting for the processing to run. That lets us check whether the images already exist before trying to create them.

This means we can skip (re)computing images altogether if the lambda has already been run for them (within the bucket's lifecycle policy period), and so give an instant result in the browser.
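A rough sketch of that check-before-compute flow, to make the idea concrete. Everything named here is hypothetical: the `images/<stem>_<i>.png` key scheme is not pdf2up's real naming, `expected_image_keys` and `get_or_create_images` are illustrative helpers, and it assumes the number of output images is known up front. The `exists` callable stands in for an S3 lookup (with boto3 that could wrap a `head_object` call) and `convert` stands in for the existing conversion and upload step.

```python
from pathlib import PurePosixPath
from typing import Callable, List


def expected_image_keys(pdf_name: str, n_images: int, prefix: str = "images") -> List[str]:
    """Precompute the object keys for a PDF's images.

    The naming scheme is an assumption for illustration only.
    """
    stem = PurePosixPath(pdf_name).stem
    return [f"{prefix}/{stem}_{i}.png" for i in range(n_images)]


def get_or_create_images(
    pdf_name: str,
    n_images: int,
    exists: Callable[[str], bool],
    convert: Callable[[str], None],
) -> List[str]:
    """Return the image keys, running the conversion only on a cache miss."""
    keys = expected_image_keys(pdf_name, n_images)
    if not all(exists(key) for key in keys):
        # At least one image is missing (never made, or expired by the
        # bucket lifecycle policy), so regenerate the full set.
        convert(pdf_name)
    return keys
```

Passing the existence check in as a callable keeps the caching decision separate from the S3 client, so it can be exercised without touching a bucket.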

In this way we can provide nice features (like #1) that generate a short-term 'gallery' of papers that is turned over daily.

lmmx commented 2 years ago

To implement this, I suggest turning pdf2png from a function into a class-based routine, with the output paths exposed as a property.
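Something along these lines, as a minimal sketch rather than the actual pdf2up API. The class name `Pdf2Png`, its constructor arguments, the `<stem>_<i>.png` naming, and the `convert` method are all assumptions made for illustration; the point is only that `paths` can be read without running the conversion.

```python
from pathlib import Path
from typing import List, Optional


class Pdf2Png:
    """Hypothetical class-based replacement for the pdf2png function."""

    def __init__(self, pdf_path: str, n_images: int, out_dir: Optional[str] = None):
        self.pdf_path = Path(pdf_path)
        self.n_images = n_images
        # Assumption: images are written next to the PDF unless told otherwise.
        self.out_dir = Path(out_dir) if out_dir is not None else self.pdf_path.parent

    @property
    def paths(self) -> List[Path]:
        """Output paths, derived from the inputs alone (no processing needed)."""
        stem = self.pdf_path.stem
        return [self.out_dir / f"{stem}_{i}.png" for i in range(self.n_images)]

    def convert(self) -> List[Path]:
        """Placeholder for the real conversion, which would write to self.paths."""
        raise NotImplementedError("conversion step omitted from this sketch")
```

The lambda could then build the object, inspect `paths` to decide whether anything needs doing, and only call `convert()` on a cache miss.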