blackflux / s3-cached

S3 File Access Abstraction providing Memory and Disk Caching Layer
MIT License

multiplication of cached files with same s3 key #416

Closed maxime-guyot closed 5 years ago

maxime-guyot commented 5 years ago

Hi,

I'm testing your library, and I don't understand why the file ends up cached in multiple files when the same code runs with the same S3 key.

const s3P8 = require("s3-cached")({
    bucket: process.env.APP_AWS_S3_BUCKETNAME,
    diskTmpDirectory: "/tmp/P8-cache",
    s3Options: s3Configuration
});

await s3P8.getJsonObjectCached(process.env.APP_APN_P8_S3_KEY);

[Screenshot 2019-03-27 at 11.40.48]

maxime-guyot commented 5 years ago

OK, I investigated: the problem comes from the 'cache-manager-fs' library. It stores its options in memory and recreates the .dat file each time an instance is started.

I switched to 'cache-manager-fs-hash' and tested it, and this fixes the problem. Could you change it?

simlu commented 5 years ago

This isn't only the file; it is the whole cache. The underlying caching library indexes and initializes other information as well, e.g. expiration times and fast lookup. Implementing a fully featured file cache is not trivial. Here is the file-cache-specific logic (but that's only an implementation of the abstract cache library): https://github.com/hotelde/node-cache-manager-fs/blob/master/index.js

That's where I'd start digging if you want to better understand what is going on. This file cache doesn't need to store its indices in memory. It's gated by an in memory cache, but the file cache doesn't know that.

TL;DR: What you are seeing is normal and anticipated behaviour. I'm closing this ticket, but feel free to comment further on here.

simlu commented 5 years ago

Just saw your comment and reopening. Can you please explain what your concern / issue is? I'm having a hard time understanding the use case and why you worry about it.

maxime-guyot commented 5 years ago

I want to use the lib with AWS Lambda to cache S3 files in /tmp. Lambda executes the process each time, so the data stored in memory is lost...

simlu commented 5 years ago

Sorry, but that statement is false.

Memory is not lost. Everything initialized outside your handler stays as long as the lambda function stays warm.

Only when the Lambda function cold starts is all data lost (both memory and the tmp folder). It would then need to be fetched from S3 again.

It sounds like you might be initializing your cache inside the lambda function? What happens if you initialize it outside?
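To illustrate the difference, here is a minimal sketch. It does not use the real library: `createCache` is a hypothetical stand-in for `require("s3-cached")({...})` that only counts how many cache instances get built, and `handler` / `handlerWithCacheInside` are illustrative names. The assumption being shown is that code at module scope runs once per cold start, while code inside the handler runs on every invocation.

```javascript
// Hypothetical stand-in for require("s3-cached")({...}).
// It counts how many cache instances are created.
let cachesCreated = 0;

function createCache(options) {
  cachesCreated += 1;
  const memory = new Map();
  return {
    async getJsonObjectCached(key) {
      if (!memory.has(key)) {
        // A real cache would fetch the object from S3 here.
        memory.set(key, { key, directory: options.diskTmpDirectory });
      }
      return memory.get(key);
    }
  };
}

// Module scope: runs once per cold start and is reused while the function stays warm.
const cache = createCache({ diskTmpDirectory: "/tmp/P8-cache" });

// Handler scope: runs on every invocation, but reuses the cache created above.
async function handler(event) {
  return cache.getJsonObjectCached(event.key);
}

// For comparison: building the cache inside the handler creates a new instance per invocation.
async function handlerWithCacheInside(event) {
  const freshCache = createCache({ diskTmpDirectory: "/tmp/P8-cache" });
  return freshCache.getJsonObjectCached(event.key);
}
```

With the first handler, repeated warm invocations share one cache instance; with the second, every invocation starts a new instance, which would match the repeated cache files described earlier in this thread.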

maxime-guyot commented 5 years ago

I didn't know that about memory. That's great!

So, it's ok 👍 Thanks

simlu commented 5 years ago

No worries. Good luck. Closing this ticket again.