Closed arogozhnikov closed 2 years ago
Hi Alex! What do you think about adding a S3_DISABLE_LISTINGS_CACHE environment variable to control this behavior? That way we would have backwards compatibility for people who like the old behavior.
Why would one need to have non-fresh listing? Imagine you use file browser, but it does not show files that were recently created, but shows the ones that were deleted. You click a file, and get an error...
If you have some use case for the old behavior (or maybe any concerns about new behavior), please let me know, but default s3fs behavior just doesn't seem to fit cellxgene-gateway
Why would one need to have non-fresh listing?
@arogozhnikov the reason is for performance. The page will load slightly faster if it does not need to do a new listing each time. This becomes more important as the size of the S3 bucket increases, and as the number of concurrent users grows. We probably need to keep the cache-refresh parameter for the use case where one adds/deletes files from the bucket. I'll try to get these changes into a release soon once we sort out the details.
@alokito I see your concern, added this environment variable and by default behavior is backward-compatible.
Can you please review and merge? Thanks!
@arogozhnikov After sleeping on it, I think your use case is the more common one. I'll merge this, but change the variable name to S3_ENABLE_LISTINGS_CACHE and switch the logic before making the release. Hopefully that will work for everyone.
This prevents from reflecting wrong structure and allows simple incorporation of newly uploaded datasets without need to restart the server