Novartis / cellxgene-gateway

Cellxgene Gateway allows you to use the Cellxgene Server provided by the Chan Zuckerberg Initiative (https://github.com/chanzuckerberg/cellxgene) with multiple datasets.
Apache License 2.0

Force update list of files without restarting server? #59

Closed arogozhnikov closed 2 years ago

arogozhnikov commented 2 years ago

Hello and thanks for a great tool.

I'm currently using your tool to display data stored in an S3 folder.

When a new file is produced and uploaded to S3, the list of files is not updated, so there is no way to display that data. Unfortunately, restarting the gateway is non-trivial. Could you consider adding a button that updates the list of files? Thank you!

alokito commented 2 years ago

Hi @arogozhnikov! It should be possible to efficiently check for files modified after a particular time in S3, per https://cloudaffaire.com/faq/how-to-list-aws-s3-bucket-contents-by-modified-date/. I would like to try that first. Implementing an update button would complicate the user interface, as well as the internal interfaces of cellxgene-gateway.

alokito commented 2 years ago

Okay, it seems I was mistaken. After some additional reading, it appears that the query filtering happens client side. I will try to come up with some reasonably generic way to enable clearing of the s3cache. My current thinking is to add a query parameter to the list page, perhaps refresh=true, which tells it to clear the cache before listing the bucket contents. Does this sound reasonable?
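
To illustrate why client-side filtering does not save any listing work, here is a minimal sketch. It assumes the bucket listing has already been fetched as a list of dicts carrying `Key` and `LastModified`, the shape of the `Contents` entries in an S3 list response; `modified_after` and the sample keys are hypothetical names used only for illustration.

```python
from datetime import datetime, timezone
from typing import Iterable, List, Mapping


def modified_after(objects: Iterable[Mapping], cutoff: datetime) -> List[str]:
    """Return the keys of objects modified later than `cutoff`.

    The filter runs locally, so every entry must already have been
    fetched from the bucket before any of them can be discarded.
    """
    return [o["Key"] for o in objects if o["LastModified"] > cutoff]


# Hypothetical listing, shaped like the "Contents" entries of an S3 list response.
listing = [
    {"Key": "data/old.h5ad", "LastModified": datetime(2021, 1, 1, tzinfo=timezone.utc)},
    {"Key": "data/new.h5ad", "LastModified": datetime(2021, 6, 1, tzinfo=timezone.utc)},
]

recent = modified_after(listing, datetime(2021, 3, 1, tzinfo=timezone.utc))
print(recent)
```

Since the whole listing is transferred either way, filtering by date costs about the same as re-listing the bucket.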

alokito commented 2 years ago

@arogozhnikov please take a look at https://github.com/Novartis/cellxgene-gateway/pull/60. The PR adds a "refresh" query param to trigger a refresh of the S3 cache. I couldn't think of a generic way to add it to the UI, but you are free to modify the template to add it, or to link directly to a URL that ends in e.g. `filecrawl.html?refresh=true`. Please let me know if this meets your use case and I will merge and cut a new release.
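
The idea of a refresh parameter that clears a listing cache can be sketched roughly as follows. This is not the code from the PR: `ListingCache`, `wants_refresh`, and the in-memory `bucket` are hypothetical stand-ins, and the accepted parameter values ("1" or "true") are an assumption.

```python
from typing import Callable, Dict, List


class ListingCache:
    """Caches listings per prefix and can be cleared on request."""

    def __init__(self, list_bucket: Callable[[str], List[str]]) -> None:
        self._list_bucket = list_bucket  # function that really lists the bucket
        self._cache: Dict[str, List[str]] = {}

    def listing(self, prefix: str, refresh: bool = False) -> List[str]:
        if refresh:
            self._cache.clear()  # drop every cached listing before answering
        if prefix not in self._cache:
            self._cache[prefix] = list(self._list_bucket(prefix))
        return self._cache[prefix]


def wants_refresh(query: Dict[str, str]) -> bool:
    """Assumed parsing rule: refresh=1 or refresh=true turns the flag on."""
    return query.get("refresh", "").lower() in {"1", "true"}


# Hypothetical in-memory "bucket" standing in for S3.
bucket = ["a.h5ad"]
cache = ListingCache(lambda prefix: [k for k in bucket if k.startswith(prefix)])

cache.listing("")                 # first call fills the cache
bucket.append("b.h5ad")           # a new file is uploaded afterwards
stale = list(cache.listing(""))   # still served from the cache
fresh = list(cache.listing("", refresh=wants_refresh({"refresh": "true"})))
```

Without the flag the new file stays invisible until the cache is cleared; with it, the next listing goes back to the bucket.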

arogozhnikov commented 2 years ago

@alokito thank you, "?refresh=1" worked for me.

More broadly, I think it would be wiser to always refresh when listing, since listing a folder is very cheap and fast.

It is the same as with a local file system: you request the listing each time rather than caching the folder contents locally. Does that make sense?

alokito commented 2 years ago

@arogozhnikov This should be addressed in v0.3.9, please confirm.

arogozhnikov commented 2 years ago

@alokito just tested and it works. Thank you so much for this tool!