enso-org / enso

Hybrid visual and textual functional programming.
https://enso.org
Apache License 2.0

Caching basic `Enso_File` metadata for more efficient access #9598

Open radeusgd opened 3 months ago

radeusgd commented 3 months ago

Currently, every time any `Enso_File` metadata is requested or the file is read, we call the `get_file_details` cloud endpoint to look up the information. This results in many requests when fetching several pieces of information about a single file.

Moreover, when we discover files through the `list_directory` endpoint (`Enso_File.list`), that endpoint already returns at least some of this metadata. Currently Enso discards it, so when we later need e.g. the URL to read a file, we fetch it again.

Thus downloading a directory containing N files uses 2N+1 requests (1 request to list the directory, N requests to get the file details with the presigned URL for each file, and N requests to download each file), whereas with some caching we could get this down to N+1 requests (1 request to list the directory, keeping the presigned URLs it returns, and N requests to download each file).
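To make the request count concrete, here is a minimal Python sketch rather than Enso's actual implementation. `FakeCloud` and both download functions are hypothetical stand-ins. The method names mirror the endpoints mentioned above, but the response shape and the `example.invalid` URLs are assumptions made for illustration.

```python
class FakeCloud:
    """Hypothetical stand-in for the cloud API; counts every request made."""

    def __init__(self, names):
        self.names = list(names)
        self.requests = 0

    def list_directory(self):
        self.requests += 1
        # Assumption: the listing already carries a presigned URL per file.
        return [{"name": n, "url": f"https://example.invalid/{n}"} for n in self.names]

    def get_file_details(self, name):
        self.requests += 1
        return {"name": name, "url": f"https://example.invalid/{name}"}

    def download(self, url):
        self.requests += 1
        return b"contents of " + url.rsplit("/", 1)[-1].encode()


def download_dir_uncached(cloud):
    # 1 listing + N detail lookups + N downloads = 2N+1 requests.
    entries = cloud.list_directory()
    return [cloud.download(cloud.get_file_details(e["name"])["url"]) for e in entries]


def download_dir_cached(cloud):
    # 1 listing + N downloads = N+1 requests: the URLs from the listing are reused.
    details = {e["name"]: e for e in cloud.list_directory()}
    return [cloud.download(d["url"]) for d in details.values()]
```

With five files, the first variant issues 11 requests and the second issues 6, which matches the 2N+1 and N+1 counts above.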

One caveat of this approach is that by introducing a cache, we may miss changes happening 'live' on the Cloud: e.g. if another user is modifying a file, we may see its old contents or stale metadata such as size or update time until the cache entry expires. We need to let users bypass the cache when they are working with more 'live' data.

radeusgd commented 2 months ago

Most of this was already implemented as part of #9686.

What remains is how to best allow the user to control this in the GUI.

Currently we use global state that is not well integrated with the GUI. We may want to consider the suggested `my_file.with_cache_timeout` and `my_file.without_caching` methods.
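As a rough illustration of what per-file control could look like, here is a minimal Python sketch, not the Enso API. Only the method names `with_cache_timeout` and `without_caching` come from the suggestion above. The `CachedFile` class, the `fetch_details` callback, the injectable `clock`, and the 60-second default are all assumptions made for this example.

```python
import time


class CachedFile:
    """Hypothetical file handle that caches its metadata for a limited time."""

    def __init__(self, fetch_details, cache_timeout=60.0, clock=time.monotonic):
        self._fetch = fetch_details    # callable that performs the real request
        self._timeout = cache_timeout  # seconds; None disables caching
        self._clock = clock
        self._cached = None            # (timestamp, details) or None

    def with_cache_timeout(self, seconds):
        # Returns a new handle with its own empty cache; the original is unchanged.
        return CachedFile(self._fetch, seconds, self._clock)

    def without_caching(self):
        # Returns a new handle that always goes to the cloud.
        return CachedFile(self._fetch, None, self._clock)

    def details(self):
        now = self._clock()
        if self._timeout is not None and self._cached is not None:
            stamp, value = self._cached
            if now - stamp < self._timeout:
                return value  # still fresh, no request needed
        value = self._fetch()
        if self._timeout is not None:
            self._cached = (now, value)
        return value
```

Returning a new handle instead of mutating global state keeps the setting attached to the value flowing through the workflow, which may be easier to expose in the GUI than a global switch.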