Closed nsmith- closed 7 months ago
Would it be reasonable to make this a cache with a TTL and/or the cache invalidates on being serialized?
TTL would be good, yes. In the case a loop is available, it should be easy to implement the expiry watchdog with an asyncio.sleep.
We can't use off-the-shelf stuff from cachetools because it is not async-friendly. I found aiocache
but it seems to be targeting something else.
@chrisburr if you can also give this a review that'd be good
FWIW you can probably be much more aggressive with the default TTL, even if it's 5s you're not spamming the server with open requests at that point. Or do studies indicate this is not the case?
5 seconds is very short, I've not
As it stands it looks like a file could be closed while a read is actively running? Maybe you need a semaphore around the file handle so it's only eligable for removal when the count of active users is zero.
As it stands it looks like a file could be closed while a read is actively running? Maybe you need a semaphore around the file handle so it's only eligable for removal when the count of active users is zero.
If we require the ttl
value to be greater than the timeout
, and the LRU max_size=None
, I think this is prevented. This isn't currently the case.
We can also add a semaphore, e.g. returning tuple[client.File, asyncio.Semaphore]
to be used in all async operations and hold the file alive. But perhaps xrootd can do this for us: in the case of attempting to close while reading, either the close or the read operation should fail. If the close operation fails we can just ignore such failures and let the _prune_cache
attempt the close again later. I'll write a test for it in a bit and try it out.
Ok, neither operation fails. I guess it just completes the read. In this case there shouldn't be any issue.
@lobis (and/or @chrisburr ) can you review this?
repinging this
@lobis (and/or @chrisburr ) can you review this?
To prevent constantly opening and closing in calls to cat_file, introduce a fs-level cache of open read-only file handles. If a file handle needs to be opened for writing, we first close any read-only handles to the same path. Since fsspec by design keeps AbstractFileSystem objects alive indefinitely through a metaclass, we implement a TTL mechanism to expire our client's read-only handles so that other clients may have a chance of succeeding in opening the file for writing. If not, other clients may encounter: