CoffeaTeam / fsspec-xrootd

An XRootD implementation for fsspec
https://coffeateam.github.io/fsspec-xrootd/
BSD 3-Clause "New" or "Revised" License
3 stars 5 forks source link

Implement a file handle cache #54

Closed nsmith- closed 6 months ago

nsmith- commented 8 months ago

To prevent constantly opening and closing in calls to cat_file, introduce a fs-level cache of open read-only file handles. If a file handle needs to be opened for writing, we first close any read-only handles to the same path. Since fsspec by design keeps AbstractFileSystem objects alive indefinitely through a metaclass, we implement a TTL mechanism to expire our client's read-only handles so that other clients may have a chance of succeeding in opening the file for writing. If not, other clients may encounter:

[3003] Output file ... is already opened by 1 reader; open denied.
lgray commented 8 months ago

Would it be reasonable to make this a cache with a TTL and/or the cache invalidates on being serialized?

lgray commented 8 months ago

https://cachetools.readthedocs.io/en/latest/#cachetools.TTLCache for instance.

nsmith- commented 8 months ago

TTL would be good, yes. In the case a loop is available, it should be easy to implement the expiry watchdog with an asyncio.sleep. We can't use off-the-shelf stuff from cachetools because it is not async-friendly. I found aiocache but it seems to be targeting something else.

nsmith- commented 8 months ago

@chrisburr if you can also give this a review that'd be good

lgray commented 8 months ago

FWIW you can probably be much more aggressive with the default TTL, even if it's 5s you're not spamming the server with open requests at that point. Or do studies indicate this is not the case?

chrisburr commented 8 months ago

5 seconds is very short, I've not

As it stands it looks like a file could be closed while a read is actively running? Maybe you need a semaphore around the file handle so it's only eligable for removal when the count of active users is zero.

nsmith- commented 8 months ago

As it stands it looks like a file could be closed while a read is actively running? Maybe you need a semaphore around the file handle so it's only eligable for removal when the count of active users is zero.

If we require the ttl value to be greater than the timeout, and the LRU max_size=None, I think this is prevented. This isn't currently the case. We can also add a semaphore, e.g. returning tuple[client.File, asyncio.Semaphore] to be used in all async operations and hold the file alive. But perhaps xrootd can do this for us: in the case of attempting to close while reading, either the close or the read operation should fail. If the close operation fails we can just ignore such failures and let the _prune_cache attempt the close again later. I'll write a test for it in a bit and try it out.

nsmith- commented 8 months ago

Ok, neither operation fails. I guess it just completes the read. In this case there shouldn't be any issue.

nsmith- commented 7 months ago

@lobis (and/or @chrisburr ) can you review this?

lgray commented 6 months ago

repinging this

@lobis (and/or @chrisburr ) can you review this?