CoffeaTeam / fsspec-xrootd

An XRootD implementation for fsspec
https://coffeateam.github.io/fsspec-xrootd/
BSD 3-Clause "New" or "Revised" License
3 stars 5 forks source link

Performance impact of opening a new XRootD `File` in every `cat` operation (TBasket in Uproot) #55

Closed jpivarski closed 4 months ago

jpivarski commented 8 months ago

Reported by @chrisburr in https://github.com/scikit-hep/uproot5/issues/1157#issuecomment-1979731811_:

The rest of the difference is coming from the fsspec source opening the file twice:

[2024-03-05 23:00:45.735325 +0100][Debug  ][ExDbgMsg          ] [ccsrm.ihep.ac.cn:1094] MsgHandler created: 0x15e5ffa0 (message: kXR_open (file: /dpm/ihep.ac.cn/home/lhcb/LHCb/Collision18/LEPTONIC.MDST/00092248/0000/00092248_00002347_1.leptonic.mdst, mode: 00, flags: kXR_open_read kXR_async kXR_retstat ) ).
[2024-03-05 23:00:50.783995 +0100][Debug  ][ExDbgMsg          ] [ccsrm.ihep.ac.cn:1094] MsgHandler created: 0x15f25490 (message: kXR_stat (path: /dpm/ihep.ac.cn/home/lhcb/LHCb/Collision18/LEPTONIC.MDST/00092248/0000/00092248_00002347_1.leptonic.mdst, flags: none) ).
FSSpecSource.chunk(start=0, stop=403)
[2024-03-05 23:00:51.071420 +0100][Debug  ][ExDbgMsg          ] [dpmlhcb01.ihep.ac.cn:1095] MsgHandler created: 0x1781f720 (message: kXR_read (handle: 0x00000000, offset: 0, size: 403) ).
FSSpecSource.chunk(start=3872048242, stop=3872048555)
[2024-03-05 23:00:51.376154 +0100][Debug  ][ExDbgMsg          ] [dpmlhcb01.ihep.ac.cn:1095] MsgHandler created: 0x1781f510 (message: kXR_read (handle: 0x00000000, offset: 3872048242, size: 313) ).
FSSpecSource.chunk(start=3867679172, stop=3867679853)
[2024-03-05 23:00:51.682223 +0100][Debug  ][ExDbgMsg          ] [dpmlhcb01.ihep.ac.cn:1095] MsgHandler created: 0x15f25740 (message: kXR_read (handle: 0x00000000, offset: 3867679172, size: 681) ).
FSSpecSource.chunks(ranges=[(3867678939, 3867679107)])
[2024-03-05 23:00:51.995422 +0100][Debug  ][ExDbgMsg          ] [ccsrm.ihep.ac.cn:1094] MsgHandler created: 0x23f04350 (message: kXR_open (file: /dpm/ihep.ac.cn/home/lhcb/LHCb/Collision18/LEPTONIC.MDST/00092248/0000/00092248_00002347_1.leptonic.mdst, mode: 074, flags: kXR_open_read kXR_async kXR_retstat ) ).
[2024-03-05 23:00:52.588776 +0100][Debug  ][ExDbgMsg          ] [dpmlhcb01.ihep.ac.cn:1095] MsgHandler created: 0x15f25490 (message: kXR_read (handle: 0x01000000, offset: 3867678939, size: 168) ).
[2024-03-05 23:00:52.890174 +0100][Debug  ][ExDbgMsg          ] [dpmlhcb01.ihep.ac.cn:1095] MsgHandler created: 0x17826850 (message: kXR_close (handle: 0x01000000) ).
Took 7.651638916009688 seconds
[2024-03-05 23:00:53.264742 +0100][Debug  ][ExDbgMsg          ] [dpmlhcb01.ihep.ac.cn:1095] MsgHandler created: 0x27b2c1c0 (message: kXR_close (handle: 0x00000000) ).

The second one is caused by FSSpecSource.chunks calling _cat_file:

https://github.com/scikit-hep/uproot5/blob/e47934f32bd16439a2ca9e92428d2a9a4610a144/src/uproot/source/fsspec.py#L164

which opens a new file behind the scenes for every call:

https://github.com/CoffeaTeam/fsspec-xrootd/blob/b12503eb852f82f0b6bf85e1df02b2b683c1f819/src/fsspec_xrootd/xrootd.py#L392-L399

In my experience, these File objects are heavy; slow to open. fsspec's cat interface is stateless, so it seems that you have to create a new one of these for every call, but that means every TBasket in Uproot.

Is there an alternative that we can use, some multi_cat or a context that holds the File object so that we don't need so many? Is there a way to use XRootD in a lightweight, stateless way (like HTTP connections)?

nsmith- commented 8 months ago

So, I'm going to try to write up my performance study in https://github.com/scikit-hep/uproot5/issues/1157 to collect all in one spot, but yes fsspec has cat_ranges which allows to bypass this particular multi-open issue. I also looked into caching open file handles in #54 but I think this is really unreliable because the server disallows opening a file for writing if any reader is connected, and this cache leaks reader handles. To do it right we need to use (async or regular) context managers for all file access.

nsmith- commented 8 months ago

Is there a way to use XRootD in a lightweight, stateless way (like HTTP connections)?

I spent a bit of time this morning reading the protocol doc https://xrootd.slac.stanford.edu/doc/dev56/XRdv520.pdf and I don't see any obvious way to interact with an xrootd server in a stateless way: one needs to acquire an opaque char[4] fHandle in kXR_open to then pass as an argument to kXR_read.

chrisburr commented 8 months ago

I think this is really unreliable because the server disallows opening a file for writing if any reader is connected

Does anyone actually write ROOT files directly over XRootD? I understand why fsspec-xrootd would want to support it but for uproot a keyword argument on the filesystem class that enables the cache and prevents writing might be enough, at least as a short term fix.

jpivarski commented 8 months ago

It is now technically possible, but I don't think it's a good idea. Writing a ROOT file involves a lot of seeking back and forth (they can't be written directly from beginning to end, unless the sizes of everything that is to be written is known in advance), and that would mean a lot of interaction over the network.

Since it was a requested feature, we can't break it, but we don't need to ensure that it is the optimal path. If reopening the file is necessary for writing but not for reading, that would be fine.

nsmith- commented 8 months ago

Writing files in general over xrootd is a very desired feature. For example, I am writing several GB of parquet files to FNAL EOS storage in my skim example . It works quite well. I would hope we extend uproot writing to support fsspec sinks, using the simplecache local cache feature to only write (commit) the whole file at the end of the writing process.

All that said, I am happy to re-start work on #54

jpivarski commented 8 months ago

That's just the thing: the Parquet format is defined in such a way that all metadata that needs to know the sizes of things gets written after (at larger seek values) than the data it represents. With causal knowledge of only the past, it can be written from the beginning to the end of the file, in order. That can't be done with the ROOT format, especially if the file is to be valid between writing individual objects and if sizes of everything that will be written isn't known in advance.