root-project / root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
https://root.cern

Have a more sensible default for `TFileMerger::fMaxOpenedFiles` #11276

Open chaen opened 2 years ago

chaen commented 2 years ago

Explain what you would like to see improved

By default, hadd/TFileMerger will open as many files concurrently as the process is allowed to have open.

https://github.com/root-project/root/blob/3dd47b9da2519f0e98762a8f0be43971b5ed8f5b/io/io/src/TFileMerger.cxx#L61-L93

This seems too large as a default, and it has even led to problems at sites (https://ggus.eu/?mode=ticket_info&ticket_id=153653).
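
For context, the default tracks the process's open-file limit (see also the getrlimit/ulimit discussion further down). A minimal sketch of querying that limit, using plain POSIX rather than the actual TFileMerger code:

```cpp
// Minimal sketch (plain POSIX, not the actual TFileMerger code): query the
// soft open-file limit that such a default is derived from.
#include <sys/resource.h>
#include <cstdio>

int main()
{
   rlimit lim;
   if (getrlimit(RLIMIT_NOFILE, &lim) == 0) {
      // On many grid worker nodes this is 1024 or (much) higher, which then
      // becomes the number of simultaneously opened inputs.
      std::printf("soft open-file limit: %llu\n",
                  static_cast<unsigned long long>(lim.rlim_cur));
   }
   return 0;
}
```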

Optional: share how it could be improved

Have a more sensible default? 32? 64?
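
Whatever default is chosen, the value can already be capped explicitly. A sketch of doing so from a ROOT macro, assuming the public setter TFileMerger::SetMaxOpenedFiles (the counterpart of fMaxOpenedFiles); the file names are placeholders, and on the command line the equivalent knob is hadd's -n option mentioned later in this thread:

```cpp
// Sketch: cap the number of concurrently opened inputs from user code.
// Assumes the public setter TFileMerger::SetMaxOpenedFiles (the counterpart
// of fMaxOpenedFiles); the output/input names below are placeholders.
#include "TFileMerger.h"

void merge_capped()
{
   TFileMerger merger;
   merger.SetMaxOpenedFiles(64);  // e.g. 64, well below a pool's transfer limit
   merger.OutputFile("merged.root");
   merger.AddFile("root://example.host//store/input1.root");
   merger.AddFile("root://example.host//store/input2.root");
   merger.Merge();
}
```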

pcanal commented 2 years ago

The performance of hadd/TFileMerger for some types of files (e.g. with histograms) is proportional to the number of batches of files processed, so reducing the maximum number of open files could significantly decrease performance.

As a compromise, I propose reducing the max value to 256 (already a factor of 4 slower compared to the typical 1024).

Would that be okay for your situation?
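
To make the batching cost concrete, here is an illustrative back-of-the-envelope calculation, assuming (as stated above) that the merge time is proportional to the number of passes over the inputs:

```cpp
// Illustration only: the number of merge passes grows roughly as
// ceil(nInputs / maxOpenedFiles), which is where the "factor 4" for
// going from 1024 to 256 comes from.
#include <cstdio>

int main()
{
   const int nInputs = 1024;
   const int candidates[] = {1024, 256, 64, 32};
   for (int maxOpen : candidates) {
      const int passes = (nInputs + maxOpen - 1) / maxOpen;
      std::printf("maxOpenedFiles = %4d -> %2d pass(es)\n", maxOpen, passes);
   }
   return 0;
}
```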

onnozweers commented 2 years ago

256 wouldn't have helped in the mentioned GGUS ticket. There was an input set of 133 files, of which more than 24 were on a single pool with a limit of 24 concurrent transfers.

Another approach could be to detect this situation in hadd and then suggest the use of -n.

pcanal commented 2 years ago

I see. So this is actually a case where the limit is not known to the OS / through getrlimit. How can we locally detect the limit of concurrent transfers for a given server?

pcanal commented 2 years ago

We cannot access the original ticket. What is the actual hardware that was being accessed? Was it mounted as a local-appearing file system, or was it being accessed via a remote protocol (i.e. the file names were prefixed with root://...)?

To a first approximation, I don't see how we could detect your use case. If you were able to set ulimit locally to whatever limit fits your need, hadd would automatically adjust.
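
For reference, lowering the soft limit is what `ulimit -n` does in the launching shell; a minimal POSIX sketch of the same operation done inside a process (not ROOT-specific code):

```cpp
// Minimal POSIX sketch: lower the soft open-file limit of the current process,
// which is what `ulimit -n 64` does in the shell that launches hadd; hadd
// would then adjust to the lower value automatically, as described above.
#include <sys/resource.h>
#include <cstdio>

int main()
{
   rlimit lim;
   if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
      std::perror("getrlimit");
      return 1;
   }
   lim.rlim_cur = 64;  // the soft limit may be lowered freely,
                       // but not raised above lim.rlim_max
   if (setrlimit(RLIMIT_NOFILE, &lim) != 0) {
      std::perror("setrlimit");
      return 1;
   }
   return 0;
}
```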

onnozweers commented 2 years ago

Hi @pcanal,

The input is a list of 133 files in the format root://x509up_u@xrootd.grid.surfsara.nl//pnfs/grid.sara.nl/data/lhcb/LHCb_USER/lhcb/user/v/username/2021_08/520789/520789382/x24mu__wmomsc_a.root.

The limit is in the dCache storage system (xrootd.grid.surfsara.nl), not on the client side. This limit is there for a reason: it prevents the storage pools from being overloaded with transfers and crashing.

When hadd tries to open all files at once, it tries to read more files concurrently than the per-pool limit in dCache allows. The first files are served, but the rest of the transfers are queued. This means that they remain open but zero bytes are served until some of the other transfers finish. But hadd never finishes those, because it insists on reading from all files at the same time. So it gets stuck in a deadlock.

If hadd detected this situation (I'm getting data for some files but zero bytes for other files), it would make sense to stop reading all files concurrently and instead keep reading from the files that it can, close those files, and then receive the data for the other files.

If hadd could do that, such a deadlock would be prevented, while performance would still be the maximum available.

Cheers, Onno
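
A heavily simplified sketch of that idea; this is hypothetical and not ROOT code, and IsServingData()/MergeAndClose() are stand-ins for information the I/O layer would have to expose (which is exactly the open question in the next comment):

```cpp
// Hypothetical illustration of the strategy described above (not ROOT code):
// merge whatever inputs are currently being served, close them to free
// transfer slots on the pool, then come back to the inputs that were queued.
// IsServingData() and MergeAndClose() are placeholders for functionality the
// I/O layer would have to expose.
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

bool IsServingData(const std::string & /*url*/) { return true; }  // placeholder
void MergeAndClose(const std::string &url) { std::printf("merged %s\n", url.c_str()); }

int main()
{
   std::vector<std::string> pending = {"root://pool//a.root",
                                       "root://pool//b.root",
                                       "root://pool//c.root"};
   while (!pending.empty()) {
      std::vector<std::string> stillQueued;
      for (const auto &url : pending) {
         if (IsServingData(url))
            MergeAndClose(url);          // frees a transfer slot on the pool
         else
            stillQueued.push_back(url);  // revisit once slots have been freed
      }
      if (stillQueued.size() == pending.size())
         break;                          // no progress this pass; avoid spinning
      pending = std::move(stillQueued);
   }
   return 0;
}
```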

pcanal commented 1 year ago

> This means that they remain open but zero bytes are served […] (I'm getting data for some files but zero bytes for other files)

Do you know if the xrootd routines just "hang" in that case or return with a request to retry later? If they just hang, there is not much I can see us doing to detect the case, unless there is an xrootd routine that detects/supports this case that we could replace the current call with (and we would need some help to update the xrootd plugin in ROOT to support and test this).

onnozweers commented 1 year ago

I'm afraid I don't know the answer to your question. But I understand the difficulty.

There might be a more pragmatic approach: if reading remote files fails, hadd could quit with the suggestion to try again with the -n parameter. There might be problems to which -n is not the solution, but it takes only a minute to try, so why not suggest it.