Open AlkaidCheng opened 8 months ago
Description
When the input files to RDataFrame are remote, forced caching of remote files does not work, and the files are downloaded again on every run.
Reproducer

    import os
    import ROOT

    user = os.environ['USER']
    outdir = f"/eos/user/{user[0]}/{user}"
    filename = os.path.join(outdir, "test.root")

    # create dummy root file
    ROOT.RDataFrame(100).Define("x", "1").Snapshot("test", filename)

    # cache directory /tmp, operateDisconnected=True, forceCacheread=True
    ROOT.TFile.SetCacheFileDir("/tmp", True, True)

    # this does not trigger loading of cached root file
    ROOT.RDataFrame("test", f"root://eosuser.cern.ch/{filename}").Sum("x").GetValue()
This happens because RDataFrame internally creates a TChain via ROOT.Internal.TreeUtils.MakeChainForMT(treename), which constructs the TChain in the mode ROOT.TChain.kWithoutGlobalRegistration. That mode forces the TFile open option to be "READ_WITHOUT_GLOBALREGISTRATION". As a result, the TFile is opened without caching, because the fgCacheFileForce flag is only checked when the option is "READ".
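The mismatch described above can be illustrated with a small pure-Python model. This is only a sketch of the behaviour as reported in this issue, not ROOT's actual source: the helper names `is_remote_url` and `uses_forced_cache` are hypothetical, and the "exact match on READ" rule is taken from the description above.

```python
# Simplified model of the caching decision described in this issue.
# NOTE: hypothetical helpers, not ROOT code; the exact-match rule is an
# assumption based on the report above.
SUFFIX = "_WITHOUT_GLOBALREGISTRATION"


def is_remote_url(url: str) -> bool:
    """Treat anything with a non-file scheme (e.g. root://) as remote."""
    return "://" in url and not url.startswith("file://")


def uses_forced_cache(url: str, option: str, cache_force: bool) -> bool:
    """Forced caching is considered only when the option is exactly READ."""
    return cache_force and is_remote_url(url) and option.upper() == "READ"


url = "root://eosuser.cern.ch//eos/user/u/user/test.root"

# Plain TFile open: the forced-cache path is taken.
print(uses_forced_cache(url, "READ", True))           # True

# Option as set by a TChain without global registration: cache is skipped.
print(uses_forced_cache(url, "READ" + SUFFIX, True))  # False
```

In this model the only difference between the two calls is the suffix on the option string, which matches the observation that the same file is cached when opened directly but not when read through RDataFrame.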
I think one possible solution is to manually edit the options inside TFile::Open so that the _WITHOUT_GLOBALREGISTRATION suffix does not interfere with the remote caching decision.
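A minimal sketch of that idea, again as a pure-Python model rather than a patch to ROOT: the suffix is split off first, and the caching decision is made on the remaining base option. The function names `split_open_option` and `uses_forced_cache_fixed` are hypothetical and exist only for illustration.

```python
# Sketch of the proposed fix: separate the registration suffix from the
# base option before deciding on caching.
# NOTE: hypothetical helpers, not ROOT code.
SUFFIX = "_WITHOUT_GLOBALREGISTRATION"


def split_open_option(option: str) -> tuple[str, bool]:
    """Return (base_option, register_globally) for an open option string."""
    opt = option.upper()
    if opt.endswith(SUFFIX):
        return opt[: -len(SUFFIX)], False
    return opt, True


def uses_forced_cache_fixed(option: str, cache_force: bool, remote: bool) -> bool:
    """Decide on forced caching using only the base option."""
    base, _register_globally = split_open_option(option)
    return cache_force and remote and base == "READ"


print(split_open_option("READ" + SUFFIX))                      # ('READ', False)
print(uses_forced_cache_fixed("READ" + SUFFIX, True, True))    # True
print(uses_forced_cache_fixed("UPDATE" + SUFFIX, True, True))  # False
```

Keeping the registration flag as a separate return value means the file can still be opened without global registration while the caching check sees the plain "READ" option.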
ROOT version
6.30/04 (LCG105a)
Installation method
LCG (Swan)
Operating system
Linux
Additional context
No response