Closed Moelf closed 1 year ago
ah ha, @grasph, looks like this is automatically "fixed" on Julia 1.10:
(Julia startup banner) Version 1.9.3 (2023-08-24), official https://julialang.org/ release
julia> using UnROOT
[ Info: Precompiling UnROOT [3cd96dde-e98d-4713-81e9-a4a1b0235ce9]
julia> f = @time ROOTFile("./nanoaod_ttbar.root");
0.257723 seconds (304.96 k allocations: 20.962 MiB, 4.06% gc time, 97.23% compilation time: 21% of which was recompilation)
julia> tree = @time LazyTree("./nanoaod_ttbar.root", "Events");
24.559936 seconds (11.57 M allocations: 1.527 GiB, 0.44% gc time, 97.57% compilation time: <1% of which was recompilation)
(Julia startup banner) Version 1.10.0-beta3 (2023-10-03), official https://julialang.org/ release
julia> f = @time ROOTFile("./nanoaod_ttbar.root");
0.144006 seconds (235.61 k allocations: 17.266 MiB, 98.26% compilation time)
julia> tree = @time LazyTree("./nanoaod_ttbar.root", "Events");
1.704714 seconds (11.09 M allocations: 1.500 GiB, 5.74% gc time, 54.80% compilation time)
It might be a fun exercise to find out what we can do for v1.9, but note that, unlike https://github.com/JuliaHEP/UnROOT.jl/pull/279#issuecomment-1756069804, this is NOT a case of inference failure causing boxing, because the allocation counts and sizes are almost identical:
- 1.10: 11.09 M allocations: 1.500 GiB
- 1.9: 11.57 M allocations: 1.527 GiB
What magic is 1.10 doing here, I wonder?
Frames said this on Slack:
I know large numbers of keyword arguments (as in thousands) were causing a lot of problems for circuit modelling, because circuit models are often super parameterized. And I believe that kicked off at least a round of work to improve compiler handling of that, maybe 9-11 months ago. So it could be that work, which did make it into 1.10.
Thanks @Moelf
Some more comments:
It is still 17 times slower than RDataFrame and it does not scale well with the number of branches.
17x seems large, but I don't think it's problematic when it's 0.1 s vs. 1.4 s. Considering how long ROOT takes to compile, I'm not surprised. If anything, we can try to add more precompilation. I don't know how much wider a ROOT file gets; NanoAOD has 1.5k branches. I would argue that if a user deals with 5k columns or whatever, at that point they ought to know they should select columns.
Nothing scales infinitely to all cases. The current design has 2 main objectives, one of them being a good display(). The second point is coupled to the first, because naively you may want to just return a Dict or some lightweight DataFrame, but then your display() is all messed up; we need control over both types to customize display().
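To illustrate the display point, here is a minimal, self-contained sketch (the `MiniTree` wrapper and its column store are hypothetical stand-ins, not UnROOT's actual types): owning the table type lets us customize `Base.show`, whereas a bare `Dict` of columns prints as raw key/value dumps.

```julia
# Hypothetical sketch: a thin wrapper over an untyped column store,
# existing mainly so we can customize display().
struct MiniTree
    cols::Dict{Symbol,AbstractVector}
end

# Custom show: print a compact summary instead of dumping the raw Dict.
function Base.show(io::IO, t::MiniTree)
    print(io, "MiniTree with ", length(t.cols), " columns: ",
          join(sort!(collect(keys(t.cols))), ", "))
end

t = MiniTree(Dict{Symbol,AbstractVector}(:nMuon => [1, 2], :run => [10, 10]))
```

With the wrapper, `show(t)` prints `MiniTree with 2 columns: nMuon, run`; returning the inner `Dict` directly would give up that control.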
Besides the remaining latency with 1.10.x, asking users to move to the latest beta version gives them a bad impression of Julia, while the real issue is that we are pushing Julia to its limits by using long tuples.
I'm saying just wait. It's already Julia 1.10-beta3, so not that far from release.
Finally, I'm happy to have a sink keyword interface or something, taking inspiration from CSV.jl or Arrow.jl. But what those 2 packages are NOT good at, and don't care about at all, is the user experience out of the box (the display, the iteration capability). We probably shouldn't do that.
We also should think harder about what function names / APIs are actually gonna be called. I think LazyTree is not perfect considering RNTuple will come eventually. You want to convey 2 things:

- QuickRead(file, treename) is meant for column access; a for-loop over it is going to be slow.
- TypedTree(tree) is meant for row-iteration.

Alternatively, we can have UnROOT.read_root(file, treename, sink) return a DictTree by default, and have a TypedTree as an option.

NanoAOD has 1.5k branches
Just curious, why would anyone make that many branches? Are they used to emulate what should be arrays, or so? I'm not saying we shouldn't be able to handle it well, I'm just surprised at the sheer number.
they include systematics; there are O(10) different types of objects you reconstruct, each with a few dozen systematics. And then there are all the L1 and HLT triggers. Many branches are already jagged. https://cms-nanoaod-integration.web.cern.ch/integration/master/mc94X_doc.html
And we can't really do hierarchies - I get it.
@grasph I honestly think you also just want to tell them to select. Like, it's absurd to open all the branches unless it's your first day on the job.
I mean, first of all, tutorials should be given with slimmed files, and second, we don't have CMS nanoAOD systematics CP tools anyway, so it's pointless to read all branches.
Related issue: #272
These improvements with 1.10.x are great. Thanks, Jerry, for the investigations.
I think we should still have a low-latency option, especially to address data-frame-like analysis, for which fast row access is not required. It can be a parameter to the LazyTree constructor to switch to this option.
I made some tests showing that with some pre-compilation (for common column types) and untyped columns, we can reach a 0.3 s setup time. Measurements (done with Julia 1.9.1) are in the comments of the code below.
using UnROOT
file = ROOTFile("nanoaod.root");
tree = file["Events"];
brnames = keys(tree);
function f1(file, t, brs)
    # Collect LazyBranch objects into an abstractly-typed vector,
    # avoiding compilation of one huge NamedTuple type.
    res = Vector{LazyBranch}(undef, length(brs))
    for (i, x) in enumerate(brs)
        res[i] = LazyBranch(file, t[x])
    end
    res
end

function f4(file, t, brs)
    # Same idea, but as name => branch pairs, ready for Dict/DataFrame sinks.
    res = Vector{Pair{Symbol, LazyBranch}}(undef, length(brs))
    for (i, x) in enumerate(brs)
        res[i] = (Symbol(x) => LazyBranch(file, t[x]))
    end
    res
end
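A self-contained sketch of why the untyped-column approach above sidesteps the latency (toy vectors stand in for LazyBranch columns; this is an illustration, not UnROOT code): a NamedTuple bakes every column name and element type into its own *type*, so each distinct set of 1.5k branches is a fresh type for the compiler to specialize on, while a `Dict{Symbol,AbstractVector}` has one fixed type regardless of how many columns it holds.

```julia
# Toy columns standing in for LazyBranch objects (illustration only).
cols = [Symbol("col", i) => collect(Float64, 1:3) for i in 1:4]

nt = NamedTuple(cols)                    # column names/types live in the type itself
d  = Dict{Symbol,AbstractVector}(cols)   # one concrete type, any number of columns

# The NamedTuple's type name grows with the number of columns; the Dict's does not.
length(string(typeof(nt))) > length(string(typeof(d)))
```

This is the trade-off the typed/untyped LazyTree options are weighing: the NamedTuple buys type-stable row iteration at the cost of compiling a fresh, very wide type per file layout.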
#this list of branches cover all branch types found in the nanoaod.root test file
brwarmup = ["run","event","BeamSpot_type","BeamSpot_sigmaZ","nboostedTau","boostedTau_idAntiEle2018","boostedTau_jetIdx","boostedTau_charge","boostedTau_chargedIso","Electron_convVeto","TrigObj_id","L1Reco_step"]
# Timing provided in comments were measured with only one group
# of code between two #----... lines included. UnROOT v0.10.15
#----------------------------------------------------------------------
# 0.78s
@time f1(file, tree, brwarmup)
# 0.27s
@time brs = f1(file, tree, brnames);
#----------------------------------------------------------------------
#----------------------------------------------------------------------
##1.05 s
#@time brs = f1(file, tree, brnames);
##0.27 s
#@time brs = f1(file, tree, brnames);
#----------------------------------------------------------------------
#----------------------------------------------------------------------
#using DataFrames
## 1.04s
#@time "dict warmup" wu = Dict(f4(file, tree, brwarmup));
##0.95 seconds (1.09 M allocations: 73.972 MiB, 2.18% gc time, 99.83% compilation time)
#@time "warmup df wrapping" df = DataFrame(wu, copycols=false);
## 0.27s
#@time "dict" brs = Dict(f4(file, tree, brnames));
## 0.0015s (0.9 seconds without the dataframe-wrapping warmup)
#@time "df wrapping" df = DataFrame(brs, copycols=false);
## 0.15s
#@time "sum" sum(df.nMuon)
#----------------------------------------------------------------------
#----------------------------------------------------------------------
#using DataFrames
## 1.9s
#@time "DataFrame warmup" DataFrame(f4(file, tree, brwarmup), copycols=false);
## 0.28s
#@time "DataFrame" df = DataFrame(f4(file, tree, brnames), copycols=false);
## 0.15s
#@time "sum" sum(df.nMuon)
#----------------------------------------------------------------------
#----------------------------------------------------------------------
##1.34 s
#@time "Dict warmup" brs = Dict(f4(file, tree, brnames));
##0.29s
#@time "Dict" brs = Dict(f4(file, tree, brnames));
##0.21s
#@time "sum" sum(brs[:nMuon])
#----------------------------------------------------------------------
Thanks a lot! Can this wait? We can discuss with @tamasgal at the JuliaHEP workshop maybe.
It would be good to have a solution before the workshop, and we can discuss improvements there.
The main contributors to the number of branches of the CMS nanoAOD are actually the trigger flags: there is one branch per trigger line, and there are many lines.
Although the setup overhead is large compared to the time to read all branches, the main point is not reading all of them, but having access to all the branches from the LazyTree object.
The display issue also comes from compilation time: it took me 279 s to call show(::LazyTree) with UnROOT v0.10.17 and Julia 1.9.1, and 148 s with UnROOT v0.10.17 and Julia 1.10.0-beta3. So 1.10.x does NOT solve the issue for the display, i.e. when the LazyTree is instantiated interactively without a trailing ';'.
The issue is twofold:
The current solution for (1) is to pass a LazyTree to the constructor of the table type of choice. It's inefficient because LazyTree instantiation takes time. This can be solved by providing a method that fills any container following the Tables.jl interface, à la CSV.read, or a method like the f4() from my previous comment. It is simple to implement and quite independent from the existing code.
(2) can be addressed by providing two options for LazyTree: typed columns (the current NamedTuple option) and untyped columns. The code is more involved, but I already have a prototype.
I definitely tend towards the first approach: something like a generic interface which can be turned into a LazyTree or a DataFrame, similar to what you mentioned with CSV.read.
I have not spent enough time on this problem yet, to be honest. I use UnROOT on a daily basis, but only with a handful of (fairly large and complex) branches, and for those it of course works totally fine.
OK, what about this temporary stopgap solution: we add a new keyword argument, something like LazyTree(file, treename; sink = ...). If !isnothing(sink), then after forming the Dict{Symbol, LazyBranch}, we call the sink constructor. The (mental) interface here is that the sink should be compatible with a ColumnTable, so sink() must be able to take in our Dict.
This has two advantages:
CSV.read
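A minimal sketch of how such a sink keyword could behave (the function name `lazytree_sketch` and its signature are hypothetical illustrations, not the actual UnROOT API): after assembling the untyped column Dict, we simply hand it to the sink constructor.

```julia
# Hypothetical sketch of the proposed `sink` keyword; not UnROOT's real API.
function lazytree_sketch(cols::Dict{Symbol,<:AbstractVector}; sink = nothing)
    sink === nothing && return cols  # default: hand back the untyped column store
    return sink(cols)                # sink must accept a column-table-like Dict
end

cols = Dict{Symbol,AbstractVector}(:nMuon => [0, 2, 1], :run => [1, 1, 1])
res_default = lazytree_sketch(cols)                  # no sink: plain Dict
res_pairs   = lazytree_sketch(cols; sink = collect)  # any callable taking the Dict
```

With DataFrames.jl loaded, `sink = DataFrame` would mirror the `CSV.read(file, sink)` pattern discussed above.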
You would expect the LazyTree() function to return a LazyTree instance, so I think it's cleaner to have a new function, e.g. getTree(), that takes a sink type. If the sink type is LazyTree, then it will do the same as LazyTree().
You mentioned breaking changes, all the proposals I've made so far preserve backward compatibility.
(f4() is also a good option)
I agree. But I don't know what we should name the getTree() function yet, which is why I'm trying to defer naming this until 1.0. I think effectively this is a getTreeOrRNTuple(), but we should give it a good name so it's not too confusing that it does something different than plain getindex().
We can pick up a name and write in the documentation that it can change. An alternative name can be readTree(), although the reading is deferred.
Again, readTree doesn't capture the fact that you read an RNTuple using the same function.
If we pick up a name and make it public API, that's potentially one more thing to break on 1.0
same as for LazyTree ;)
Anyway, let's just add a sink option to LazyTree as you proposed. I've updated the code in the sink-option branch of my fork. Let me know if I should make a PR. See changes here (only one file modified).
There is still an issue with master (b59a63). Below are the LazyTree timings with the sink-opt patch on top of v0.10.15 and on master, for my nanoAOD file.
| Table type | master | v0.10.15 |
|---|---|---|
| LazyTree | 120 s | 21 s |
| DataFrames.DataFrame | 100 s | 2.3 s |
same as for LazyTree ;)
But we already have LazyTree, so adding a sink option is not breaking; we will just recommend a new interface at 1.0.
But if you add getTree() and then remove it at 1.0, that's an additional breaking change.
Can you show more complete code for what you're doing with DataFrame? I can investigate soon.
The issue came from the merge. I merged with origin/master instead of origin/main. Now it's fixed. I'll do the PR.
PS: while investigating, I found out why the TDirectory cache (now removed) was not working properly.
Originally posted by @grasph in https://github.com/JuliaHEP/UnROOT.jl/issues/273#issuecomment-1755720345