Closed Moelf closed 4 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 84.27%. Comparing base (0196924) to head (16eb431).
julia> a = LazyTree("./ss_U238_p=10bar_Rmin=292mm_Rmax=300mm.root", "G4Sim");
julia> a
 Row │ postZ                postTime               preKE                  preX                MeanLife     localTime               Edep                   Trac ⋯
     │ Float64              Float64                Float64                Float64             Float64      Float64                 Float64                Int3 ⋯
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ -102.05345950608458  3.0685201317363455e26  0.0                    49.73353991915856   2.034191e26  3.0685201317363455e26   0.0                    1 ⋯
   2 │ -102.04651578986307  3.0685201317363455e26  4.1979216899649145     49.73353991915856   -1.0         0.0005457291372516042   4.1979216899649145     3 ⋯
   3 │ -102.05346001428659  3.0685201317363455e26  0.07182831002864987    49.73353991915856   3.004037e15  2.3330065525051903e-6   0.07182831002864987    2 ⋯
   4 │ -102.05346001428659  3.0685201317567714e26  0.0                    49.733540045379584  3.004037e15  2.0425886986503205e15   0.0                    2 ⋯
   5 │ -102.0539008556144   3.0685201317567714e26  0.025275776502425962   49.733540045379584  -1.0         5.298457306211964e-6    0.025275776502425962   6 ⋯
   6 │ -106.84346294108786  3.0685201317567714e26  0.08209409297571678    49.733540045379584  -1.0         0.02146110939511165     0.0                    5 ⋯
   7 │ -2000.0              3.0685201317567714e26  0.08209409297571678    47.430551380285756  -1.0         8.503551864822667       0.0                    5 ⋯
   8 │ -102.05346001428659  3.0685201317567714e26  1.344596967101097e-7   49.733540045379584  0.0          7.878121781487446e-305  1.344596967101097e-7   4 ⋯
   9 │ -102.05302086535103  3.0685201317567714e26  0.04298589500133121    49.733540045379584  -1.0         1.041804810150761e-5    0.04298589500133121    8 ⋯
  10 │ -102.05346001428659  3.0685201317567714e26  1.050066202878952e-7   49.733540045379584  0.0          7.741051586449725e-306  1.050066202878952e-7   7 ⋯
  11 │ -102.05337759223123  3.0685201317567714e26  0.012766969704394504   49.733540045379584  -1.0         2.2819678030339488e-6   0.012766969704394504   10 ⋯
  12 │ -102.05346003677495  3.0685201317567714e26  3.0297087505459785e-8  49.733540045379584  1.00325e11   0.0002591365343473467   3.0297087505459785e-8  9 ⋯
  13 │ -102.05346003677495  3.0685201317567714e26  0.0                    49.73354002365522   1.00325e11   1.414757510502053e10    0.0                    9 ⋯
  14 │ -102.0850983567464   3.0685201317567714e26  0.6354402172300221     49.73354002365522   -1.0         0.0002512298153439289   0.09170627685882858    13 ⋯
  15 │ -102.09325779364904  3.0685201317567714e26  0.5437339403711935     49.679954554654216  -1.0         0.00046216017918337645  0.11757367433007723    13 ⋯
  16 │ -102.07580116793557  3.0685201317567714e26  0.42616026604111623    49.63375029065793   -1.0         0.000623815479346648    0.06776978396433882    13 ⋯
  17 │ -102.08504063006555  3.0685201317567714e26  0.35839048207677743    49.59869675442205   -1.0         0.0007576498289719425   0.07490690583766167    13 ⋯
  18 │ -102.07678505845898  3.0685201317567714e26  0.28348357623911574    49.62889262762954   -1.0         0.000861614254051986    0.042674996475252816   13 ⋯
  19 │ -102.09451113711731  3.0685201317567714e26  0.24080857976386294    49.65041254089083   -1.0         0.000948895940544778    0.1282858215721039     13 ⋯
  20 │ -102.09189861553278  3.0685201317567714e26  0.11252275819175905    49.6500709917623    -1.0         0.000985338653087699    0.03837228992128173    13 ⋯
  21 │ -102.08989108494839  3.0685201317567714e26  0.07415046827047733    49.646863130739476  -1.0         0.0010065402178248463   0.047574254567196175   13 ⋯
  22 │ -102.09023049871139  3.0685201317567714e26  0.026576213703281157   49.645250682518146  -1.0         0.0010121881336086595   0.026576213703281157   13 ⋯
  23 │ -103.57478100532907  3.0685201317567714e26  1.6320048207520537     49.73354002365522   -1.0         0.01572578470351275     0.0                    12 ⋯
  24 │ -730.8722678518567   3.0685201317567714e26  1.6320048207520537     51.29807043849364   -1.0         6.500054487921658       0.0                    12 ⋯
  25 │ -102.0534596801017   3.0685201317567714e26  5.041423719376326e-6   49.73354002365522   1.117712e22  0.0002572019758040831   5.041423719376326e-6   11 ⋯
  26 │ -102.0534596801017   3.0685363408853736e26  0.0                    49.73354012060899   1.117712e22  1.6209128602251994e21   0.0                    11 ⋯
  27 │ -102.04642108042141  3.0685363408853736e26  4.7745444062238676     49.73354012060899   -1.0         0.000606515507465791    4.7745444062238676     15 ⋯
  28 │ -102.05346024604968  3.0685363408853736e26  0.08312559378100559    49.73354012060899   3.432811e21  2.7993134822468136e-6   0.08312559378100559    14 ⋯
  29 │ -102.05346024604968  3.0685467785417935e26  0.0                    49.73354048515049   3.432811e21  1.0437656419799024e21   0.0                    14 ⋯
  30 │ -102.0578740887279   3.0685467785417935e26  4.686717724997379      49.73354048515049   -1.0         0.0005971372135061732   4.686717724997379      17 ⋯
  31 │ -102.05345986634265  3.0685467785417935e26  0.08304227498592809    49.73354048515049   7.284479e19  2.8973621866059515e-6   0.08304227498592809    16 ⋯
  ⋮  │ ⋮                    ⋮                      ⋮                      ⋮                   ⋮            ⋮                       ⋮                      ⋮ ⋱
                                                                                                                          14 columns and 117333 rows omitted
julia> a.Process
117364-element LazyBranch{SubArray{Char, 1, Vector{Char}, Tuple{UnitRange{Int64}}, true}, UnROOT.Nooffsetjagg, ArraysOfArrays.VectorOfVectors{Char, Vector{Char}, Vector{Int32}, Vector{Tuple{}}}}:
 ['\x0f', 'R', 'a', 'd', 'i', 'o', 'a', 'c', 't', 'i', 'v', 'a', 't', 'i', 'o', 'n']
 ['\a', 'i', 'o', 'n', 'I', 'o', 'n', 'i']
 ['\a', 'i', 'o', 'n', 'I', 'o', 'n', 'i']
 ['\x0f', 'R', 'a', 'd', 'i', 'o', 'a', 'c', 't', 'i', 'v', 'a', 't', 'i', 'o', 'n']
 ...
julia> a.Particle
117364-element LazyBranch{SubArray{Char, 1, Vector{Char}, Tuple{UnitRange{Int64}}, true}, UnROOT.Nooffsetjagg, ArraysOfArrays.VectorOfVectors{Char, Vector{Char}, Vector{Int32}, Vector{Tuple{}}}}:
 ['\x04', 'U', '2', '3', '8']
 ['\x05', 'a', 'l', 'p', 'h', 'a']
 ['\x05', 'T', 'h', '2', '3', '4']
 ['\x05', 'T', 'h', '2', '3', '4']
 ['\x02', 'e', '-']
 ['\t', 'a', 'n', 't', 'i', '_', 'n', 'u', '_', 'e']
 ...
I think somehow this is still wrong, but not sure how to fix without knowing what the data should be.
I'm not sure whether the question was directed at me, or whether I understood it correctly, but I'll reply just in case. Normally each row of the LazyTree should correspond to a Geant4 interaction, including the particle that interacted, the process involved, and so on. As such, I would expect the first row of the Particle column to be reconstructed as "U238", the first row of the Process column as "Radioactivation", and so on.
Can you show what is expected for each column? I think we may have an offset issue, or the string length encoding is mixed up.
Converting the ROOT file to CSV with uproot and reading it with CSV.jl gives the format I would expect:
julia> df = CSV.read(datadir("sims/csv/", "ss_U238_p=10bar_Rmin=292mm_Rmax=300mm.csv"), DataFrame);
julia> df[!, [:entry, :fEvent, :Particle, :Process, :Edep, :preVolume, :postVolume, :ParentID, :TrackID, :Charge, :MeanLife, :preKE, :postKE]]
117364×13 DataFrame
Row │ entry fEvent Particle Process Edep preVolume postVolume ParentID TrackID Charge MeanLife preKE postKE
│ Int64 Int64 String15 String15 Float64 String15 String15 Int64 Int64 Float64 Float64 Float64 Float64
────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 0 992 U238 Radioactivation 0.0 SPCMat OutOfWorld 0 1 92.0 2.03419e26 0.0 0.0
2 │ 1 992 alpha ionIoni 4.19792 SPCMat OutOfWorld 1 3 2.0 -1.0 4.19792 0.0
3 │ 2 992 Th234 ionIoni 0.0718283 SPCMat OutOfWorld 1 2 90.0 3.00404e15 0.0718283 0.0
4 │ 3 992 Th234 Radioactivation 0.0 SPCMat OutOfWorld 1 2 90.0 3.00404e15 0.0 0.0
5 │ 4 992 e- eIoni 0.0252758 SPCMat OutOfWorld 2 6 -1.0 -1.0 0.0252758 0.0
6 │ 5 992 anti_nu_e Transportation 0.0 SPCMat World 2 5 0.0 -1.0 0.0820941 0.0820941
7 │ 6 992 anti_nu_e Transportation 0.0 World OutOfWorld 2 5 0.0 -1.0 0.0820941 0.0820941
8 │ 7 992 Pa234[166.720X] Radioactivation 1.3446e-7 SPCMat OutOfWorld 2 4 91.0 0.0 1.3446e-7 0.0
9 │ 8 992 e- eIoni 0.0429859 SPCMat OutOfWorld 4 8 -1.0 -1.0 0.0429859 0.0
10 │ 9 992 Pa234[103.420X] Radioactivation 1.05007e-7 SPCMat OutOfWorld 4 7 91.0 0.0 1.05007e-7 0.0
11 │ 10 992 e- eIoni 0.012767 SPCMat OutOfWorld 7 10 -1.0 -1.0 0.012767 0.0
12 │ 11 992 Pa234[73.920X] ionIoni 3.02971e-8 SPCMat OutOfWorld 7 9 91.0 1.00325e11 3.02971e-8 0.0
13 │ 12 992 Pa234[73.920X] Radioactivation 0.0 SPCMat OutOfWorld 7 9 91.0 1.00325e11 0.0 0.0
14 │ 13 992 e- eIoni 0.0917063 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.63544 0.543734
15 │ 14 992 e- eIoni 0.117574 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.543734 0.42616
16 │ 15 992 e- eIoni 0.0677698 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.42616 0.35839
17 │ 16 992 e- eIoni 0.0749069 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.35839 0.283484
18 │ 17 992 e- eIoni 0.042675 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.283484 0.240809
19 │ 18 992 e- eIoni 0.128286 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.240809 0.112523
20 │ 19 992 e- eIoni 0.0383723 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.112523 0.0741505
21 │ 20 992 e- eIoni 0.0475743 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.0741505 0.0265762
22 │ 21 992 e- eIoni 0.0265762 SPCMat OutOfWorld 9 13 -1.0 -1.0 0.0265762 0.0
23 │ 22 992 anti_nu_e Transportation 0.0 SPCMat World 9 12 0.0 -1.0 1.632 1.632
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
117343 │ 117342 991 e- eIoni 0.00652197 SPCMat OutOfWorld 36 40 -1.0 -1.0 0.00652197 0.0
117344 │ 117343 991 anti_nu_e Transportation 0.0 SPCMat World 36 39 0.0 -1.0 0.010425 0.010425
117345 │ 117344 991 anti_nu_e Transportation 0.0 World OutOfWorld 36 39 0.0 -1.0 0.010425 0.010425
117346 │ 117345 991 Bi210[46.539] ionIoni 1.53959e-8 SPCMat OutOfWorld 36 38 83.0 4.32808 1.53959e-8 0.0
117347 │ 117346 991 Bi210[46.539] Radioactivation 0.0 SPCMat OutOfWorld 36 38 83.0 4.32808 0.0 0.0
117348 │ 117347 991 gamma phot 0.005989 SPCMat OutOfWorld 38 42 0.0 -1.0 0.046539 0.0
117349 │ 117348 991 e- eIoni 0.04055 SPCMat OutOfWorld 42 43 -1.0 -1.0 0.04055 0.0
117350 │ 117349 991 Bi210 ionIoni 5.55883e-9 SPCMat OutOfWorld 38 41 83.0 6.2474e14 5.55883e-9 0.0
117351 │ 117350 991 Bi210 Radioactivation 0.0 SPCMat OutOfWorld 38 41 83.0 6.2474e14 0.0 0.0
117352 │ 117351 991 e- eIoni 0.0364211 SPCMat OutOfWorld 41 46 -1.0 -1.0 0.227227 0.190806
117353 │ 117352 991 e- eIoni 0.0425288 SPCMat OutOfWorld 41 46 -1.0 -1.0 0.190806 0.148277
117354 │ 117353 991 e- eIoni 0.0434705 SPCMat OutOfWorld 41 46 -1.0 -1.0 0.148277 0.104807
117355 │ 117354 991 e- eIoni 0.0417644 SPCMat OutOfWorld 41 46 -1.0 -1.0 0.104807 0.0630426
117356 │ 117355 991 e- eIoni 0.051797 SPCMat OutOfWorld 41 46 -1.0 -1.0 0.0630426 0.0112456
117357 │ 117356 991 e- eIoni 0.0112456 SPCMat OutOfWorld 41 46 -1.0 -1.0 0.0112456 0.0
117358 │ 117357 991 anti_nu_e Transportation 0.0 SPCMat World 41 45 0.0 -1.0 0.933999 0.933999
117359 │ 117358 991 anti_nu_e Transportation 0.0 World OutOfWorld 41 45 0.0 -1.0 0.933999 0.933999
117360 │ 117359 991 Po210 ionIoni 1.29035e-6 SPCMat OutOfWorld 41 44 84.0 1.72484e16 1.29035e-6 0.0
117361 │ 117360 991 Po210 Radioactivation 0.0 SPCMat OutOfWorld 41 44 84.0 1.72484e16 0.0 0.0
117362 │ 117361 991 alpha ionIoni 5.30431 SPCMat OutOfWorld 44 48 2.0 -1.0 5.30431 0.0
117363 │ 117362 991 Pb206 ionIoni 0.103143 SPCMat OutOfWorld 44 47 82.0 -1.0 0.103143 0.0
117364 │ 117363 991 Pb206 NoProcess 0.0 SPCMat OutOfWorld 44 47 82.0 -1.0 0.0 0.0
ohh
I see, it works like this:
julia> Int('\x0f')
15
julia> ['R', 'a', 'd', 'i', 'o', 'a', 'c', 't', 'i', 'v', 'a', 't', 'i', 'o', 'n']
15-element Vector{Char}:
Yeah, OK, so I'm currently splitting the Vector{Char} correctly, but each inner vector is actually a length byte followed by the characters.
Yep, exactly!
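For readers following along, here is a minimal sketch of that observation: the first element of each inner vector is the character count, and the rest is the string itself. The helper name `decode_prefixed` is hypothetical and is not part of UnROOT; the sketch also assumes the length fits in that single leading element (below 255), which holds for the values shown in this thread.

```julia
# Hypothetical helper (not UnROOT API): turn a length-prefixed Vector{Char}
# into a String. Assumes the first element alone encodes the length.
function decode_prefixed(chars::AbstractVector{Char})
    isempty(chars) && return ""
    n = Int(chars[1])                      # e.g. Int('\x0f') == 15
    if n != length(chars) - 1
        throw(ArgumentError("length prefix $n does not match $(length(chars) - 1) characters"))
    end
    return join(chars[2:end])              # drop the prefix, keep the text
end

# Values copied from the a.Particle / a.Process output earlier in the thread.
particle = ['\x04', 'U', '2', '3', '8']
process  = ['\x0f', 'R', 'a', 'd', 'i', 'o', 'a', 'c', 't', 'i', 'v', 'a', 't', 'i', 'o', 'n']

println(decode_prefixed(particle))
println(decode_prefixed(process))
```

The length check is only there to catch a mis-split vector early; it is not meant to describe how the actual fix is implemented.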
@GSavvidis this should be fixed now:
julia> a.Particle
117364-element LazyBranch{String, UnROOT.Nojagg, Vector{String}}:
"U238"
"alpha"
"Th234"
"Th234"
"e-"
"anti_nu_e"
"anti_nu_e"
"Pa234[166.720X]"
"e-"
"Pa234[103.420X]"
"e-"
"Pa234[73.920X]"
It's not super efficient, but it should at least work. Let us know if performance with these Strings becomes an issue.
Incredible! Thanks a lot! Will let you know
Sorry for coming late to the party. @Moelf the structure is similar to TString. The first byte is the length and if it's 255, read the next byte and add it to the expected length (repeat until the next byte is less than 255).
Yeah, I dug out the readtype() and reuse that function now
🙂 should we go for a test file?
the file provided in the comment above is too large (9.8M), still waiting for confirmation that this is fine (usability and performance)
Ah sorry ok :)
I have some time constraints, so I might need a couple of days to set up my code. In the meantime, it would take me 5 minutes to get you a file of 1M (or even smaller) for testing, if you want.
no rush -- feel free to make a smaller file (maybe with 10 events is enough) whenever you've tested your actual use case.
sounds good! Thanks
@Moelf Hello, apologies for the delay. I prepared a small file for testing here. I ran my code on the actual files (> 20GB) with Threads.@threads and didn't notice any performance problems, but I didn't do any proper testing. Here's the script I used: UnROOT_script.txt
thanks for the smaller test file! so you're saying for your application the current performance is "good enough"?
Yes. I had to loop through files of more than 20 GB in size, and with multi-threading I needed about 160 s (18 threads). I didn't see any obvious slow-down.
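As a rough illustration of the pattern described here (not the contents of UnROOT_script.txt, which is not reproduced in this thread), a per-file threaded loop could look like the sketch below. `process_all` and the `process_file` argument are hypothetical names; `process_file` stands in for whatever opens one ROOT file and reduces it to a small result.

```julia
using Base.Threads: @threads

# Hypothetical driver: apply `process_file` to every path, one file per
# loop iteration, spread across the available threads.
function process_all(process_file, paths::AbstractVector{<:AbstractString})
    results = Vector{Any}(undef, length(paths))
    @threads for i in eachindex(paths)
        results[i] = process_file(paths[i])   # each iteration writes only its own slot
    end
    return results
end

# Stand-in workload so the sketch runs without any ROOT files:
# the "result" for each file is just the length of its name.
fake_paths = ["sim_$(i).root" for i in 1:8]
lengths = process_all(length, fake_paths)
println(lengths)
```

Threads only help if Julia is started with more than one, for example `julia --threads 18`. Writing into a preallocated slot per index avoids any locking around the results.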
fix: https://discourse.julialang.org/t/problem-reading-root-branch-containing-strings-with-unroot-jl/115671/5