JuliaHEP / UnROOT.jl

Native Julia I/O package to work with CERN ROOT files objects (TTree and RNTuple)
https://juliahep.github.io/UnROOT.jl/
MIT License

fix reading `TLeafC` #342

Closed Moelf closed 4 months ago

Moelf commented 4 months ago

fix: https://discourse.julialang.org/t/problem-reading-root-branch-containing-strings-with-unroot-jl/115671/5

julia> a = LazyTree("./ss_U238_p=10bar_Rmin=292mm_Rmax=300mm.root", "G4Sim");

julia> a
 Row │ postZ                postTime               preKE                  preX                MeanLife     localTime               Edep                   Trac ⋯
     │ Float64              Float64                Float64                Float64             Float64      Float64                 Float64                Int3 ⋯
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 1   │ -102.05345950608458  3.0685201317363455e26  0.0                    49.73353991915856   2.034191e26  3.0685201317363455e26   0.0                    1    ⋯
 2   │ -102.04651578986307  3.0685201317363455e26  4.1979216899649145     49.73353991915856   -1.0         0.0005457291372516042   4.1979216899649145     3    ⋯
 3   │ -102.05346001428659  3.0685201317363455e26  0.07182831002864987    49.73353991915856   3.004037e15  2.3330065525051903e-6   0.07182831002864987    2    ⋯
 4   │ -102.05346001428659  3.0685201317567714e26  0.0                    49.733540045379584  3.004037e15  2.0425886986503205e15   0.0                    2    ⋯
 5   │ -102.0539008556144   3.0685201317567714e26  0.025275776502425962   49.733540045379584  -1.0         5.298457306211964e-6    0.025275776502425962   6    ⋯
 6   │ -106.84346294108786  3.0685201317567714e26  0.08209409297571678    49.733540045379584  -1.0         0.02146110939511165     0.0                    5    ⋯
 7   │ -2000.0              3.0685201317567714e26  0.08209409297571678    47.430551380285756  -1.0         8.503551864822667       0.0                    5    ⋯
 8   │ -102.05346001428659  3.0685201317567714e26  1.344596967101097e-7   49.733540045379584  0.0          7.878121781487446e-305  1.344596967101097e-7   4    ⋯
 9   │ -102.05302086535103  3.0685201317567714e26  0.04298589500133121    49.733540045379584  -1.0         1.041804810150761e-5    0.04298589500133121    8    ⋯
 10  │ -102.05346001428659  3.0685201317567714e26  1.050066202878952e-7   49.733540045379584  0.0          7.741051586449725e-306  1.050066202878952e-7   7    ⋯
 11  │ -102.05337759223123  3.0685201317567714e26  0.012766969704394504   49.733540045379584  -1.0         2.2819678030339488e-6   0.012766969704394504   10   ⋯
 12  │ -102.05346003677495  3.0685201317567714e26  3.0297087505459785e-8  49.733540045379584  1.00325e11   0.0002591365343473467   3.0297087505459785e-8  9    ⋯
 13  │ -102.05346003677495  3.0685201317567714e26  0.0                    49.73354002365522   1.00325e11   1.414757510502053e10    0.0                    9    ⋯
 14  │ -102.0850983567464   3.0685201317567714e26  0.6354402172300221     49.73354002365522   -1.0         0.0002512298153439289   0.09170627685882858    13   ⋯
 15  │ -102.09325779364904  3.0685201317567714e26  0.5437339403711935     49.679954554654216  -1.0         0.00046216017918337645  0.11757367433007723    13   ⋯
 16  │ -102.07580116793557  3.0685201317567714e26  0.42616026604111623    49.63375029065793   -1.0         0.000623815479346648    0.06776978396433882    13   ⋯
 17  │ -102.08504063006555  3.0685201317567714e26  0.35839048207677743    49.59869675442205   -1.0         0.0007576498289719425   0.07490690583766167    13   ⋯
 18  │ -102.07678505845898  3.0685201317567714e26  0.28348357623911574    49.62889262762954   -1.0         0.000861614254051986    0.042674996475252816   13   ⋯
 19  │ -102.09451113711731  3.0685201317567714e26  0.24080857976386294    49.65041254089083   -1.0         0.000948895940544778    0.1282858215721039     13   ⋯
 20  │ -102.09189861553278  3.0685201317567714e26  0.11252275819175905    49.6500709917623    -1.0         0.000985338653087699    0.03837228992128173    13   ⋯
 21  │ -102.08989108494839  3.0685201317567714e26  0.07415046827047733    49.646863130739476  -1.0         0.0010065402178248463   0.047574254567196175   13   ⋯
 22  │ -102.09023049871139  3.0685201317567714e26  0.026576213703281157   49.645250682518146  -1.0         0.0010121881336086595   0.026576213703281157   13   ⋯
 23  │ -103.57478100532907  3.0685201317567714e26  1.6320048207520537     49.73354002365522   -1.0         0.01572578470351275     0.0                    12   ⋯
 24  │ -730.8722678518567   3.0685201317567714e26  1.6320048207520537     51.29807043849364   -1.0         6.500054487921658       0.0                    12   ⋯
 25  │ -102.0534596801017   3.0685201317567714e26  5.041423719376326e-6   49.73354002365522   1.117712e22  0.0002572019758040831   5.041423719376326e-6   11   ⋯
 26  │ -102.0534596801017   3.0685363408853736e26  0.0                    49.73354012060899   1.117712e22  1.6209128602251994e21   0.0                    11   ⋯
 27  │ -102.04642108042141  3.0685363408853736e26  4.7745444062238676     49.73354012060899   -1.0         0.000606515507465791    4.7745444062238676     15   ⋯
 28  │ -102.05346024604968  3.0685363408853736e26  0.08312559378100559    49.73354012060899   3.432811e21  2.7993134822468136e-6   0.08312559378100559    14   ⋯
 29  │ -102.05346024604968  3.0685467785417935e26  0.0                    49.73354048515049   3.432811e21  1.0437656419799024e21   0.0                    14   ⋯
 30  │ -102.0578740887279   3.0685467785417935e26  4.686717724997379      49.73354048515049   -1.0         0.0005971372135061732   4.686717724997379      17   ⋯
 31  │ -102.05345986634265  3.0685467785417935e26  0.08304227498592809    49.73354048515049   7.284479e19  2.8973621866059515e-6   0.08304227498592809    16   ⋯
  ⋮  │          ⋮                     ⋮                      ⋮                    ⋮                ⋮                 ⋮                       ⋮               ⋮ ⋱
                                                                                                                              14 columns and 117333 rows omitted

julia> a.Process
117364-element LazyBranch{SubArray{Char, 1, Vector{Char}, Tuple{UnitRange{Int64}}, true}, UnROOT.Nooffsetjagg, ArraysOfArrays.VectorOfVectors{Char, Vector{Char}, Vector{Int32}, Vector{Tuple{}}}}:
 ['\x0f', 'R', 'a', 'd', 'i', 'o', 'a', 'c', 't', 'i', 'v', 'a', 't', 'i', 'o', 'n']
 ['\a', 'i', 'o', 'n', 'I', 'o', 'n', 'i']
 ['\a', 'i', 'o', 'n', 'I', 'o', 'n', 'i']
 ['\x0f', 'R', 'a', 'd', 'i', 'o', 'a', 'c', 't', 'i', 'v', 'a', 't', 'i', 'o', 'n']
...

julia> a.Particle
117364-element LazyBranch{SubArray{Char, 1, Vector{Char}, Tuple{UnitRange{Int64}}, true}, UnROOT.Nooffsetjagg, ArraysOfArrays.VectorOfVectors{Char, Vector{Char}, Vector{Int32}, Vector{Tuple{}}}}:
 ['\x04', 'U', '2', '3', '8']
 ['\x05', 'a', 'l', 'p', 'h', 'a']
 ['\x05', 'T', 'h', '2', '3', '4']
 ['\x05', 'T', 'h', '2', '3', '4']
 ['\x02', 'e', '-']
 ['\t', 'a', 'n', 't', 'i', '_', 'n', 'u', '_', 'e']
...

I think somehow this is still wrong, but not sure how to fix without knowing what the data should be.

codecov[bot] commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 84.27%. Comparing base (0196924) to head (16eb431).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #342      +/-   ##
==========================================
+ Coverage   84.23%   84.27%   +0.03%
==========================================
  Files          19       19
  Lines        2557     2563       +6
==========================================
+ Hits         2154     2160       +6
  Misses        403      403
```


GSavvidis commented 4 months ago

Not sure if the question was directed at me or if I understood it correctly, but I'm replying just in case. Normally each row of the LazyTree should correspond to a Geant4 interaction, including the particle that interacted, the process involved, and so on. As such, I would expect the first row of the Particle column to be reconstructed as "U238", the first row of the Process column as "Radioactivation", and so on.

Moelf commented 4 months ago

Can you show what is expected for each column? I think maybe we have some offset issues or the string length encoding is mixed up.

GSavvidis commented 4 months ago

After converting the ROOT file to CSV with uproot and reading it with CSV.jl, this is the format I would expect:

julia> df = CSV.read(datadir("sims/csv/", "ss_U238_p=10bar_Rmin=292mm_Rmax=300mm.csv"), DataFrame);

julia> df[!, [:entry, :fEvent, :Particle, :Process, :Edep, :preVolume, :postVolume, :ParentID, :TrackID, :Charge, :MeanLife, :preKE, :postKE]]
117364×13 DataFrame
    Row │ entry   fEvent  Particle         Process          Edep        preVolume  postVolume  ParentID  TrackID  Charge   MeanLife     preKE       postKE    
        │ Int64   Int64   String15         String15         Float64     String15   String15    Int64     Int64    Float64  Float64      Float64     Float64   
────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      1 │      0     992  U238             Radioactivation  0.0         SPCMat     OutOfWorld         0        1     92.0   2.03419e26  0.0         0.0
      2 │      1     992  alpha            ionIoni          4.19792     SPCMat     OutOfWorld         1        3      2.0  -1.0         4.19792     0.0
      3 │      2     992  Th234            ionIoni          0.0718283   SPCMat     OutOfWorld         1        2     90.0   3.00404e15  0.0718283   0.0
      4 │      3     992  Th234            Radioactivation  0.0         SPCMat     OutOfWorld         1        2     90.0   3.00404e15  0.0         0.0
      5 │      4     992  e-               eIoni            0.0252758   SPCMat     OutOfWorld         2        6     -1.0  -1.0         0.0252758   0.0
      6 │      5     992  anti_nu_e        Transportation   0.0         SPCMat     World              2        5      0.0  -1.0         0.0820941   0.0820941
      7 │      6     992  anti_nu_e        Transportation   0.0         World      OutOfWorld         2        5      0.0  -1.0         0.0820941   0.0820941
      8 │      7     992  Pa234[166.720X]  Radioactivation  1.3446e-7   SPCMat     OutOfWorld         2        4     91.0   0.0         1.3446e-7   0.0
      9 │      8     992  e-               eIoni            0.0429859   SPCMat     OutOfWorld         4        8     -1.0  -1.0         0.0429859   0.0
     10 │      9     992  Pa234[103.420X]  Radioactivation  1.05007e-7  SPCMat     OutOfWorld         4        7     91.0   0.0         1.05007e-7  0.0
     11 │     10     992  e-               eIoni            0.012767    SPCMat     OutOfWorld         7       10     -1.0  -1.0         0.012767    0.0
     12 │     11     992  Pa234[73.920X]   ionIoni          3.02971e-8  SPCMat     OutOfWorld         7        9     91.0   1.00325e11  3.02971e-8  0.0
     13 │     12     992  Pa234[73.920X]   Radioactivation  0.0         SPCMat     OutOfWorld         7        9     91.0   1.00325e11  0.0         0.0
     14 │     13     992  e-               eIoni            0.0917063   SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.63544     0.543734
     15 │     14     992  e-               eIoni            0.117574    SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.543734    0.42616
     16 │     15     992  e-               eIoni            0.0677698   SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.42616     0.35839
     17 │     16     992  e-               eIoni            0.0749069   SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.35839     0.283484
     18 │     17     992  e-               eIoni            0.042675    SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.283484    0.240809
     19 │     18     992  e-               eIoni            0.128286    SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.240809    0.112523
     20 │     19     992  e-               eIoni            0.0383723   SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.112523    0.0741505
     21 │     20     992  e-               eIoni            0.0475743   SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.0741505   0.0265762
     22 │     21     992  e-               eIoni            0.0265762   SPCMat     OutOfWorld         9       13     -1.0  -1.0         0.0265762   0.0
     23 │     22     992  anti_nu_e        Transportation   0.0         SPCMat     World              9       12      0.0  -1.0         1.632       1.632
   ⋮    │   ⋮       ⋮            ⋮                ⋮             ⋮           ⋮          ⋮          ⋮         ⋮        ⋮          ⋮           ⋮           ⋮
 117343 │ 117342     991  e-               eIoni            0.00652197  SPCMat     OutOfWorld        36       40     -1.0  -1.0         0.00652197  0.0
 117344 │ 117343     991  anti_nu_e        Transportation   0.0         SPCMat     World             36       39      0.0  -1.0         0.010425    0.010425
 117345 │ 117344     991  anti_nu_e        Transportation   0.0         World      OutOfWorld        36       39      0.0  -1.0         0.010425    0.010425
 117346 │ 117345     991  Bi210[46.539]    ionIoni          1.53959e-8  SPCMat     OutOfWorld        36       38     83.0   4.32808     1.53959e-8  0.0
 117347 │ 117346     991  Bi210[46.539]    Radioactivation  0.0         SPCMat     OutOfWorld        36       38     83.0   4.32808     0.0         0.0
 117348 │ 117347     991  gamma            phot             0.005989    SPCMat     OutOfWorld        38       42      0.0  -1.0         0.046539    0.0
 117349 │ 117348     991  e-               eIoni            0.04055     SPCMat     OutOfWorld        42       43     -1.0  -1.0         0.04055     0.0
 117350 │ 117349     991  Bi210            ionIoni          5.55883e-9  SPCMat     OutOfWorld        38       41     83.0   6.2474e14   5.55883e-9  0.0
 117351 │ 117350     991  Bi210            Radioactivation  0.0         SPCMat     OutOfWorld        38       41     83.0   6.2474e14   0.0         0.0
 117352 │ 117351     991  e-               eIoni            0.0364211   SPCMat     OutOfWorld        41       46     -1.0  -1.0         0.227227    0.190806
 117353 │ 117352     991  e-               eIoni            0.0425288   SPCMat     OutOfWorld        41       46     -1.0  -1.0         0.190806    0.148277
 117354 │ 117353     991  e-               eIoni            0.0434705   SPCMat     OutOfWorld        41       46     -1.0  -1.0         0.148277    0.104807
 117355 │ 117354     991  e-               eIoni            0.0417644   SPCMat     OutOfWorld        41       46     -1.0  -1.0         0.104807    0.0630426
 117356 │ 117355     991  e-               eIoni            0.051797    SPCMat     OutOfWorld        41       46     -1.0  -1.0         0.0630426   0.0112456
 117357 │ 117356     991  e-               eIoni            0.0112456   SPCMat     OutOfWorld        41       46     -1.0  -1.0         0.0112456   0.0
 117358 │ 117357     991  anti_nu_e        Transportation   0.0         SPCMat     World             41       45      0.0  -1.0         0.933999    0.933999
 117359 │ 117358     991  anti_nu_e        Transportation   0.0         World      OutOfWorld        41       45      0.0  -1.0         0.933999    0.933999
 117360 │ 117359     991  Po210            ionIoni          1.29035e-6  SPCMat     OutOfWorld        41       44     84.0   1.72484e16  1.29035e-6  0.0
 117361 │ 117360     991  Po210            Radioactivation  0.0         SPCMat     OutOfWorld        41       44     84.0   1.72484e16  0.0         0.0
 117362 │ 117361     991  alpha            ionIoni          5.30431     SPCMat     OutOfWorld        44       48      2.0  -1.0         5.30431     0.0
 117363 │ 117362     991  Pb206            ionIoni          0.103143    SPCMat     OutOfWorld        44       47     82.0  -1.0         0.103143    0.0
 117364 │ 117363     991  Pb206            NoProcess        0.0         SPCMat     OutOfWorld        44       47     82.0  -1.0         0.0         0.0

Moelf commented 4 months ago

ohh

I see, it works like this:

julia> Int('\x0f')
15

julia> ['R', 'a', 'd', 'i', 'o', 'a', 'c', 't', 'i', 'v', 'a', 't', 'i', 'o', 'n']
15-element Vector{Char}:

yeah... ok, so I'm currently splitting the Vector{Char} correctly, but each inner vector is actually a length byte followed by the characters
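
To make the observation concrete, here is a minimal sketch of turning one of those inner `Vector{Char}` values into a `String` by treating the first element as the length. The helper name `strip_length_prefix` is hypothetical and is not part of UnROOT; it only illustrates the layout seen in the REPL output above.

```julia
# Hypothetical helper for illustration only; not UnROOT's implementation.
# Assumes the first Char encodes the number of characters that follow.
function strip_length_prefix(chars::AbstractVector{Char})
    n = Int(chars[1])                 # e.g. Int('\x04') == 4
    n == length(chars) - 1 || error("length prefix does not match payload")
    return join(chars[2:end])         # concatenate the remaining Chars
end

strip_length_prefix(['\x04', 'U', '2', '3', '8'])
strip_length_prefix(['\a', 'i', 'o', 'n', 'I', 'o', 'n', 'i'])
```

The length check is a design choice for the sketch: it fails loudly if an entry was split at the wrong offset instead of silently returning a truncated string.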

GSavvidis commented 4 months ago

Yep, exactly!

Moelf commented 4 months ago

@GSavvidis this should be fixed now:

julia> a.Particle
117364-element LazyBranch{String, UnROOT.Nojagg, Vector{String}}:
 "U238"
 "alpha"
 "Th234"
 "Th234"
 "e-"
 "anti_nu_e"
 "anti_nu_e"
 "Pa234[166.720X]"
 "e-"
 "Pa234[103.420X]"
 "e-"
 "Pa234[73.920X]"

it's not super efficient but this should at least work. Let us know if performance with these Strings becomes an issue

GSavvidis commented 4 months ago

Incredible! Thanks a lot! Will let you know

tamasgal commented 4 months ago

Sorry for coming late to the party. @Moelf the structure is similar to TString. The first byte is the length; if it's 255, the actual length is stored in the following 4 bytes as a big-endian 32-bit integer.
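
A minimal sketch of reading such a length-prefixed string from a byte stream, assuming the ROOT TString convention as I understand it: one length byte, where a value of 255 signals that the real length follows as a 4-byte big-endian integer. The function name `read_prefixed_string` is hypothetical; this is not UnROOT's `readtype()`.

```julia
# Hypothetical helper for illustration only; not UnROOT's readtype().
# Assumption: a first byte of 255 is a marker, and the real length then
# follows as a big-endian UInt32.
function read_prefixed_string(io::IO)
    n = Int(read(io, UInt8))
    if n == 255
        n = Int(ntoh(read(io, UInt32)))  # convert from big-endian to host order
    end
    return String(read(io, n))           # read exactly n payload bytes
end

# 0x04 followed by the bytes of "U238"
io = IOBuffer(UInt8[0x04, 0x55, 0x32, 0x33, 0x38])
read_prefixed_string(io)
```

For the short strings in this file (particle and process names) only the single-byte branch is exercised.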

Moelf commented 4 months ago

Yeah, I dug out the readtype() and reuse that function now

tamasgal commented 4 months ago

🙂 should we go for a test file?

Moelf commented 4 months ago

the file provided in the comment above is too large (9.8M), still waiting for confirmation that this is fine (usability and performance)

tamasgal commented 4 months ago

Ah sorry ok :)

GSavvidis commented 4 months ago

I have some time constraints, so I might need a couple of days to set up my code. In the meantime, it will take me 5 minutes to get you a file of 1 MB (or even smaller) for testing if you want.

Moelf commented 4 months ago

no rush -- feel free to make a smaller file (maybe 10 events is enough) whenever you've tested your actual use case.

GSavvidis commented 4 months ago

sounds good! Thanks

GSavvidis commented 4 months ago

@Moelf Hello, apologies for the delay. I prepared a small file for testing here. I ran my code on the actual files (> 20 GB) with Threads.@threads and didn't notice any performance problems, but I didn't do proper testing. Here's the script I used: UnROOT_script.txt

Moelf commented 4 months ago

thanks for the smaller test file! so you're saying for your application the current performance is "good enough"?

GSavvidis commented 4 months ago

yes. I had to loop through files of more than 20 GB in size, and with multi-threading I needed about 160 s (18 threads). I didn't see any obvious slow-down.
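
The attached script is not reproduced in this thread, so the following is only a rough sketch of the general pattern of splitting work across threads with `Threads.@threads`. The names `count_matching` and `process_all` are hypothetical, and plain string vectors stand in for the decoded branches; in a real script each input would presumably come from something like `LazyTree(path, "G4Sim")`.

```julia
using Base.Threads

# Hypothetical per-input worker: count entries equal to `target`.
count_matching(strings, target) = count(==(target), strings)

# Process every input on its own loop iteration; each iteration writes
# to a distinct slot, so no locking is needed.
function process_all(inputs, target)
    results = Vector{Int}(undef, length(inputs))
    @threads for i in eachindex(inputs)
        results[i] = count_matching(inputs[i], target)
    end
    return sum(results)
end

process_all([["e-", "alpha"], ["e-", "e-", "gamma"]], "e-")
```

Start Julia with `julia -t 18` (or set `JULIA_NUM_THREADS`) to actually use multiple threads; with a single thread the loop still runs, just serially.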