Open lpetre-ulb opened 5 years ago
One drawback being that
root-numpy
is not actively maintained and might not quickly (if at all) support new releases ofROOT
.
I don't think this matters, as installing root-numpy
does a compilation of any necessary dependencies against the installed version of ROOT
, so if the base ROOT
version has the fix, that's all that should matter.
There was another issue (@bdorney can maybe recall the number) where a proposal was made for a change at the core (e.g., when the read from HW actually happens), and where I suggested implementing the workaround at the highest possible level, which I guess would be in any piece of code that is reading a stored chipID (i.e., not from hardware) and then doing an operation with it (DB lookup). At this level, it is completely safe to mask off the 16 MSBs (properly decoded values will always have only the 16 LSBs set, and bit-flips on non-RM encoded values don't have reliable values anyway, so masking off the MSB or even the 16 MSBs won't break things any more than they already would be)
I have a version of root_numpy
ready to work with root-6-18
, so we can consider trying to fully test the code base against this release.
If it really resolves this issue, and doesn't introduce any additional major problems, i think we could make the decision to migrate production machines to 6.18 as the stated supported version
(these updated packages are installed on gem904daq02
)
The bug is still present in the patch version 00
of root-6.18
, but fixed starting from 6.18.02
.
At ULB, we are successfully using root-6.18.04-1.el7
since weeks (months?). Moreover we exclusively use VFAT3 hybrid v2 with Reed-Muller encoded chipID, so I think this issue can be considered resolved with a newer version of ROOT.
However, I've never tried the gem-light-dqm
which is probably the most sensitive piece of software regarding ROOT upgrades...
OK, in production the light DQM is running on a dedicated machine, independently of any of the DAQ machines (in my ideal world, our DAQ SW would drop all dependencies on ROOT
altogether, but this would be a longer term restructuring...)
To proceed, I think we should consider upgrading one integration machine and then one QC7/8 machine, (downgrading ROOT RPMs is a PITA). Following the success of this, we can migrate the rest of the machines
in my ideal world, our DAQ SW would drop all dependencies on
ROOT
altogether, but this would be a longer term restructuring...
:100:
I guess that if ROOT compatibility is required, uproot
would work very well. However, if there is no need for ROOT compatibility, HDF5 is maybe more compatible. And for the analysis and plotting, numpy
, scipy
and pyplot
are probably more than enough.
To proceed, I think we should consider upgrading one integration machine and then one QC7/8 machine, (downgrading ROOT RPMs is a PITA). Following the success of this, we can migrate the rest of the machines
:+1: I don't foresee any compatibility issue, but the sotware broke too many times these last few days...
(This is a follow-up on https://github.com/cms-gem-daq-project/cmsgemos/issues/297)
Brief summary of issue
When analysing scans (SCurves, but possibly others) with non Reed-Muller encoded chipID's, the analysis routine can fail with the following error:
OverflowError: unsigned long is less than minimum
.Types of issue
Expected Behavior
SCurves analysis can fail with non Reed-Muller encoded chipID.
Current Behavior
SCurves analysis should perform seamlessly even with non Reed-Muller encoded chipID.
Steps to Reproduce (for bugs)
anaUltraScurve.py
on a scan with non Reed-Muller encoded chipID's.1
for at least one VFAT.Possible Solution (for bugs)
The root issue comes from the PyROOT type converters, more precisely a hack for the
enum
types. This is going to be fixed in ROOT 6.19.Indeed,
UInt_t
(which is thevfatID
branch type) are currently interpreted as signed integers and therefore can represent a negative number if the MSB is set to1
.Applying this commit on top of ROOT 6.18 (with its dependent commit and a missing include) fixes the issue. Patched RPM's and SRPM can be found here.
Really fixing the issue requires to either deploy a patched version of ROOT or wait for ROOT 6.19. One drawback being that
root-numpy
is not actively maintained and might not quickly (if at all) support new releases of ROOT.One hack could be to add
2**32
to negative numbers when theTBranch
is supposed to contain 32-bits unsigned integers. This would have to be done everywhere... Another hack could be to store the chipID's as 64-bits unsigned intergers (l
) so that the conversion should work as expected.In the end, I'm not really sure how to best fix the issue. In the meantime, we are working with the patched ROOT version at ULB.
Your Environment