scikit-hep / uproot3-methods

Pythonic behaviors for non-I/O related ROOT classes.
BSD 3-Clause "New" or "Revised" License
21 stars 28 forks source link

Add TParameter class #81

Closed beojan closed 4 years ago

beojan commented 4 years ago

Add ability to read TParameters without resorting to peaking at private members.

jpivarski commented 4 years ago

Before accepting this, we'll have to add logic to hasmethods that equates "TParameter" with "TParameter_3c_.*_3e_". We'd want a class name without template specialization to count as the same class name with any template specialization (_3c_ is mangled < and _3e_ is mangled >; anything not [A-Za-z0-9] are hex-ascii between underscores).

It would go here:

https://github.com/scikit-hep/uproot-methods/blob/2e2089748d4ece2a573cd6fa8cb84782ebb8cb15/uproot_methods/classes/__init__.py#L5-L12

and the verification that this is working would be that Uproot picks up your code, producing TParameter objects that are subclasses of your TParameter.Methods. Right now, it shouldn't be because the class name isn't exactly equal to your new module name. We can make this change entirely in uproot-methods. (That was the design; to decouple uproot-methods versioning from uproot versioning.)

You could do that or wait for me to cycle around to it.

beojan commented 4 years ago

Done. This now reads my test file correctly.

jpivarski commented 4 years ago

I was writing a detailed solution to a problem that I expected when you said

Done. This now reads my test file correctly.

I don't know how it works in your case because of the problem below. Can you elaborate on your test?


Ah, crud: it won't work without a change in Uproot. Uproot checks hasmethods to see if the methods exist, but doesn't take the shortened name when it wants to load it:

https://github.com/scikit-hep/uproot/blob/773b6ef0eb96c03b355897b5e92aee4b9a98c37a/uproot/rootio.py#L925-L926

Uproot will still be trying to inherit from uproot_methods.classes.TParameter_3c_double_3e_.Methods, which doesn't exist. To do this "right," we'd have to change Uproot to say to uproot-methods, "I have a "TParameter_3c_double_3e_", what is the most specialized class you have for it?" and uproot-methods would respond, ""TParameter"".

The problem with that is that we'd have to update both uproot-methods and Uproot together, which means coordinating new minor version numbers with synchronized updates to both and setting the pip dependencies with the right restrictions. That happens in several places only because there's documentation that has to agree.

Alternatively, we could do it in a slightly "hacky" way, but one that I'm okay with if you are. In the elif clause that you just introduced, if bare_name in hasmethods.loaders, then we know that there isn't an exact match but there is a bare name match. At this point, we could create a new module and assign it in the right place so that Uproot finds it (Uproot always asks hasmethods before attempting to inherit from a mixin):

# name is "TParameter_3c_double_3e_" in this case
setattr(uproot_methods.classes, name, types.ModuleType(name))
getattr(uproot_methods.classes, name).Methods = getattr(uproot_methods.classes, bare_name).Methods
if hasattr(getattr(uproot_methods.classes, name), "ArrayMethods"):
    getattr(uproot_methods.classes, name).ArrayMethods = getattr(uproot_methods.classes, bare_name).ArrayMethods
jpivarski commented 4 years ago

When you get one of your TParameter objects as tparam, can you paste here the result of

print(tparam._pycode)

so that we can see the class definition it generated? I'm particularly interested in what its superclasses are. I expected its superclass to be uproot_methods.classes.TParameter_3c_double_3e.Methods, which doesn't exist, but if it's working for you, it must somehow be uproot_methods.classes.TParameter.Methods, which does exist. I don't know how it would get the latter because

https://github.com/scikit-hep/uproot/blob/773b6ef0eb96c03b355897b5e92aee4b9a98c37a/uproot/rootio.py#L925-L926

beojan commented 4 years ago
class TParameter_3c_double_3e_(uproot_methods.classes.TParameter_3c_double_3e_.Methods, TObject):
    _methods = uproot_methods.classes.TParameter_3c_double_3e_.Methods
    _bases = [TObject]
    @classmethod
    def _recarray(cls):
        out = []
        out.append((' cnt', 'u4'))
        out.append((' vers', 'u2'))
        for base in cls._bases:
            out.extend(base._recarray())
        out.extend(TString._recarray())
        out.append(('fVal', numpy.dtype('>f8')))
        return out
    _fields = ['fName', 'fVal']
    _classname = b'TParameter<double>'
    classname = 'TParameter<double>'
    _versions = versions
    _classversion = 2
    _hasreadobjany = False
    @classmethod
    def _readinto(cls, self, source, cursor, context, parent, asclass=None):
        start, cnt, classversion = _startcheck(source, cursor)
        startendcheck = True
        if cls._classversion != classversion:
            cursor.index = start
            if classversion in cls._versions:
                return cls._versions[classversion]._readinto(self, source, cursor, context, parent)
            elif cnt is None:
                startendcheck = False
            else:
                return Undefined.read(source, cursor, context, parent, cls.__name__)
        TObject._readinto(self, source, cursor, context, parent)
        self._fName = TString.read(source, cursor, context, self)
        self._fVal = cursor.field(source, cls._format1)
        if startendcheck:
            if self.__class__.__name__ == cls.__name__:
                self.__class__ = cls._versions[classversion]
            try:
                _endcheck(start, cursor, cnt)
            except ValueError:
                cursor.index = start
                return Undefined.read(source, cursor, context, parent, cls.__name__)
        return self
    _format1 = struct.Struct('>d')
    _int32 = struct.Struct('>I')

Evidently, it's creating that module automatically already. I think the line doing this is

globals()[name] = hasmethods.loaders[bare_name].load_module(bare_name)

because the assignment is to globals()[name] not globals()[bare_name].

jpivarski commented 4 years ago

Ah, beautiful! That effectively does what I was describing a manual procedure to do, but in a better way: it creates a new uproot_methods.classes.TParameters_3c_double_3e_ reference pointing to uproot_methods.classes.TParameters without creating a new module. (It's a new reference to the existing module, which is more lightweight and certain to get all contents than creating a new module and filling it, as I was starting to suggest.)

That's an unexpected consequence of code that was written to delay loading these modules until necessary. (I didn't want Python versions of the whole ROOT ecosystem to be defined whenever Uproot imports; I wanted only the classes used in the ROOT files you have open to be defined.) By setting globals()[name] to the result of load_module(bare_name), it also adds a name for each template specialization defined by your open ROOT files that don't also have specialized Python definitions (because we entered your elif).

So this is, intentionally or unintentionally, a great solution to the problem and I'll merge it. As a final sanity check, can you show me that attempting to

import uproot_methods.classes
import uproot_methods.classes.TParameter_3c_double_3e_

fails before opening your ROOT file and succeeds after opening your ROOT file? You also shouldn't have to extract the TParameter: the assignment should happen when the ROOT file is opened, not when the relevant objects are pulled out of it.

beojan commented 4 years ago

image

jpivarski commented 4 years ago

Great! Thanks!