Closed maxnoe closed 4 years ago
It seems to replace underscores with _5f_
C++ class names can include more characters than Python class names can, particularly ::
for namespaces and <
>
for templates, so they have to be encoded. They're encoded by replacing the non-alphanumeric characters to hex equivalents, separated from the rest of the name by underscores. We don't want an underscore by itself to be confused with the beginning of an encoding, so underscores are converted as well (5f is hex for underscore's ASCII value).
That's for class names, but you're talking about class members, which have the same character restrictions as Python—why convert those? Well, C++ names (both for classes and members) also have to be converted from ASCII to Unicode: Python 3 names are Unicode. For simplicity, Uproot 3 had only one "make names okay for Python" function, which did both encodings.
Uproot 4 still has the encoding for C++ class names (and it's better formalized, with explicit, public functions for translation), but C++ class members are now isolated from Python class attributes by putting them in an entirely different namespace:
pythonized_class._num_5f_showers
pythonized_class.member("num_showers")
(Similarly, when looking up classes by name, one uses the C++ name in quotes.)
So, this wasn't a bug—it was intentional—but perhaps surprising. I'm hoping that the new method is less surprising.
Yes, using uproot4 and e.g. the all_members
attribute works fine.
So I guess I am now looking forward to the 4.0 release more than ever...
It's not a future thing—you can use it now. It had two features missing: documentation and file-writing. Last week, I finished adding docstrings to all modules, classes, functions, and methods; what remains is to put them all on the web with readthedocs—other things have been keeping me from that last step. File-writing will be a little longer, and perhaps I should switch PyPI names (uproot4
→ uproot
, uproot
→ uproot3
) before that, with an API in place that says, "For file-writing, import uproot3. Sorry!" so that more people try it out.
I'm hoping that the new method is less surprising.
The new method is definitely more user-friendly when debugging!
Offtopic: I collected a few caveats during my uproot3 -> uproot4 transition (still ongoing), so if you want, I can wrap them up and they could be dumped into a special section in the docs ("Migrating from uproot3"?)
Since wikis aren't where people look for documentation, I converted the Awkward wiki into a "collecting thoughts on a Awkward 0 → Awkward 1 cheat sheet/migration guide".
It makes sense to do the same for Uproot: I just created a wiki here, too.
These can be totally disorganized; I'll tidy them up when converting them to user-facing documentation. Short code examples are good, but not long ones or ones that depend on anything hard to reproduce. scikit-hep-datadata is a good source of world-accessible files.
Oh, and I'll be closing this because it's not a bug. (I'll admit it does look suspicious; thanks for the feedback, @MaxNoe!)
Reading a root object, defined here:
https://github.com/Eventdisplay/Eventdisplay/blob/master/inc/VMonteCarloRunHeader.h
uproot is able to read the object and the data therein but it inserts seemingly random
5f
s into some of the attribute names:Is this intendend? Why does this only appear in some members?
Unfortunately the only files I have are quite large