Open MestreLion opened 3 years ago
I suggest 2 possible approaches:
Path.from_parts()
, classmethod that takes a plain tuple* and builds the accessors without invoking the parser, assuming each part is a single, non-empty componentPath()
to take such plain tuples (possibly invoking .from_parts()
, right after the isinstance(path, Path)
check (as Path
is a tuple itself)*: By "tuple" it should actually mean "any sequence which is not an string", so lists and similar iterables are allowed too. anyway, that's a minor detail.
Right, so now I see why you wanted to concatenate ints too. But in this case subscripting is the operation you would want anyway. Assuming __add__
allows ints, you could implement _tree.get_element()
using concatenation like this root + keys[0]
, but _tree.get_element(Path(), ["foo.bar", 2, "spam"])
would result in foo.bar[2].spam
instead of "foo.bar"[2].spam
. It's the kind of subtle error I was thinking about the other day. The problem isn't really that supporting ints in __add__
is ambiguous by itself, but that it behaves the same for operations that are semantically different. To me the bug would be more time-consuming to figure out than a clear TypeError
.
Anyway, I agree that from_parts()
would be pretty useful. You could probably currently implement it yourself with something like this (untested):
path = tuple.__new__(Path, (NamedKey(k) if isinstance(k, str) else ListIndex(k) for k in data.keys[:-1]))
Also a walk()
method would definitely belong in nbtlib
.
but
_tree.get_element(Path(), ["foo.bar", 2, "spam"])
would result infoo.bar[2].spam
instead of"foo.bar"[2].spam
. It's the kind of subtle error I was thinking about the other day.
Yeah, but since it came from a tuple, using a method called .from_parts()
, it's pretty self-evident that "foo.bar"
must indeed be a single component. The tuple/list format makes it clear that each element is a single key. If you want to parse each element, then use Path's constructor or concatenation. .from_parts()
is for adults, it does have a contract/assumption embedded on it for the sake of performance.
The problem isn't really that supporting ints in
__add__
is ambiguous by itself, but that it behaves the same for operations that are semantically different.
The operations semantically are different, I agree, but int
only behaves the same because because of a limitation on integers: they can't "express" more than a single component. Strings and Paths can, so they do when the operation allows.
For example, let's say we hypothetically allow tuples/lists in __add__
too, it could (and maybe it should) mean "parse every element", so Path() + ["foo.bar", 2, "spam"] == Path("foo.bar[2].spam")
. Probably not useful, but still consistent:
Path()
and +
, do some parsing/processing on its input. A single string may become more than one key, so a tuple may become more than its len()
. An int would too, if it could, and Path
keeps its len()
.from_accessors()
of .from_parts()
, are internal tools exposed to the API for the sake of performance. They make assumptions and are more restrictive about their data. They skip the parser.in this scenario, []
indexing (not slicing) is a low-level tool too, as it also skips the parser by contract, assuming the index is always a single element for leaf values such as strings and integers. For a collection such as another Path, it merely loops and adds each part without any processing (the same for a hipotetical tuple. But I'm not suggesting you allow tuples to +
or []
or even ()
, ok? Just using it as a mental exercise for testing API consistency. I'm more than happy with just .from_parts()
Anyway, I agree that
from_parts()
would be pretty useful. You could probably currently implement it yourself with something like this (untested):path = tuple.__new__(Path, (NamedKey(k) if isinstance(k, str) else ListIndex(k) for k in data.keys[:-1]))
Hum, didn't know about Path
's internals to realize it would be so simple. I'll take a look at it and test to craft an actual PR.
Also a
walk()
method would definitely belong innbtlib
.
Feel free to grab my tree.py
, or some of its usage in my nbt
(a wrapper for nbtlib
). It's a new addition that came out of some experiments in generic crawlers I'm particularly proud of. Still WIP, soon I'll add a useless os.walk()
wrapper using it just to test how flexible it is. print_tree
is cute, take a look at its usage in nbt.nbt_explorer()
Yup sorry if this wasn't really clear, the first paragraph wasn't really related to from_parts()
or anything. It was just a tangent about the error you could have run into if you tried implementing your get_element
function with concatenation rather than indexing. I was just trying to say that the canonical operations for extending paths are path + path
and path[key]
, and that for concatenation the operands can be subjected to auto-conversion that can make it look like they do the same thing.
I'm completely on board with from_parts
. I'm gonna look into your patches in more details but this looks really neat!
Yup sorry if this wasn't really clear, the first paragraph wasn't really related to
from_parts()
or anything. It was just a tangent [...]
Nah, you were clear enough I could infer you approved .from_parts()
and we were just discussing a side-track.
for concatenation the operands can be subjected to auto-conversion that can make it look like they do the same thing.
Yeah, I carefully chose to use indexing instead of concatenation in get_element
because of that difference. I was aware since you told me in #146 . The need for .from_parts()
is just to avoid the looping/recursion (or perform it in Path), as I was sure Path had a better way to craft itself out of a tuple.
I'm completely on board with
from_parts
. I'm gonna look into your patches in more details but this looks really neat!
Well... I didn't PR this one yet... will do so in the next days
Right, makes sense!
Well... I didn't PR this one yet... will do so in the next days
By "patches" I was actually referring to your nbt.py
file containing your extra utilities and where you monkey-match a few nbtlib
things.
I have this NBT Explorer-like printing and when re-implementing it using my generic tree walker I noticed a missing feature in
Path
that forced me to use a really inefficient constructor:The generic tree walker yields a generic
tuple
containing each path component, similar to thepathlib.Path()
API. And to create annbtlib.Path
from this tuple I had to resort to an expensive that iteratively (or recursively) creates a newPath
for each component by concatenating the next component (def get_element(root, keys): return get_element(root[keys[0]], keys[1:]
). (by the way, this concatenation is what led me to #146 , as each part can be anint
(forList
) orstr
(forCompound
)I'm sure there are better, or at least easier ways of doing this. Something like
Path.from_accessors()
, but skipping the parser by assuming each part is either anint
or astr
representing a single key.