`nbtlib 2.0` - Githubissues

vberlier commented 2 years ago

This tracks a number of issues that should be resolved before making 2.x releases stable. I think the recent feedback shows that nbtlib could benefit from a little revamp. Proper static typing is long overdue and I'd like to take some time to experiment with optimizations to speed everything up a bit.

[ ] #145
[x] #146
[x] #153
[ ] #152
[ ] #150
[ ] #149
[ ] #157
[ ] #158
[ ] #192

Also this could be the perfect occasion to revisit old issues. It would be interesting to revisit the idea of nbtlib.contrib once the main issues have been addressed (#60). Also I'm not sure about the status of the setup.py situation anymore (#54). And finally, I've always struggled with writing documentation. Maybe we can come up with a better strategy, or find a way to break it down and make a proper roadmap for this (#16).

If anything else comes up I'll also add it in here.

vberlier commented 2 years ago

I recently diagnosed a performance problem where serializing a structure file with millions of entries was taking 60 seconds. I'd like to make nbtlib viable when you also have really large workloads so I want to see what's possible by either trying to rely more on numpy, or using cython, or an optional C extension a bit like what they're doing for https://github.com/simplejson/simplejson.

At the same time I also want to keep nbtlib usable in situations where you can't use pre-built wheels, so this needs further investigation.

vberlier commented 2 years ago

There's another type of feedback I got a few times over the years, and it's that the wrappers are annoying, especially for beginners. They expect to be able to use nbtlib like the json module, where you can simply pass plain dicts back and forth and everything works out of the box. This is already somewhat supported if you use schemas but I'd like to see if there's anything that can be done to make things even easier, like automatically generating a matching schema when reading files for instance.

Also for schemas maybe there's a way to leverage or get inspiration from the way pydantic does things. https://github.com/samuelcolvin/pydantic/

vberlier commented 2 years ago

About SNBT support and the nbt path parser in nbtlib, I think they could be improved a little. I'm working on mecha, a full-blown parser and compiler pipeline for Minecraft functions (https://github.com/mcbeet/mecha) and I think I did a better job for the snbt parser over there. The parser in nbtlib can't be embedded in other syntactic constructs and that's why I had to make an alternate version in mecha. I'll probably try to bring it to nbtlib at some point.

MestreLion commented 2 years ago

Just a quick feedback before I have time for more in-depth discussions (right now I and the one busy, lol). And, btw, loved this idea of a discussion issue before 2.0!

I recently diagnosed a performance problem where serializing a structure file with millions of entries was taking 60 seconds. I'd like to make nbtlib viable when you also have really large workloads so I want to see what's possible by either trying to rely more on numpy, or using cython, or an optional C extension a bit like what they're doing for https://github.com/simplejson/simplejson.

Yeah, serialization is really slow. Not that I care about SNBT. But before you think otherwise, know that nbtlib's parsing is fast. Trust me.

My mcworldlib deals with Worlds, so it too deals with millions of entries, considering each mca Region file can have up to 1024 chunks, each one the equivalent of a whole nbtlib.File. And my medium sized World (~500MB) contains ~1300 regions. Just loading them all in a World.get_all_chunks() loop takes more than 2 minutes. So yeah, .parse() performance is really crucial for me.

So I did some benchmarks using several NBT libraries (Amulet-NBT, twoolies's NBT, Tk's pynbt) and to my surprise nbtlib came first among pure-python implementations (closely tied with pynbt). Twoolie is noticeably slower (~ -30%), and Amulet came last (~ -40%). But of course, none can compete with Amulet's Cython implementation, which is 4-5x faster. I'll son post the benchmark script in my library, if you're interested.

Main point is: nbtlib's parse is already fast enough considering the pure python alternatives. Take that into account before trying to optimize it. And don't break it for the sake of SNBT.

At the same time I also want to keep nbtlib usable in situations where you can't use pre-built wheels, so this needs further investigation.

If one can't use wheels (why?), it will build from source, what's the problem?

MestreLion commented 2 years ago

I've also benchmarked serialization alternatives, and man,msgpack is awesome. It was able to serialize a Compound without specifying any handler (something json requires, and I assumed any serializer would too). and it's blazing fast.

MCC Chunk 0.2072 mpack -> <class 'bytes'> (21,875,554) 1.9993 write -> <class 'bytes'> (24,460,279) 3.9964 snbt -> <class 'str'> (31,037,955)

MestreLion commented 2 years ago

There's another type of feedback I got a few times over the years, and it's that the wrappers are annoying, especially for beginners. They expect to be able to use nbtlib like the json module, where you can simply pass plain dicts back and forth and everything works out of the box. This is already somewhat supported if you use schemas but I'd like to see if there's anything that can be done to make things even easier, like automatically generating a matching schema when reading files for instance.

I also think a lot about this, but any solutions I came up require creating a Compound.__setitem__() (or even Base.__setitem__. And I'm always reluctant to go this route as it can have a major hit on usage performance.

The simplest implementation strategy I'm thinking of is something similar to this pseudocode:

def __setitem__(self, key, value) -> None:
    if not isinstance(value, Base) and key in self:  # and isinstance(self[key], Base) too?
        # Overwriting existing key with a non-tag value. Cast to old value type (if old is a tag)
        return super()[key] = self[key].__class__(value)
    super()[key] = value

But that's an 1-2 isinstance() and a key in keys test for every assignment. We need some proper, serious benchmarking before going this route.

Or we can let the user select this "auto-cast" mode per instance:

@property
def auto_cast(self, value: bool):
    return self.autocast_setitem == super().__setitem__

@autocast.setter
def auto_cast(self, value: bool):
    self.__setitem__ = self.autocast_setitem if value else super().__setitem__

This could default to False (so load/parse gets non-casting, fast behavior), and File.load() could set it to True after fully creating/parsing the instance and before returning self to the caller (the same way it sets filename)

vberlier commented 2 years ago

I kinda lose confidence in code the longer it sits around unattended and since the core of the library is now a few years old I'm really itching to do a refresh haha. But I actually didn't know how well nbtlib faired against other implementations, really glad you benchmarked this!

When it comes to performance I'd primarily try to improve parsing and writing raw nbt data, as you mentioned I agree that's the most critical use-case. And I won't unnecessarily break things, I really like how the API feels so it's really more about potentially refactoring some lower-level elements or finding fast paths.

If one can't use wheels (why?), it will build from source, what's the problem?

A lot of windows users have python installed but no Visual C++ so if there aren't pre-built binaries available for them they won't be able to install the package. Also I've seen people interested in running their utilities on mobile or in the browser with Brython and I'm pretty sure C extensions are a massive headache for this.

Also I forgot about mypyc, which I think would actually be the perfect tool to speed up nbtlib since I'm planning to add type annotations anyway. Since it generates a C extension from pure python code, the original source can seamlessly be used as a fallback. I haven't experimented with it yet but I'm really curious. https://mypyc.readthedocs.io/en/latest/index.html

msgpack is awesome

Haha yeah sometimes I kind of wish Mojang wasn't stuck with nbt too!

vberlier commented 2 years ago

The auto_cast thing is kind of clever, but the actual implementation of the setter would probably need to recurse into its children to also make the nested compound tags behave the same. But it's something to think about. Maybe this could happen automatically at the end of __init__. I'll make an issue for this.

MestreLion commented 2 years ago

If one can't use wheels (why?), it will build from source, what's the problem?

so if there aren't pre-built binaries available for them they won't be able to install the package. [...] Also I forgot about mypyc, which I think would actually be the perfect tool to speed up nbtlib since I'm planning to add type annotations anyway. Since it generates a C extension from pure python code, the original source can seamlessly be used as a fallback. I haven't experimented with it yet but I'm really curious. https://mypyc.readthedocs.io/en/latest/index.html

Will take a look, never heard of mypc, sounds awesome. But meanwhile... there's always the Amulet approach:

# in nbt.__init__.py
try:
    from nbt_cy import (  # a cyton nbt.pyx module
        a,
        b,
        ...,
    )
except ImportError:
    from nbt_py import (  # a pure-python nbt_py fallback module
        a,
        b,
        ...
    )

MestreLion commented 2 years ago

I kinda lose confidence in code the longer it sits around unattended and since the core of the library is now a few years old I'm really itching to do a refresh haha.

nbtlib's design is amazing. It's elegant, beautiful, clean and easy to use. I've always disliked the pymclevel approach of tag.value/tag.name, it's really cumbersome to work with. Not to mention TAG_Xxxx names, eww. And many current implementations still inherits that design. Some are only now subclassing dict and allowing things like mycomp["key"] = xxx by translating that into xxx.name = key; mycomp.value.add(xxx) in a Compounds __setitem__!!!. Eww!

But I actually didn't know how well nbtlib faired against other implementations, really glad you benchmarked this! [...] When it comes to performance I'd primarily try to improve parsing and writing raw nbt data, as you mentioned I agree that's the most critical use-case. And I won't unnecessarily break things, I really like how the API feels so [...]

And I'm really glad you are focusing on parsing/raw NBT. But yeah, nbtlib already fairs very very well, and it surprised me too. Usually high-level abstract and elegant APIs have a price. By "breaking" it I mean breaking the performance, not the API. For example, I've always feared adding __setitem__ because of the potentially massive performance hit.

We need good benchmarks on large data. Grab that 3.8MB MCC chunk in my data/ folder, and the 7.8MB r.2.1.mca file (loop it using my Anvil parser or stitch each raw chunk into a massive supercompound). We need some very good tests so we don't rely on possibly wrong assumptions.

vberlier commented 2 years ago

That's really sweet, thanks! I'm gonna make good use of these test files, I'm about to start messing around with mypyc.

vberlier commented 2 years ago

So another thing I'm thinking about for improving .parse() performance is making unpacking (and byteswaps) lazy and using memoryviews to track the position of each tag in the original buffer. This would avoid copies until tags are mutated and potentially put a lot less pressure on the garbage collector. I think it would work well for dependent libraries like mcworldlib because a lot of use-cases just require modifying a few blocks or entities so if we can avoid doing unnecessary work entirely this might lead to even greater performance gains than mypyc or cython. Of course it would also be possible to investigate both.

MestreLion commented 2 years ago

msgpack is awesome

Haha yeah sometimes I kind of wish Mojang wasn't stuck with nbt too!

Minecraft is, but I'm not ;-)

I'm planning on using msgpack as a serializer just to feed it to a hash function (most likely xxh3_128*) so I can use it to determine if a chunk/region was changed or not prior to writing. That way I can avoid adding a __setitem__ and still allow an "auto-dirty" feature.

Of course, this trouble is only worth it if msgpack is considerably faster than nbt's write(). And it is. By a tenfold.

*: man that thing is a beast, it's even faster (and larger) than CPython's native hash()!!! As in 6-8 times faster. And it's 128 bits for the collision OCD like me.

MestreLion commented 2 years ago

So another thing I'm thinking about for improving .parse() performance is making unpacking (and byteswaps) lazy and using memoryviews to track the position of each tag in the original buffer. This would avoid copies until tags are mutated and potentially put a lot less pressure on the garbage collector.

I would surely benefit from it, bounds like a lot of trouble for possibly very little gain. I mean, I was really impressed by Amulet's Cython implementation. It's so fast that it really discouraged me to try any non-trivial optimization on pure-python code.

Too bad Amulet's API is so bad, with its outdated pymclevel heritage (that everyone seems to mimic). I even considered learning Cython just for the sake of making a pyx implementing nbtlib's API. At least a subset of it. Possibly no Path, certainly no Schema, but the single fact of Compounds being dicts, and not having to deal with .value/.name everywhere would be enough for me.

StealthyExpertX commented 1 year ago

Hello, I'm not sure if you still want to work on this or if you took a break, do you have any updates?

vberlier commented 1 year ago

I have a local branch that I revisit from time to time to experiment with some of the stuff mentioned in the thread, but I haven't had the occasion to focus on nbtlib in a while. I'm always working on some other project and when I need to use nbtlib it still does its job pretty well, so in practice the things I want to improve don't really bother me enough to make me drop whatever I'm working on to do the nbtlib revamp.

However I really want to get to it done at some point. It's kind of weighing on me mentally but I really care about doing it right. I think I'll try to set some time aside for nbtlib in september. It's getting kinda ridiculous, I hope I'll be able to close this issue before its second birthday...

vberlier / nbtlib

`nbtlib 2.0` #156