conqp / mcipc

A Minecraft server inter-process communication library.
GNU General Public License v3.0

Parsing NBT #40

Open gilesknap opened 2 years ago

gilesknap commented 2 years ago

I'm opening this issue to discuss NBT parsing.

(I note that NBTs are a Java edition feature and I'm not familiar with how similar information is handled in other editions)

I have implemented a simple deserializer for Stringified Named Binary Tag (SNBT) data, which is the format returned by commands like data get:

import json
import re

# extract preamble from string responses to commands (benign for raw SNBT)
preamble_re = re.compile(r"[^\[{]*(.*)")
# extract list type identifiers
list_types_re = re.compile(r"[LBI];")
# regex to extract all unquoted items
unquoted_re = re.compile(r'([-.A-Za-z0-9]+)(?=([^"]*"[^"]*")*[^"]*$)')
# regex to extract numeric values (optional leading minus so negatives convert too)
integers_re = re.compile(r'"(-?\d+)[bsl]?"')
no_decimal_floats_re = re.compile(r'"(-?[0-9]+)[fd]"')
# the decimal point is escaped so it only matches a literal "."
floats_re = re.compile(r'"(-?\d+\.\d+)[fd]"')

def parse_nbt(snbt_text: str) -> object:
    """
    Naive deserialization of an SNBT string into an object graph of Python types.

    Note that this is one way only since the following details are lost:
    - distinction between byte, short, int, long types (suffixes of b,s,none,l)
    - distinction between float, double types (suffixes of f,d)
    - distinction between SNBT and raw JSON (enclosed in single quotes)

    See https://minecraft.fandom.com/wiki/NBT_format
    """
    text = preamble_re.sub(r"\1", snbt_text)
    text = list_types_re.sub(r"", text)
    text = unquoted_re.sub(r'"\1"', text).replace("'", "")
    text = no_decimal_floats_re.sub(r"\1.0", text)
    text = floats_re.sub(r"\1", text)
    text = integers_re.sub(r"\1", text)
    text = text.replace('"true"', '"True"').replace('"false"', '"False"')

    return json.loads(text)
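
To make the substitution order concrete, here is a minimal sketch that records the text after each step. The patterns are repeated so the snippet runs on its own. The sample response is hand-written for illustration rather than captured from a server, and trace_parse is a hypothetical name. The optional leading minus in the numeric patterns is there so that negative values such as coordinates are converted too.

```python
import json
import re

# Patterns repeated from the parser so this snippet is self-contained.
preamble_re = re.compile(r"[^\[{]*(.*)")
list_types_re = re.compile(r"[LBI];")
unquoted_re = re.compile(r'([-.A-Za-z0-9]+)(?=([^"]*"[^"]*")*[^"]*$)')
# "-?" lets negative numbers through; "\." matches a literal decimal point.
integers_re = re.compile(r'"(-?\d+)[bsl]?"')
no_decimal_floats_re = re.compile(r'"(-?[0-9]+)[fd]"')
floats_re = re.compile(r'"(-?\d+\.\d+)[fd]"')


def trace_parse(snbt_text):
    """Return (step, text) pairs, one per substitution (hypothetical helper)."""
    stages = []
    text = preamble_re.sub(r"\1", snbt_text)
    stages.append(("strip preamble", text))
    text = list_types_re.sub("", text)
    stages.append(("drop list type prefixes", text))
    text = unquoted_re.sub(r'"\1"', text).replace("'", "")
    stages.append(("quote bare tokens", text))
    text = no_decimal_floats_re.sub(r"\1.0", text)
    text = floats_re.sub(r"\1", text)
    stages.append(("unquote floats", text))
    text = integers_re.sub(r"\1", text)
    stages.append(("unquote integers", text))
    return stages


# Hand-written sample in the style of a "data get entity" response (an assumption).
sample = (
    "Pig has the following entity data: "
    "{Pos: [626.5d, 73.0d, -1654.5d], Health: 20f, OnGround: 1b}"
)

stages = trace_parse(sample)
for step, text in stages:
    print(f"{step:>24}: {text}")

# The final stage is plain JSON, so json.loads can finish the job.
parsed = json.loads(stages[-1][1])
print(parsed["Pos"])
```

By the last stage the b, f and d suffixes are gone, which is exactly the type information the docstring says is lost.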

I'm not sure the above approach is worthy of the nicely typed mcipc library.

There is a lot more work to do to make a serializable NBT class in Python. A useful NBT class would need to:

This would mean you could do something like this:

# increase the number of items in slot 0 of the chest at 626, 73, -1654
nbt = client.data.get(block=Vec3(626, 73, -1654))
nbt.Items[0].Count += 10
client.data.merge(block=Vec3(626, 73, -1654), nbt=nbt)
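
The attribute-style access in that snippet could be approximated with a thin wrapper around the parsed dicts and lists. This is only a sketch: NbtView is a hypothetical name, not part of mcipc, and it assumes the data has already been parsed into plain Python containers.

```python
class NbtView:
    """Attribute-style view over parsed NBT data (hypothetical, not mcipc API)."""

    def __init__(self, data):
        # Bypass our own __setattr__, which writes into the wrapped container.
        object.__setattr__(self, "_data", data)

    @staticmethod
    def _wrap(value):
        # Wrap nested containers so chained access keeps working.
        return NbtView(value) if isinstance(value, (dict, list)) else value

    def __getattr__(self, name):
        # Only called when normal lookup fails; never resolve private names here.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._wrap(self._data[name])
        except (KeyError, TypeError):
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self._data[name] = value

    def __getitem__(self, key):
        return self._wrap(self._data[key])

    def __setitem__(self, key, value):
        self._data[key] = value


# Stand-in for a parsed "data get block" result (hand-written for illustration).
data = {"Items": [{"Slot": 0, "id": "minecraft:stone", "Count": 5}]}
nbt = NbtView(data)
nbt.Items[0].Count += 10
print(data["Items"][0]["Count"])
```

The view never copies anything, so the change is made directly in the underlying dict, which can then be serialized and sent back.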

So is this worth implementing? NBT serialization would only be needed for the following commands that I can think of:

Whereas the dumb deserializer above is useful for querying information about Players, Mobs, Entities, Block Entities etc.

gilesknap commented 2 years ago

Out of interest, here is a function in MCIWB using the parser

https://github.com/gilesknap/mciwb/blob/dev/src/demo/arrows.py

Note that it quite easily enables extraction of a position from the NBT. Also note that this code creates an NBT to send in the data get, but it's trivial enough that having a serializer would not have added a great deal.

gilesknap commented 2 years ago

UPDATE:

Some good news on this. When merging NBT data the command is happy to take any number for numeric types and cast them appropriately.

So this means that at present the following code adds 10 extra items to slot 0 of a chest:

In [58]: nbt = parse_nbt(c.data.get(block=Vec3(625, 73, -1646)))

In [59]: nbt['Items'][0]["Count"] += 10

In [60]: c.data.merge(block=Vec3(625, 73, -1646),nbt=str(nbt))
Out[60]: 'Modified block data of 625, 73, -1646'

This appears to mean that the only special handling needed for serialization is for the quoted JSON snippets.
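
A rough sketch of what that special handling might look like follows. Since parse_nbt drops the single quotes, the caller has to say which keys hold raw JSON. The names to_snbt and json_keys are hypothetical, and the escaping rule for single-quoted strings (backslash and single quote only) is an assumption about the SNBT parser, not something confirmed here.

```python
import json


def to_snbt(value, json_keys=frozenset(), _key=None):
    """Serialize parsed NBT back to SNBT text (hypothetical helper, not mcipc API)."""
    if _key in json_keys:
        # Raw JSON field: emit it as a single-quoted string, escaping \ and '.
        raw = json.dumps(value, separators=(",", ":"))
        return "'" + raw.replace("\\", "\\\\").replace("'", "\\'") + "'"
    if isinstance(value, bool):
        # Checked before the numeric fallback, because bool is a subclass of int.
        return "true" if value else "false"
    if isinstance(value, dict):
        # Assumes keys are plain identifiers that need no quoting.
        body = ",".join(f"{k}:{to_snbt(v, json_keys, k)}" for k, v in value.items())
        return "{" + body + "}"
    if isinstance(value, list):
        # Typed array prefixes such as "I;" were dropped by the parser and stay lost.
        return "[" + ",".join(to_snbt(v, json_keys) for v in value) + "]"
    if isinstance(value, str):
        return json.dumps(value, ensure_ascii=False)
    # Numbers go out without type suffixes, relying on the command to cast them.
    return str(value)


chest = {"Items": [{"Slot": 0, "id": "minecraft:stone", "Count": 15}]}
print(to_snbt(chest))

named = {"CustomName": {"text": "Bob"}}
print(to_snbt(named, json_keys={"CustomName"}))
```

Plain str(nbt) is enough for the chest case, so a helper like this would only earn its keep once quoted JSON fields are involved.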