PaperMC / Paper

The most widely used, high performance Minecraft server that aims to fix gameplay and mechanics inconsistencies
https://papermc.io/
Other
9.74k stars 2.27k forks source link

Add Item serialization as json api #11235

Closed masmc05 closed 3 weeks ago

masmc05 commented 1 month ago

Example output: https://pastes.dev/nXEnpQ8K3Q

Since 1.21, vanilla item stack codecs now fully support such a json format for deserialization without any possible data loss, the support was added for data packs.

This pr adds a possibility to deserialize and serialize to such format which will not have any possible data loss due to loss of precise types (like ints becoming bytes) by allowing serialization of custom data as snbt. Also, those api methods will process the items trough data converters.

Unlike ItemStack#serializeAsBytes, this will result in a much more human readable format, which also may be transformed to many other formats like yaml or used with libraries like Jackson for human interaction with the final result.

An alternative to this pr would be adding an api to serialize the ItemStack to snbt, but if the snbt is put in a yaml or json file, it will not receive any kind of support from any IDE while editing the snbt, and because of that will become hard to work with. From just only api sadly it will not be possible to fix this.

Machine-Maker commented 1 month ago

Without putting too much thought into this... I wonder if it's better to expose a gson TypeAdapter or serialize/deserialize pair for people to use within their own Gson setups.

masmc05 commented 1 month ago

I see lots of usages for both, so maybe having both the methods and a type adapter could be great

FireInstall commented 1 month ago

In the ItemStack class, there should be a javadoc note at #serialize() and #deserialize() to stop using it and instead point to this in case human readability is needed or #serializeAsBytes()and #deserializeBytes() if not. (Can it even be deprecated, as there is a better alternative now?) Also, a @see to link both byte and json methods together. Just to reduce confusion.

masmc05 commented 1 month ago

Idk if it would be a good idea to do that, there was mentioned many times by paper devs and core members that they plan to change those to serialize to an snbt string. I think I'll just add a mention there about this new method so developers that are also interested in this will be able to find it easier

FireInstall commented 1 month ago

Well, there is no harm to remove it, if the serialisation get reworked to snbt. But for now there is no argument to be made why anyone should still use the bukkit serialisation.

masmc05 commented 1 month ago

Could someone please review this?

Machine-Maker commented 1 month ago

@masmc05 I think that the "human-readable"-ness of this is really not that useful, especially when this would be a data-versioned format. Say you serialize an ItemStack on this version and it's got whatever version, say 1000. Then, in 3 years you go to deserialize it, but you want to change it first. 3 years from now, the format for itemstacks may have changed drastically, but you have to go back and know exactly what the format is for your specific data version. That information isn't readily available. I think it being human readable is really not worth much. Do you have a specific reason why human readable would be good?

masmc05 commented 1 month ago

Well, the possibility to change itemstacks in configs is often useful, starting with the possibility to be able to manually write them without the usage of a command, GUI etc so to save an item from the game. This helps in developing small plugins where those items won't be changed often. As I said here the IDE's help with the syntax will be very useful for writing those. And this usefulness goes up to the fact that it's easier to batch change the items in configs, so you could make a change up to a hundred of items at the same time, which wouldn't be possible with byte format

Also thought about supporting items without data version, assuming the newest version, but that could lead to not the best practices. Since it's just json, a data version can be added by the plugin, but then it could understand that the item will need to be resaved so it future it could be converted, while otherwise this may have been ignored very often, resulting in future mess

masmc05 commented 1 month ago

The current alternative for this is either bukkit serialization and custom format, first being often broken between versions and pretty hard to write, second being often extremely limited, and very not that useful for small plugins

electronicboy commented 1 month ago

I do wish that there was a "human readable" standard agreed upon, but, json tangengially datapack related stuff is IMHO, not it. I do understand the arguments for having it in favour of people who want to bite that bullet to deal with, but, as said, it's a raw low level type of deal which can/will change often without warning, datapacks themselves already break once or twice a year as-is.

snbt based serialisation in configs is one of those areas of, yea, sure, you can tweak SNBT, but it's still a fairly low-level representation of the data stored.

There is never going to be a real good, long term solution for dealing with representing consistently evolving data. you could agree on the forefront, but, that would be library material, not built-in API type of stuff.

masmc05 commented 1 month ago

I do understand the arguments for having it in favour of people who want to bite that bullet to deal with

Yeah, exactly, that's why the scope of this pr is not to change something existing, people using this will have to understand that it's version dependent format, not a great solution for a big plugin like DeluxeMenu, but still great for some plugins where it's not worth having a whole library to deal with this. This is not a universal solution for each use case

datapacks themselves already break once or twice a year as-is.

This format is based more on codecs than datapacks, just mojang expanded the capabilities of that codec for datapacks, which this format uses. Since how universally codecs are used for both json and nbt in Minecraft, I don't think it's going to be breaking soon. And usually when Mojang makes breaking changes for datapacks, it's rare for it to be more restricting without a new alternative, just may be something different, meaning that the possibility for itemstacks to be valid in json format should remain

lynxplay commented 1 month ago

I am also sadly not that big of a fan of JSON for this. The two arguments your initial comment pointed out were that a) it makes it easy to convert the output into other formats, like yaml b) it has better editor support

Argument b imo is rather irrelevant for a PR like this, especially when editing the output is something, as we got to above, is only for those that want to bite the bullet anyway. If the "normal" usecase for this is read-only, SNBT should be as easily "read" by users as json to understand if the item stored somewhere is the right one.

Argument a makes a bit more sense to me, json is obviously a lot more wide spread than SNBT and hence tools like jq or whatever else can parse the output, whereas snbt presumably lacks these tools. How much this helps anyone tho is kinda up in the air because, again, the internal json structure may change from one version to the next with 0 warning and 0 ability of ours to preserve anything. Which then means that external tools would need to be version specific to find values of interest in the json.

Last but not least, the downsides of JSON are also to be considered. 1) It cannot fully represent the entire item stack. CustomData will never be JSON representable due to the missing type information. PDC tried this for yaml, it failed miserably, now we are at SNBT, just like this PR. However the fact that the data is not fully representable in JSON makes argument a even less convenient as external tools may still need to parse snbt.

2) Yea we hope that this is mostly based on codecs so it should not change anytime soon, but a json representation will always come with more uncertainty than the existing byte blob methods because the proposed methods attempt to achieve more than the byte counterparts. The byte methods only value lies in the fact that it can safely store the item and read it in this and any newer version. The json methods value is additionally directly related to how readable (parseable) and stable the json schema of the returned item is, in fact the json's stability and readability are the only reason this PR exists in the first place.

Now obviously listing downsides is easy, fixing them is rough and I sadly don't have much more to add than cat. SNBT at least yields a fully consistent format for itemstacks and does not require string-ifying parts of its data. However it, just like the json, suffers from instability.

For most usecases, emitting the specific data external tools are interested in sounds a lot more stable than passing a whole json string of undefined format. For actually writable formats, I don't think we will ever get around a custom format implemented by some library that abstracts the item stack data into a separate format that can be implemented on each version respectively.

Malfrador commented 1 month ago

I don't have a strong opinion on JSON or SNBT, both seem fine to me. I generally believe such a system would be very useful.

But I don't really think the format changing between versions is as much of an issue. One of the main use cases of such a PR, at least for me, would be purely reading items from configs. Being able to write an item definition inside a config is preferable over long and unwieldy commands to create such items in-game in my opinion, those get really confusing and too long to enter in the chat box very fast. It also allows for easier version control than base64 encoded strings from serialized in-game items. The Bukkit format sucks, and doesn't nicely represent the game anymore. So having some replacement, be it SNBT or JSON, for this usecase would be nice.

And versioning really doesn't matter here that much - it is kinda expected that if Minecraft changes something my items use, I will have to update my config file with those items. That's the same thing that would happen with command blocks, or data packs and is fine imo.

masmc05 commented 1 month ago

a lot more stable than passing a whole json string of undefined format

I think we went into a bit of miscommunication here, since I didn't provide any example usage of this, using this api to create json strings to write those to the config is indeed dumb and a lot worse than the snbt strings, so I think first of all I should provide the example before saying something else, ofc this api still can be used in many other (and most probably better) ways

This is a quickly written small plugin fragment which just adds an extremely simple menu item, it takes advantage of the "which also may be transformed to many other formats like yaml", here is the full class

    private final Gson gson = new GsonBuilder()
        .disableHtmlEscaping()
        .setObjectToNumberStrategy(ToNumberPolicy.LONG_OR_DOUBLE) //Not necessary, but more pretty output
        .registerTypeHierarchyAdapter(ItemStack.class, ItemStackAdapter.itemStackAdapter())
        .registerTypeHierarchyAdapter(ConfigurationSection.class,
            (JsonSerializer<ConfigurationSection>) (src, type, ctx) -> ctx.serialize(src.getValues(false)))
        .create();

    record Config(ItemStack menuItem, String menuCommand /*Other data*/) {
        public Config { /*Precondition checks*/}
    }
    private Config config;

    private void readConfig() {
        this.saveDefaultConfig();
        this.config = gson.fromJson(gson.toJsonTree(this.getConfig()), Config.class);
        this.writeConfig(); // Save in case something changed, like if items were in an older version, including in the default config with examples
        // This will also always keep in the example item the up-to-date data version
    }

    private void writeConfig() {
        Map<?, ?> values = gson.fromJson(gson.toJsonTree(this.config), Map.class);
        values.forEach((key, value) -> this.getConfig().set((String) key, value));
        this.saveConfig();
    }

The default config being (not recommended to miss namespace etc in default to ensure proper upgrades, but this is also for showing user defined item)

# The item given to player on first login which triggers the command
menuItem:
  id: compass
  DataVersion: 3955
  components:
    custom_name: '{"text":"Test name", "italic":false}'

# The command executed when player clicks the menu
menuCommand: "dm open serverMenu"

And the copied default config being

# The item given to player on first login which triggers the command
menuItem:
  id: minecraft:compass
  count: 1
  components:
    minecraft:custom_name: '{"italic":false,"text":"Test name"}'
  DataVersion: 3955

# The command executed when player clicks the menu
menuCommand: dm open serverMenu

Which is still easy to modify, the data which can be used can always be found on Minecraft wiki and the IDE will help with yaml syntax errors. Also, the second will be autoupdated by the plugin when the server is updated to a newer version, no need to worry what format was in that version

masmc05 commented 1 month ago

Argument a makes a bit more sense to me, json is obviously a lot more wide spread than SNBT and hence tools like jq or whatever else can parse the output

The message of the argument wasn't this, I didn't mean it can be interacted with more easily by external tool, as you mentioned, it's unstable data, this will for sure become a nightmare. I indeed meant that the relaxed data types of json (only string, number, object, array, boolean, null) means that no matter to which format it will be transformed (yaml, toml etc) no data loss will occur, so this can be easily used in any kind of config, using the proper syntax and IDE support. This misunderstanding may have lead to considering argument b useless, because yes, a simple json string in a random yaml config wouldn't bring better IDE support

masmc05 commented 4 weeks ago

Maybe I could add Experimental annotation, and in case this will be in danger of breaking soon or people will use this wrong etc, this could be quickly deprecated and removed? Seems like this just got stuck on different opinions if it's ok to add

The main points against that were:

Say you serialize an ItemStack on this version and it's got whatever version, say 1000. Then, in 3 years you go to deserialize it, but you want to change it first. 3 years from now, the format for itemstacks may have changed drastically, but you have to go back and know exactly what the format is for your specific data version. That information isn't readily available.

What if you have paper config in a format. After 3 years the format was changed, on docs you of course updated them, and you want to change the config before the server start? Same situation, no info available on that old format, because such situation was never supported before. Same as the plugin having the item in config changed the format, no plugin just randomly have the info on a 3 years old format.

There is never going to be a real good, long term solution for dealing with representing consistently evolving data.

But at least this provides the almost same format that mojang shows to end users, and for which there are lots of wiki etc, and which can be properly integrated in configs and not a big ugly string. As stated before, if someone wants to have the item in config very often, like DeluxeMenu, for them it's indeed better to be dependent on a custom format or make their own, not this.

It cannot fully represent the entire item stack.

Only to blame non-mojang data, like PersistentDataContainer in entities or people storing data in minecraft:custom_data component. I've checked and since 1.8.8 the only Mojang data that could have broken from NBT -> Json -> NBT (as implemented in current version in NbtOps and JsonOps) was uuid stored in UUIDMost and UUIDLeast format (was fixed once they started storing it in an array, many many years ago), which aren't normally even stored in items.

That means that for more than 10 years, if not non-mojang data, json could easily fully represent an item. Item components finally isolated the non-mojang data, so the most can easily be in json. And don't think that after 10 years of supporting that and explicitly adding support for proper snbt definition of non-mojang data, there is a risk of suddenly breaking this.

masmc05 commented 3 weeks ago

Applied @lynxplay's requests from vc