Closed Schuwi closed 4 years ago
Oh, yeah kinda forgot to run cargo test
, heh...
I’m very happy to see work on this. It looks like the failures are due to handling of bested lists; do you think it’s possible to handle that with your current approach? Or does the serializer not have enough information at that point?
Okay, the tests (including array serialization tests) should now complete successfully, though it just dawned on me that this whole approach is flawed. Empty arrays can't be identified on serialization because no elements are written to identify the element type with. So every empty sequence is serialized as an empty list right now...
Alright. Here is how it looks:
#[serde(serialize_with="hematite_nbt::xxxx")]
but doesn't really help us because this serialize_with function (see https://serde.rs/field-attrs.html) only receives a generic S: Serializer
parameter. This doesn't allow for any "custom" communication to the specific hematite_nbt Serializer to tell it to serialize an i8/i32/i64-NBT-array next.The current implementation of serde isn't designed to allow for explicitely typed sequences and this is unlikely to change in the future (see the wontfix tag on https://github.com/serde-rs/serde/issues/607).
Using a dirty (not-so-)little hack.
As NBT doesn't support tuples, our Serializer::serialize_tuple_struct
functions are currently "unused" (i.e. they just throw an error upon invocation). This method has a very interesting signature:
fn serialize_tuple_struct(self, name: &'static str, len: usize) -> Result<Self::SerializeTupleStruct, Self::Error>
which is basically the same signature as Serializer::serialize_seq
plus an extra &'static str
(and usize
instead of Option<usize>
but our sequences are always sized so that distinction doesn't matter).
Now my idea is to (ab)use this method by implementing three serialize_with functions (i8_array
, i32_array
and i64_array
) which are defined generically for their corresponding IntoIterator<ixx>
doing something like:
let seq = serializer.serialize_tuple_struct("i8", iter.size_hint().1.unwrap());
for elem in iter {
seq.serialize_field(&elem);
}
seq.end();
Thus the serialize_tuple_struct
method can check the value of the name parameter and serialize an array according to the type.
Surprisingly few actually. The serialize_tuple_struct
implementation can't be called "accidentally" (via the #[derive(Serialize)]
macro) because i8
, i32
and i64
are reserved keywords and thus can't be used as struct names.
The only downside I can think of right now:
#[derive(Serialize)]
'd data structure, because the i8/i32/i64-array fields tagged as such will be serialized as tuple structs, if even supported by the used serializer.I'll try to implement this and see how it works out. It should actually be less work than the current approach but optimistic assessments have been shown to be incorrect in the past.
I’m not sure how empty lists are serialized by Minecraft, it may not matter what tag is set in those cases. Do you know it generated data that could confirm this?
I’m not sure how empty lists are serialized by Minecraft, it may not matter what tag is set in those cases. Do you know it generated data that could confirm this?
Looking at some 1.13.2 vanilla chunk data right now I can confirm I have found an empty long array (in this case Chunk:Level/Structures/References/Monument
). Though I have no idea whether the vanilla client would accept an empty List here instead.
Edit: Manually changing this exact empty array to an empty List and loading the chunk in the vanilla 1.13.2 client resulted in no unusual console output and inspecting the chunk data afterwards the value in question was back to an empty long array.
I’m not sure how empty lists are serialized by Minecraft, it may not matter what tag is set in those cases. Do you know it generated data that could confirm this?
In most cases using an empty list should be fine, since from my tests vanilla will ignore a tag if it has the wrong type. In @Schuwi's Monument
case above, the client expected a long array but found a list, so it skipped properly deserializing that field without throwing any errors. (This can be observed by, for example, changing the type of the Palette
tag for chunk data—vanilla will just skip deserializing that chunk section.)
It's possible that not all vanilla deserializers will ignore errors in this manner, so I would advise caution using the empty TAG_List approach. Though it will work in all cases I've tried, it could cause bugs in the future.
Also, most of the other NBT libraries I've looked at won't be able to interpret an empty list as an empty TAG_Long_Array or TAG_Int_Array. So going with this approach might break compatibility with some non-vanilla implementations.
The
serialize_tuple_struct
implementation can't be called "accidentally" (via the#[derive(Serialize)]
macro) becausei8
,i32
andi64
are reserved keywords and thus can't be used as struct names.
Note that the r#
syntax can allow a keyword to be used as an identifier. For example, this compiles and will have the struct named i32
:
struct r#i32 {}
I doubt anyone will name their types like this, but to be sure you might want to use a more obscure special value, such as __hematite_nbt_i32_array__
.
Alright, I implemented my proposed serialize_with
approach (https://github.com/Schuwi/hematite_nbt/tree/explicit-array-serializing, working implementation) but I bumped into a new problem with that approach.
serialize_with
can't be used on nested collections. That's not really our fault, but rather a limitation of serde which has an open issue (https://github.com/serde-rs/serde/issues/723) but it doesn't look like any solution will be implemented soon.
Other than that I find this approach still quite promising, so I'd be happy if anyone can come up with a workaround for this limitation. I haven't really found one yet...
The serialize_with
workaround seems like a clever solution to me, it's unfortunate there is a roadblock. When I originally implemented this stuff, I did encounter many of these limitations of serde myself. I think this is ultimately because NBT does not map perfectly to the serde data model, so we're just stuck trying to find reasonable workarounds.
Alright, I fixed up the code and implemented the nested lists using a workaround. I would propose to merge it for now (other branch, will make a new PR shortly). I think this approach is the way to go and the rest (proper nested lists support) is up to serde to implement. Also, as it is entirely optional, nobody is forced to use it and it breaks no existing code.
Closed in favor of #52
This PR aims to implement serde serialization support to serialize i8, i32 and i64 sequence types to NBT ByteArray, IntegerArray and LongArray respectively. This implementation checks the type of the first sequence element to determine which NBT List/Array type to use. This means of course that serializing to a NBT List of NBT Byte/Integer/Long types is not possible anymore, but as discussed in #27 this doesn't seem to be used in practice. Also if required this could probably be implemented as some kind of option later.
I ran some rudimentary tests by serializing a Minecraft chunk with https://github.com/feather-rs/feather including Lists and LongArrays and checking the result in an NBT viewer. Everything serialized as expected.
This would resolve #27 and #32.