owengage / fastnbt

Fast serde serializer and deserializer for Minecraft's NBT and Anvil formats
MIT License
186 stars 34 forks source link

Fix: stream parser errors on Minecraft Unicode #35

Closed DuckDuckWhale closed 3 years ago

DuckDuckWhale commented 3 years ago

When parsing a size prefixed string, the stream parser attempts to parse the string as UTF-8 via the standard library, which errors on Minecraft's CESU-8 (Java variant) encoded string. This also causes attempts to skip over a compound containing a string to fail. 0be03dc1877c0e19810f96d9f7e2ac728ad6c64c has fixed this in the Serde parser but not the stream parser.

Unrelated: I think it's a good idea to change the error from a string to a custom error type (by using thiserror) so that users can check what error has occurred without relying on the content of the string. What do you think?

owengage commented 3 years ago

Ah excellent! I completely glossed over the older no-serde parser when I introduced the cesu8 strings. Thank you.

Regarding the error thing, it's something I can't change without a major version bump. For the serde one I don't have specific errors like that because if it does happen there's not really much you can do about it. I suppose with this parser though you could choose to continue parsing if you hit an error like this.

I'll definitely keep it in mind if I want other breaking changes!

owengage commented 3 years ago

Released 1.1.1 with this change. :)

owengage commented 3 years ago

Looks like I didn't expose the internal string in the public API, so I can probably make this change without breaking any backwards compatibility. I will look into it at some point!

Opened #36

DuckDuckWhale commented 3 years ago

Awesome!