ForeverZer0 / SharpNBT

A pure CLS-compliant C# implementation of the Named Binary Tag (NBT) format specification commonly used with Minecraft applications, allowing easy reading/writing streams and serialization to other formats.
MIT License

Are bytes in NBT signed or unsigned? #9

Closed by TheVeryStarlk 1 year ago

TheVeryStarlk commented 1 year ago

SharpNBT uses byte, but the wiki (https://wiki.vg/NBT) says the NBT byte type is a signed byte.

ForeverZer0 commented 1 year ago

This library uses the unsigned byte, as .NET supports unsigned types, unlike Java, under which the specification was designed.

In practice, NBT uses this type to denote "raw bytes" of data, and much less for its signed numerical range, which is analogous to how a byte in .NET is used, such as memory buffers. Using an sbyte for this would simply make it awkward to use in the .NET runtime, which uses byte for this case consistently across all APIs.

This was an intentional design choice, as sbyte is also not a CLS-compliant type, and using it would exclude this library from being consumed by .NET languages other than C#. If your data is actually using the integral value and requires a signed range, it will need to be cast, reinterpreted with a span-like object, MemoryMarshal, etc. Otherwise, its in-memory representation is exactly the same 8 bits.
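As a rough sketch of the two options mentioned above (a plain cast and a MemoryMarshal reinterpretation), something like the following should work. The class and method names are made up for illustration and are not part of SharpNBT; the byte values stand in for data that would come from a tag.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper names, not part of SharpNBT.
public static class SignedByteDemo
{
    // Reinterpret a single unsigned byte as signed; the 8-bit pattern is unchanged.
    public static sbyte ToSigned(byte value) => unchecked((sbyte)value);

    // View an existing byte buffer as signed bytes without copying it.
    public static Span<sbyte> AsSigned(Span<byte> buffer) =>
        MemoryMarshal.Cast<byte, sbyte>(buffer);

    public static void Main()
    {
        byte raw = 0xFF;
        Console.WriteLine(ToSigned(raw)); // -1

        byte[] data = { 0x00, 0x7F, 0x80, 0xFF };
        Span<sbyte> signed = AsSigned(data);
        Console.WriteLine(string.Join(", ", signed.ToArray())); // 0, 127, -128, -1
    }
}
```

The span returned by AsSigned aliases the original array, so writing through it also changes the underlying bytes.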

TheVeryStarlk commented 1 year ago

Thanks for the reply!

In practice, NBT uses this type to denote "raw bytes" of data, and much less for its signed numerical range, which is analogous to how a byte in .NET is used, such as memory buffers.

I'm sorry I didn't understand what you mean here.

ForeverZer0 commented 1 year ago

I mean that whether the type is signed or unsigned is rarely (if ever) meaningful for this type from a practical standpoint, but it is of concern from a language, runtime, and usability viewpoint. Java happens to use signed bytes for memory buffers, while .NET uses unsigned bytes to represent arbitrary memory and booleans, which are what this type is used for.

For a boolean, it doesn't matter. A value is either 0 (false), or it is not (true). 0 is 0 for both sbyte and byte.
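To make the boolean case concrete, here is a minimal sketch assuming the "0 is false, anything else is true" convention described above. The helper names are hypothetical and not taken from the library.

```csharp
using System;

// Hypothetical helper names, not part of SharpNBT.
public static class BoolByteDemo
{
    // Zero is false, any other bit pattern is true.
    public static bool FromByte(byte value) => value != 0;
    public static bool FromSByte(sbyte value) => value != 0;

    public static void Main()
    {
        // 0x00 is zero under either interpretation.
        Console.WriteLine(FromByte(0) == FromSByte(0));      // True

        // 0xFF is 255 as a byte and -1 as an sbyte; both are nonzero.
        byte raw = 0xFF;
        sbyte reinterpreted = unchecked((sbyte)raw);
        Console.WriteLine(FromByte(raw) == FromSByte(reinterpreted)); // True
    }
}
```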

For an array of bytes that represents a memory buffer (such as a resource pack being sent from the server), it is of no concern whether the values are signed or unsigned. The individual values have no meaning on their own; each is simply part of a larger whole in whatever the blob of memory represents. If I chose to adhere strictly to the spec, nothing would be gained, but the library would lose cross-language support, and it would also require consumers to convert the data to bytes to do anything meaningful with it, as no part of the .NET API uses sbyte for the cases in which this type is used.
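A small sketch of that usability point: stream APIs such as MemoryStream.Write take byte[] directly, whereas an sbyte[] payload would need a conversion step first. The ToUnsigned helper and the sample bytes below are assumptions made up for this example, not anything SharpNBT provides.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper names, not part of SharpNBT.
public static class BufferDemo
{
    // The extra step a consumer would need if payloads were exposed as sbyte[].
    public static byte[] ToUnsigned(sbyte[] payload) =>
        MemoryMarshal.Cast<sbyte, byte>(payload.AsSpan()).ToArray();

    public static void Main()
    {
        using var stream = new MemoryStream();

        // A byte[] payload can be handed to the stream as-is.
        byte[] blob = { 0x50, 0x4B, 0x03, 0x04 };
        stream.Write(blob, 0, blob.Length);

        // An sbyte[] payload has to be converted before the same call accepts it.
        sbyte[] signedBlob = { 0x50, 0x4B, 0x03, 0x04 };
        byte[] converted = ToUnsigned(signedBlob);
        stream.Write(converted, 0, converted.Length);

        Console.WriteLine(stream.Length); // 8
    }
}
```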

TheVeryStarlk commented 1 year ago

Gotcha. Thanks a lot!