godotengine / godot-proposals


Add bit interfacing methods to `PackedByteArray` #8093

Open BtheDestroyer opened 1 year ago

BtheDestroyer commented 1 year ago

Describe the project you are working on

Low level networking library with binary de/serialization

Describe the problem or limitation you are having in your project

Encoding individual bits, or otherwise packing data into sub-byte sizes, is cumbersome and repetitive, and the resulting code tends to be both verbose and hard to read:

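func encode_flags(flags : Array[bool]) -> PackedByteArray:
  var bytes := PackedByteArray()
  # One byte per 8 flags, rounded up.
  bytes.resize(ceili(flags.size() / 8.0))
  # Make sure every byte starts at zero before OR-ing bits in.
  bytes.fill(0)
  for i in range(flags.size()):
    # flags[i] goes to bit (i % 8) of byte (i / 8), least-significant bit first.
    bytes[i / 8] |= int(flags[i]) << (i % 8)
  return bytes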

func encode_2u4(a : int, b : int) -> PackedByteArray:
  return PackedByteArray([(a & 0xF) | ((b & 0xF) << 4)])

Describe the feature / enhancement and how it helps to overcome the problem or limitation

I propose adding the following method signatures:

# Encodes the `bit_count` least-significant bits of `source` into the array, starting at `byte_offset` and `bit_offset`
PackedByteArray.encode_bits(byte_offset : int, bit_offset : int, source : PackedByteArray, bit_count : int)
# Encodes `flags` represented as a bitset into the array at `byte_offset` and `bit_offset`
PackedByteArray.encode_flags(byte_offset : int, bit_offset : int, flags : Array[bool])

# Returns the `bit_count` bits that start at `byte_offset` and `bit_offset` as a new PackedByteArray, packed from the least-significant bit
PackedByteArray.slice_bits(byte_offset : int, bit_offset : int, bit_count : int) -> PackedByteArray
# Decodes an array of `flag_count` bools encoded as a bitset from within the array at `byte_offset` and `bit_offset`
PackedByteArray.decode_flags(byte_offset : int, bit_offset : int, flag_count : int) -> Array[bool]

Admittedly, these are the first names and signatures that came to mind, so they could do with iteration and feedback.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

Example usage:

var bytes := PackedByteArray([0, 0, 0])
bytes.encode_bits(0, 4, [0xA], 4) # [0xA0, 0x00, 0x00]
bytes.encode_bits(0, 0, [0x9], 4) # [0xA9, 0x00, 0x00]
# Flags are stored least-significant bit first (flags[0] -> bit 0), matching the GDScript encode_flags helper above
bytes.encode_flags(1, 0, [true, false, true, true]) # [0xA9, 0x0D, 0x00]
bytes.encode_bits(1, 4, [0x12], 8) # [0xA9, 0x2D, 0x01]

print(bytes.slice_bits(1, 4, 8)) # [0x12]
print(bytes.slice_bits(0, 0, 12)) # [0xA9, 0x0D]
print(bytes.slice_bits(0, 1, 12)) # [0xD4, 0x06]
print(bytes.decode_flags(1, 0, 4)) # [true, false, true, true]

If this enhancement will not be used often, can it be worked around with a few lines of script?

Utility functions to achieve the desired effect could be implemented in GDScript as an add-on, but it would require a considerable amount of code to be flexible.
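As a rough sketch of what such an add-on could look like, the helpers below write and read at bit granularity in plain GDScript. The function names mirror the proposed methods but take the array as an explicit first argument; they are hypothetical and not part of Godot. Bit 0 is assumed to be the least-significant bit of each byte, and no bounds checking is done.

```gdscript
extends Node

# Hypothetical helper: writes the `bit_count` least-significant bits of `source`
# into `target`, starting `byte_offset` bytes plus `bit_offset` bits in.
# Returns `target` so callers do not have to rely on pass-by-reference behaviour.
static func encode_bits(target: PackedByteArray, byte_offset: int, bit_offset: int, source: PackedByteArray, bit_count: int) -> PackedByteArray:
	var start: int = byte_offset * 8 + bit_offset
	for i in range(bit_count):
		var bit: int = (source[i >> 3] >> (i & 7)) & 1
		var pos: int = start + i
		var mask: int = 1 << (pos & 7)
		if bit == 1:
			target[pos >> 3] = target[pos >> 3] | mask
		else:
			target[pos >> 3] = target[pos >> 3] & ~mask
	return target

# Hypothetical helper: reads `bit_count` bits from `source` and returns them
# packed into a new array, least-significant bit first.
static func slice_bits(source: PackedByteArray, byte_offset: int, bit_offset: int, bit_count: int) -> PackedByteArray:
	var result := PackedByteArray()
	result.resize((bit_count + 7) >> 3)
	result.fill(0)
	var start: int = byte_offset * 8 + bit_offset
	for i in range(bit_count):
		var pos: int = start + i
		var bit: int = (source[pos >> 3] >> (pos & 7)) & 1
		result[i >> 3] = result[i >> 3] | (bit << (i & 7))
	return result

func _ready() -> void:
	var bytes := PackedByteArray([0, 0, 0])
	bytes = encode_bits(bytes, 0, 4, PackedByteArray([0xA]), 4)   # bytes: 0xA0, 0x00, 0x00
	bytes = encode_bits(bytes, 1, 4, PackedByteArray([0x12]), 8)  # bytes: 0xA0, 0x20, 0x01
	print(slice_bits(bytes, 1, 4, 8))  # one byte holding 0x12
```

Looping one bit at a time keeps the sketch short, but it costs one interpreted iteration per bit, which is the kind of overhead a native implementation would avoid.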

Is there a reason why this should be core and not an add-on in the asset library?

GDScript seems to add significant performance overhead for memory manipulation at this low a level. The methods would also be useful in many scenarios, since they directly affect what is practical wherever binary data serialization is relevant (e.g. network packets, save files, resource/configuration files).

Calinou commented 1 year ago

Did you see the `decode_u8|16|32|64()` methods in `PackedByteArray` (and their `encode_u8|16|32|64()` counterparts)? These also exist for signed integers, half-precision floats, floats and doubles. Encoding/decoding arbitrary Variants this way is also possible.
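For reference, a minimal sketch of how the existing byte-aligned methods can carry sub-byte fields today: a 12-bit value and a 4-bit value are combined by hand and stored with `encode_u16()`. The variable names and values are made up for illustration.

```gdscript
extends Node

func _ready() -> void:
	var bytes := PackedByteArray()
	bytes.resize(2)  # encode_u16() needs two bytes already allocated

	var id: int = 0xABC  # hypothetical 12-bit field
	var kind: int = 0x5  # hypothetical 4-bit field

	# Pack both fields into one 16-bit value: id in bits 0-11, kind in bits 12-15.
	bytes.encode_u16(0, (id & 0xFFF) | ((kind & 0xF) << 12))

	# Unpack again with the matching shifts and masks.
	var packed: int = bytes.decode_u16(0)
	print(packed & 0xFFF)        # 2748 (0xABC)
	print((packed >> 12) & 0xF)  # 5
```

The masks and shifts still have to be written and kept in sync by hand, and the fields must fit inside one byte-aligned integer.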

BtheDestroyer commented 1 year ago

Yes, and while those are undoubtedly useful and provide the functionality most people will want, I'm suggesting an even more fine-grained set of functionality to operate both on a more arbitrary number of bits (maybe I have a 4-bit int or a 12-bit int or an array of 1-bit bools) and with a bit-resolution offset (store starting at byte 3, bit 6).

As stated, this can be done manually (encoding booleans as bit flags within a uint is an easy example), but doing so takes a considerable amount of code and care.
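The manual approach mentioned above might look like the following sketch, which stores up to eight booleans in a single byte. `pack_flags` and `unpack_flags` are hypothetical helper names, and `flags[0]` is assumed to map to the least-significant bit.

```gdscript
extends Node

# Packs up to 8 bools into one integer, flags[0] in the least-significant bit.
func pack_flags(flags: Array[bool]) -> int:
	var value: int = 0
	for i in range(mini(flags.size(), 8)):
		if flags[i]:
			value |= 1 << i
	return value

# Reads `count` bools back out of `value`, least-significant bit first.
func unpack_flags(value: int, count: int) -> Array[bool]:
	var flags: Array[bool] = []
	for i in range(count):
		flags.append(((value >> i) & 1) == 1)
	return flags

func _ready() -> void:
	var flags: Array[bool] = [true, false, true, true]
	var bytes := PackedByteArray()
	bytes.resize(1)
	bytes.encode_u8(0, pack_flags(flags))           # stores 0x0D
	print(unpack_flags(bytes.decode_u8(0), 4))
```

This covers the aligned single-byte case only; flags that start mid-byte or span several bytes need the extra offset arithmetic the proposal aims to hide.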