Proposal: Bit field extract/deposit APIs on BitOperations

alexrp commented 3 years ago

Background and Motivation

When dealing with low-level file formats, network protocols, instruction encodings, and other such things, it's common to need to extract and deposit bit fields of various sizes. Writing generic versions of these operations (i.e. without a constant mask) can be fairly error-prone. Also, there is room for exploiting certain architecture-specific intrinsics (e.g. x86 BMI) similar to what the current methods on BitOperations do.

Proposed API

 namespace System.Numerics
 {
     public static class BitOperations
     {
+        public static uint GetUnsignedBitRange(uint value, int index, int count);
+        public static ulong GetUnsignedBitRange(ulong value, int index, int count);
+        public static nuint GetUnsignedBitRange(nuint value, int index, int count);
+
+        public static int GetSignedBitRange(uint value, int index, int count);
+        public static long GetSignedBitRange(ulong value, int index, int count);
+        public static nint GetSignedBitRange(nuint value, int index, int count);
+
+        public static uint SetBitRange(uint value1, uint value2, int index, int count);
+        public static ulong SetBitRange(ulong value1, ulong value2, int index, int count);
+        public static nuint SetBitRange(nuint value1, nuint value2, int index, int count);
     }
 }

(I'm not necessarily married to these method names.)

Acceptable `count` Values

A design choice has to be made with regards to count=0 and count=32 (for uint) / count=64 (for ulong). You generally only get to support one or the other in branchless code. There are 4 options that I see:

Leave both cases undefined.
Handle count=0 properly.
Handle count=32 / count=64 properly.
Handle both properly and pay the branch cost.

I think (4) is probably a non-starter, at least until the JIT gets full cmov support. Between the other 3 options, I have no strong preference, but would probably lean towards (3) for no particular reason other than it feels right and it's what other projects do. (I can't think of any real use cases for such 'weird' count values anyway.)

Usage Examples

One of my projects is full of code looking like this:

using static Bits;

public sealed class LoadDisplacementInstruction : Instruction
{
    public Size Size { get; set; }

    public int Immediate { get; set; }

    public uint SourceRegister { get; set; }

    public uint DestinationRegister { get; set; }

    public override void Decode(uint value)
    {
        Size = (Size)ExtractUnsignedField(Value, 5, 2);
        Immediate = ExtractSignedField(Value, 7, 3);
        SourceRegister = ExtractUnsignedField(Value, 10, 3);
        DestinationRegister = ExtractUnsignedField(Value, 13, 3);
    }

    public override void Encode(ref uint value)
    {
        value = DepositField(value, 0b00100, 0, 5);
        value = DepositField(value, (uint)Size, 5, 2);
        value = DepositField(value, (uint)Immediate, 7, 3);
        value = DepositField(value, SourceRegister, 10, 3);
        value = DepositField(value, DestinationRegister, 13, 3);
    }
}

Alternative Designs

The API proposed here is loosely based on the functions in QEMU: https://git.qemu.org/?p=qemu.git;a=blob;f=include/qemu/bitops.h;h=3acbf3384c63ce4f7d2a98db962ae999de3b2b63;hb=HEAD

Risks

None?

Potential Implementation

A basic implementation could look something like this:

public static class BitOperations
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    static uint Mask32(int count)
    {
        return uint.MaxValue >> sizeof(uint) * 8 - count;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    static ulong Mask64(int count)
    {
        return ulong.MaxValue >> sizeof(ulong) * 8 - count;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint ExtractUnsignedField(uint value, int index, int count)
    {
        return value >> index & Mask32(count);
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static ulong ExtractUnsignedField(ulong value, int index, int count)
    {
        return value >> index & Mask64(count);
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int ExtractSignedField(uint value, int index, int count)
    {
        return (int)(value << sizeof(uint) * 8 - count - index) >> sizeof(uint) * 8 - count;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static long ExtractSignedField(ulong value, int index, int count)
    {
        return (long)(value << sizeof(ulong) * 8 - count - index) >> sizeof(ulong) * 8 - count;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static uint DepositField(uint value1, uint value2, int index, int count)
    {
        var mask = Mask32(count);

        return value1 & ~(mask << index) | (value2 & mask) << index;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static ulong DepositField(ulong value1, ulong value2, int index, int count)
    {
        var mask = Mask64(count);

        return value1 & ~(mask << index) | (value2 & mask) << index;
    }
}

(This implementation chooses option (3) for count values.)

dotnet-issue-labeler[bot] commented 3 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost commented 3 years ago

Tagging subscribers to this area: @tannergooding, @pgovind See info in area-owners.md if you want to be subscribed.

Issue Details

## Background and Motivation When dealing with low-level file formats, network protocols, instruction encodings, and other such things, it's common to need to extract and deposit bit fields of various sizes. Writing generic versions of these operations (i.e. without a constant mask) can be fairly error-prone. Also, there is room for exploiting certain architecture-specific intrinsics (e.g. x86 BMI) similar to what the current methods on `BitOperations` do. ## Proposed API ```diff namespace System.Numerics { public static class BitOperations { + public static uint AndNot(uint value1, uint value2); + public static ulong AndNot(ulong value1, ulong value2); + + public static uint Extract(uint value, int index, int count); + public static ulong Extract(ulong value, int index, int count); + + public static int SignedExtract(uint value, int index, int count); + public static long SignedExtract(ulong value, int index, int count); + + public static uint Deposit(uint value1, uint value2, int index, int count); + public static ulong Deposit(ulong value1, ulong value2, int index, int count); } } ``` (I'm not necessarily married to these method names.) ## `AndNot` Method This method is useful in the implementation of the other methods and also more broadly for bit operations. It should use the BMI `andn` instruction where available and take care of the pitfall where that instruction expects its operands in the opposite order (i.e. `BitOperations.AndNot(a, b)` -> `Bmi1.AndNot(b, a)`). ## Acceptable `count` Values A design choice has to be made with regards to `count=0` and `count=32` (for `uint`) / `count=64` (for `ulong`). You generally only get to support one or the other in branchless code. There are 4 options that I see: 1. Leave both cases undefined. 2. Handle `count=0` properly. 3. Handle `count=32` / `count=64` properly. 4. Handle both properly and pay the branch cost. I think (4) is probably a non-starter, at least until the JIT gets full `cmov` support. Between the other 3 options, I have no strong preference, but would probably lean towards (3) for no particular reason other than it feels right and it's what other projects do. (I can't think of any real use cases for such 'weird' `count` values anyway.) ## Usage Examples One of my projects is full of code looking like this: ```csharp using static Bits; public sealed class LoadDisplacementInstruction : Instruction { public Size Size { get; set; } public int Immediate { get; set; } public uint SourceRegister { get; set; } public uint DestinationRegister { get; set; } public override void Decode(uint value) { Size = (Size)ExtractField(Value, 5, 2); Immediate = SignedExtractField(Value, 7, 3); SourceRegister = ExtractField(Value, 10, 3); DestinationRegister = ExtractField(Value, 13, 3); } public override void Encode(ref uint value) { value = DepositField(value, 0b00100, 0, 5); value = DepositField(value, (uint)Size, 5, 2); value = DepositField(value, (uint)Immediate, 7, 3); value = DepositField(value, SourceRegister, 10, 3); value = DepositField(value, DestinationRegister, 13, 3); } } ``` ## Alternative Designs The API proposed here is loosely based on the functions in QEMU: https://git.qemu.org/?p=qemu.git;a=blob;f=include/qemu/bitops.h;h=3acbf3384c63ce4f7d2a98db962ae999de3b2b63;hb=HEAD ## Risks None. ## Potential Implementation A basic implementation could look something like this: ```csharp public static class BitOperations { [MethodImpl(MethodImplOptions.AggressiveInlining)] public static uint AndNot(uint value1, uint value2) { return Bmi1.IsSupported ? Bmi1.AndNot(value2, value1) : value1 & ~value2; } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static ulong AndNot(ulong value1, ulong value2) { return Bmi1.X64.IsSupported ? Bmi1.X64.AndNot(value2, value1) : value1 & ~value2; } [MethodImpl(MethodImplOptions.AggressiveInlining)] static uint Mask32(int count) { return uint.MaxValue >> sizeof(uint) * 8 - count; } [MethodImpl(MethodImplOptions.AggressiveInlining)] static ulong Mask64(int count) { return ulong.MaxValue >> sizeof(ulong) * 8 - count; } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static uint ExtractField(uint value, int index, int count) { return Bmi1.IsSupported ? Bmi1.BitFieldExtract(value, (byte)index, (byte)count) : value >> index & Mask32(count); } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static ulong ExtractField(ulong value, int index, int count) { return Bmi1.X64.IsSupported ? Bmi1.X64.BitFieldExtract(value, (byte)index, (byte)count) : value >> index & Mask64(count); } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static int ExtractSignedField(uint value, int index, int count) { return (int)(value << sizeof(uint) * 8 - count - index) >> sizeof(uint) * 8 - count; } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static long ExtractSignedField(ulong value, int index, int count) { return (long)(value << sizeof(ulong) * 8 - count - index) >> sizeof(ulong) * 8 - count; } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static uint DepositField(uint value1, uint value2, int index, int count) { var mask = Mask32(count); return AndNot(value1, mask << index) | (value2 & mask) << index; } [MethodImpl(MethodImplOptions.AggressiveInlining)] public static ulong DepositField(ulong value1, ulong value2, int index, int count) { var mask = Mask64(count); return AndNot(value1, mask << index) | (value2 & mask) << index; } } ``` (There's probably a lot more room for optimization and architecture-specific intrinsics here.)

Author:	alexrp
Assignees:	-
Labels:	`api-suggestion`, `area-System.Numerics`, `untriaged`
Milestone:	-

GrabYourPitchforks commented 3 years ago

Just my two cents, I would much rather write a & ~b than BitOperations.AndNot(a, b). That really seems like something the JIT should be optimizing on the dev's behalf based on the current machine's capabilities.

tannergooding commented 3 years ago

I agree. Particularly for simple arithmetic like a & ~b where there is basically one way to right it, I think the JIT recognizing and optimizing the expression is better.

It would be good to restrict BitOperations to more complex cases where the patterns are many or particularly complex, such as Rotate or http://graphics.stanford.edu/~seander/bithacks.html

En3Tho commented 3 years ago

I recall Erlang has a special type for working with bits. Do we need something like that in .Net? This api is good but still you have to do it manually. A special syntax/model would be cleaner I guess. Here is the thing I'm talking about: https://erlang.org/doc/programming_examples/bit_syntax.html

alexrp commented 3 years ago

Just my two cents, I would much rather write a & ~b than BitOperations.AndNot(a, b). That really seems like something the JIT should be optimizing on the dev's behalf based on the current machine's capabilities.

No objections from me there. In fact, I'm going to scrap the AndNot part of this proposal anyway because further investigation on my end revealed what should probably have been obvious: It severely inhibits the JIT's ability to do even simple optimizations like constant folding to the point where there's probably no gain to be had from using it manually.

alexrp commented 3 years ago

I recall Erlang has a special type for working with bits. Do we need something like that in .Net? This api is good but still you have to do it manually. A special syntax/model would be cleaner I guess. Here is the thing I'm talking about: https://erlang.org/doc/programming_examples/bit_syntax.html

There is a discussion over on csharplang for something along the lines of that, though it's more like C bit fields than Erlang bit syntax: https://github.com/dotnet/csharplang/discussions/465

Regardless, it would be good to have these primitives in place.

alexrp commented 3 years ago

@tannergooding what do you think about the API as currently proposed?

tannergooding commented 3 years ago

@alexrp:

I'm not a fan of Field. I think ConsecutiveBits or similar (something containing Bits) would be clearer
I'm not a fan of Deposit. Insert or Replace are the terms that immediately come to mind as alternatives
I think we should bring along nint/nuint overloads at the same time.
I'd like to think more on if we can easily handle 0 and 32 via a few correct shifts

alexrp commented 3 years ago

I've updated the API shape per our discussion on Discord.

alexrp commented 2 years ago

What needs to happen next to move this forward?

dotnet / runtime