dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.97k stars 4.66k forks source link

[API Proposal]: Add custom polynomial support for Crc32 and Crc64 class #78063

Open medo64 opened 1 year ago

medo64 commented 1 year ago

Background and motivation

There are many variants of CRC32 algorithm that differ only in polynomials used. The current implementation only supports IEEE 802.3 polynomial externally. It would be ideal if class would allow for defining custom polynomial thus supporting other CRC-32 variants (e.g. ISCSI, MPEG-2, etc.)

Looking at source, there is a Generate method in Crc32ReflectedTable so basic plumbing for custom polynomials seems to be there already.

As different polynomials use different initial and output XOR values, that would need to be added.

API Proposal

namespace namespace System.IO.Hashing

public sealed partial class Crc32 : NonCryptographicHashAlgorithm
{
    public Crc32(ReadOnlySpan<byte> reflectedPolynomial);
    public Crc32(ReadOnlySpan<byte> reflectedPolynomial,
                 ReadOnlySpan<byte> initialValue,
                 ReadOnlySpan<byte> xorValue,
                 bool reflectIn,
                 bool reflectOut);
}

API Usage

var crc = new Crc32(new byte[] { 0x8F, 0x6E, 0x37, 0xA0 });
crc.Append(Encoding.ASCII.GetBytes("123456789"));
var result = crc.GetCurrentHash();

Alternative Designs

No response

Risks

No response

dotnet-issue-labeler[bot] commented 1 year ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

teo-tsirpanis commented 1 year ago

The polynomial values should be integers, not spans.

medo64 commented 1 year ago

The polynomial values should be integers, not spans.

I tend to agree with you but I proposed it to be spans for the following two reasons:

  1. I believe the proper type would be uint and unsigned integers used to be frowned upon in the public API

  2. Most of functions that output hash (e.g. GetCurrentHash or GetHashAndReset) actually use spans for hash value. One function that doesn't follow that rule is GetCurrentHash that returns bytes. Since existing class/interface already uses spans for hash outputs, it seemed appropriate to accept polynomial in the same venue.

teo-tsirpanis commented 1 year ago

unsigned integers used to be frowned upon in the public API

Used to be. Not anymore. You might think of CLS compliance, which is a concept no longer relevant.

The disadvantage of accepting spans is that it falsely communicates to the user that they can pass a span of any length, while only spans of four or eight bytes are allowed.