Closed tannergooding closed 1 year ago
@tannergooding has done a good job of representing the viewpoints of those of us who maintain the .NET APIs. I understand not everyone is happy with those viewpoints, but at this point no additional arguments are being presented and the discussion has run its course. I'll mark this as api-ready-for-review again so that we can discuss the specifics of this proposal again at the next API review meeting opportunity and decide whether to add the proposed APIs or a subset of them or none at all. Thanks to all who shared their perspectives on the matter.
no additional arguments are being presented
Well, this discussion is now mostly an essay exchange, with timezone differences, job and sleep it takes a lot of time to produce another argument, so it mostly halted to a stop. Though
discussion has run its course
is probably correct. For single post here, there's one weighted and thought-out reply by @tannergooding here, but there are ten giggles and insults from the crowd on Discord as well. This is no longer maintainable. So, I'll address the latest arguments and will eagerly await this issue to be closed.
Bool parameters exist and are used in a plethora of places
At the cost of readability. Consider the aforementioned BigInteger:
var bytes = bigInt.ToByteArray(true, false); // is it signed little-endian? unsigned big-endian?
It's no surprise in the fact top search results for "bool method parameter" are generally negative, e.g., "What is wrong with boolean parameters?", "Is it wrong to use a boolean parameter to determine behavior", "Clean code: The curse of a boolean parameter", "Do Boolean Parameters Make You a Bad Programmer?" From the top of my head, there are three general scenarios of using boolean parameters where readability doesn't suffer:
Dispose(true)
is fine as long as Dispose(bool disposing)
or one of its overrides is presentSetIsActive(true)
guid.ToByteArray(isBigEndian: true)
Proposed overloads are to be used outside Guid.cs
file, so it's not the first case. Second case is largely absent in C# - due to property syntax availability. For third case, there are named parameter syntax in C#, but you can't actually enforce the usage (Though creating analyzer for such scenarios is a good idea). Inlay hints with parameter names alleviate the issue, though they are present only in IDE.
RFC-4122 explicitly defines itself as a sequence of named fields Per the RFC, the field layout is the same between variant 1 (0b10x) and variant 2 (0b110)
Even the original RFC-4122 document from 2005 claims to describe only variant 1, and even specifically requires big-endianness by stating in the same section:
The fields are encoded as 16 octets, with the sizes and order of the fields defined above, and with each field encoded with the Most Significant Byte first (known as network byte order)
But newer drafts define a different structure for v7 and v8, and even redefine the field structure for versions 2-5. Compare the v1 structure
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| time_low |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| time_mid | ver | time_high |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| clock_seq | node |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| node |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
with v4
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| random_a |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| random_a | ver | random_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| random_c |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| random_c |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
or v7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unix_ts_ms |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| unix_ts_ms | ver | rand_a |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var| rand_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| rand_b |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Also, System.Guid
is not just a Variant 2
implementation. Guid.NewGuid() produces values that claims to be Variant 1
, which can be easily checked with the following code snippet:
Console.WriteLine(Convert.ToString((Guid.NewGuid().ToByteArray()[8] & 0xF0) >> 4,2))
It produces the 10xx
pattern of Variant 1
, not 110x
of Variant 2
This issue notwithstanding, I hope Runtime and BCL will be prepared for eventual popularization of UUID v6/v7/v8
As for the "Why don't you write your own library" argument. It was addressed already, actually. And these libraries already exist. The problem is with integrating these libraries with other libraries. In .NET Community there's a general preconception against taking a hard dependency on another third-party library (fueled not only by recent drama with IdentityServer of ImageSharp, but the whole 20 years of history with open source in .NET circles). The simple, blittable (big-endian. Machines are generally little-endian, yes, but network is generally not. At least middle-endian is dead and forgotten) primitive type in BCL would help a lot in easing cross-integration and operation between these libraries. System.Uuid
is the easiest one to implement.
If it is not to be, there are alternatives, but they are much farther away, beyond the horizon. That is, support for configurable size value type fixed buffers,
public struct Binary<int n>
{
byte[n] _buffer;
}
with the support from EntityFrameworkCore in mapping these types (which would promote support for these types in other libraries) would, I hope, alleviate all the issues current UUID libraries have.
Before everything is completed, I would like to highlight a few points. Developers have been facing the non-obvious behavior of this API for a very long time. Here are a few examples:
Creating your own primitive to solve the issues described here is not difficult. Moreover, I have been maintaining one such package myself since 2019. As correctly noted above, the problem is integrating it somewhere. As a developer, it disappoints me that I have to solve such issues and related problems myself. Primitives are the work that BCL should take on. In any other popular technology stack, there is no such issue at all simply because their roots do not go deep into Windows and its API.
Made quick search in Githib of Guid.ToByteArray
usage and there are 3 common usage:
ToByteArray
returns.ToByteArray
and new Guid(byte[])
is enought.Also, some libs use own serialization mechanism for Guid npgsql.
I would like to suggest do not change Guid api to not affect case 1, 2 and add only proposed BinaryPrimitives
api to provide standardized api for Guid ->Big/Little endian conversion that can be consumed in case 3 or replace own serialization mechanizm. Also, BinaryPrimitives
distributes via nuget System.Memory and user can use new api without additional preprocessor conditions link.
is
prefixes from the isBigEndian
parametersGuid..ctor(byte[], bool)
and think it's OK to leave it off.namespace System
{
public partial struct Guid
{
// public Guid(byte[] value);
// public Guid(ReadOnlySpan<byte> value);
public Guid(ReadOnlySpan<byte> value, bool bigEndian);
// public byte[] ToByteArray();
public byte[] ToByteArray(bool bigEndian);
// public bool TryWriteBytes(Span<byte> destination);
// public bool TryWriteBytes(Span<byte> destination, out int bytesWritten); -- new in .NET 8
public bool TryWriteBytes(Span<byte> destination, bool bigEndian, out int bytesWritten);
}
}
Can you assign this issue to me?
public bool TryWriteBytes(Span<byte> destination, out int bytesWritten); -- new in .NET 8
does not exist yet.
Should it be created also?
Should this supersede #53354 ?
Should this supersede #53354 ?
Yes. #53354 can be closed.
Rationale
System.Guid
represents .NETs support forGlobally Unique Identifiers
orGUIDs
(sometimes also referred to asUniversally Unique Identifiers
orUUIDs
).This type represents a 128-bit value in the general format of
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
where eachx
represents a hexadecimal digit, where the 4 bits ofM
represent the version and the 4 bits ofN
represent the variant number. This is sometimes referred to as the8-4-4-4-12
format string. And while the string representation is "well-defined", the actual underlying order of these bytes has a few different representation and there are several variants of the general RFC 4122 that may require a specific ordering or even may limit specific bytes to have a particular meaning..NET's
System.Guid
follows a field layout best matchingvariant 2
which is identical tovariant 1
outside the endianness. In particular,variant 1
is "big endian" andvariant 2
is "little endian". Variant 1 and 2 are used by the current UUID specification and are by far the most prominent while variant 0 is largely considered deprecated. Outside of the endianness these variants are differented by minor bit pattern requirements.Given this is largely an endianness difference and is otherwise just a minor difference in the bits used for
M
andN
, we would prefer to not introduce a new type just to handle this and would instead prefer to introduce explicit APIs and overloads onGuid
that help identify and handle these differences.Proposed APIs
Drawbacks
As discussed on https://github.com/dotnet/runtime/issues/86084, there is a general concern that users may not be aware that these other overloads exist -or- may not be aware that the difference between variant 1 and variant 2 is endianness and that .NET defaults to variant 2.
However, the same general considerations exists from exposing a new type such as
System.Uuid
. There are then additional considerations on top in that it further bifurcates the type system, doesn't easily allow polyfilling the support downlevel without shipping a new OOB package, and may further confuse users due to the frequent interchange of theGUID
andUUID
terminology.After discussion with a few other API review team members, the general consensus was that shipping a new type is undesirable and we should prefer fixing this via new APIs/overloads and potentially looking into additional ways to surface the difference to users (such as analyzers, API documentation, etc).
Additional Considerations
Given the above, we may want to consider how to help point users towards their desired APIs given the overloads on
Guid
that do not require specifying endianness.We can clearly update the documentation, but an analyzer seems like a desired addition that can help point devs towards specifying the endianness explicitly. Obsoleting the existing overloads was also proposed, but may be undesirable since the current behavior isn't "wrong", it just may be the undesired behavior in some scenarios.
We may also want to consider whether a
static Guid NewGuid()
overload that allows conforming toVersion 4, Variant 1
is desired. The docs only indicate it is version 4 and calls into the underlying System APIs. It does not indicate if it producesVariant 1
,Variant 2
, or truly random bits forN
.