Fix Randomized Node value's mcast bit in Appendix

kyzer-davis commented 1 year ago

I split this from #140. While reviewing the following statement, I think I found a mistake in my test vector. Would somebody double-check?

Section 6.9: The discussion of the local/global bit here would be a good place to reference RFC 7042 or, even better, draft-ietf-intarea-ref4042bis.

https://datatracker.ietf.org/doc/html/rfc7042 Note: I am also not sure where/what to cite from this document.

Draft-11 Text:

Implementations MUST set the least significant bit of the first octet of the node ID set to 1, to create a 48 bit node id. This bit is the unicast/multicast bit, which will never be set in IEEE 802 addresses obtained from network cards. Hence, there can never be a conflict between UUIDs generated by machines with and without network cards. For compatibility with earlier specifications, note that this document uses the unicast/multicast bit, instead of the arguably more correct local/global bit because MAC addresses with the local/global bit set or not are both possible in a network. This is not the case with the unicast/multicast bit. One node cannot have a MAC address that multicasts to multiple nodes.

Could somebody double check me here: For the appendix examples, is 0x9E6BDECED846 okay for node or should I change to 0x9F6BDECED846?

9E = 10011110
9F = 10011111

tl;dr, the hex value should be x3, x7, xB, xF if that bit is set to 1 right? Bottom right square of this table: https://en.wikipedia.org/wiki/MAC_address#Ranges_of_group_and_locally_administered_addresses
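A quick way to check the question above is to read the bit directly. This is only a sketch: `multicast_bit` is a hypothetical helper name, and it assumes the node is held as a 48-bit integer with octet 0 as the most significant byte. Under that reading, every odd second hex digit has the bit set; x3, x7, xB and xF are the odd values that also have the local/global bit set.

```python
def multicast_bit(node: int) -> int:
    """Return the unicast/multicast bit of a 48-bit node value (hypothetical helper)."""
    # Octet 0 is the top 8 bits of the node, so its least significant bit is bit 40.
    return (node >> 40) & 1


for node in (0x9E6BDECED846, 0x9F6BDECED846):
    print(f"{node:012X}: multicast bit = {multicast_bit(node)}")

# Second hex digits of the first octet whose lowest bit is 1.
odd_nibbles = [f"x{n:X}" for n in range(16) if n & 1]
print(odd_nibbles)
```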

Personal Note: This section of RFC4122 has always confused me. I am open to change proposals that make those two paragraphs easier to understand by implementors. They currently stand as mostly the same from 4122 in draft-11.

danielmarschall commented 1 year ago

Could somebody double check me here: For the appendix examples, is 0x9E6BDECED846 okay for node or should I change to 0x9F6BDECED846?

0x9E6BDECED846 = SAI (Unicast) https://misc.daniel-marschall.de/tools/uuid_mac_decoder/interprete_mac.php?mac=9E%3A6B%3ADE%3ACE%3AD8%3A46

0x9F6BDECED846 = SAI (Multicast) https://misc.daniel-marschall.de/tools/uuid_mac_decoder/interprete_mac.php?mac=9F%3A6B%3ADE%3ACE%3AD8%3A46

It turns out that your example is a Standard Assigned Identifier (SAI), which only IEEE may use.

The correct "legal" way would be an Administratively Assigned Identifier (AAI), which has the second nibble set to x2 (Unicast) or x3 (Multicast).

0x926BDECED846 = AAI (Unicast) https://misc.daniel-marschall.de/tools/uuid_mac_decoder/interprete_mac.php?mac=92%3A6B%3ADE%3ACE%3AD8%3A46

0x936BDECED846 = AAI (Multicast) https://misc.daniel-marschall.de/tools/uuid_mac_decoder/interprete_mac.php?mac=93%3A6B%3ADE%3ACE%3AD8%3A46
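The four classifications above can be reproduced with a rough sketch like the one below. `describe_node` is a hypothetical name, not the decoder tool's code. The AAI and SAI rows match the values listed in this comment; the ELI and Reserved rows are my assumption from the IEEE 802c quadrant layout and are not confirmed anywhere in this thread.

```python
def describe_node(node: int) -> str:
    """Classify the first octet of a 48-bit node value (hypothetical helper)."""
    first = (node >> 40) & 0xFF
    cast = "Multicast" if first & 0x01 else "Unicast"
    if not first & 0x02:
        # Local/global bit clear: a universally administered address.
        return f"Universal ({cast})"
    # Bits 3 and 2 of the first octet select the local-address quadrant.
    # AAI and SAI follow the values quoted in the thread; ELI and Reserved
    # are assumed from IEEE 802c.
    quadrant = {0b00: "AAI", 0b01: "Reserved", 0b10: "ELI", 0b11: "SAI"}
    return f"{quadrant[(first >> 2) & 0b11]} ({cast})"


for node in (0x9E6BDECED846, 0x9F6BDECED846, 0x926BDECED846, 0x936BDECED846):
    print(f"0x{node:012X} = {describe_node(node)}")
```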

danielmarschall commented 1 year ago

Personal Note: This section of RFC4122 has always confused me. I am open to change proposals that make those two paragraphs easier to understand by implementors. They currently stand as mostly the same from 4122 in draft-11.

@kyzer-davis What exactly are you confused about in this section? Maybe I can help simplify it.

kyzer-davis commented 1 year ago

Okay, I modified _random_getnode() over on Python's UUIDv1 Implementation so it always uses random. Line: https://gist.github.com/kyzer-davis/45cd2815b6c0bb9861a2a4f7de6d798a#file-uuid-py-L705

I ran 40 or so UUIDv1 generations and they always had a value of 1 in that multicast bit section which was the entire bottom row of that Wikipedia table. Values x1, x3, x5, x7, x9, xB, xD, xF at the least significant bit of the first octet in the node. Testing File: https://gist.github.com/kyzer-davis/45cd2815b6c0bb9861a2a4f7de6d798a#file-testing-txt Note: I also printed the before/after modification so I could see if there was a change to the random data.

I then spoofed the random.getrandbits call to return my integer value for 9E6BDECED846, which is 174186136787014. The result was then 174186136787014 | (1 << 40), which evaluates to 175285648414790. Converted to hex, the second digit was in fact xF: 9F6BDECED846. Line: https://gist.github.com/kyzer-davis/45cd2815b6c0bb9861a2a4f7de6d798a#file-uuid-py-L613

>>> import uuid
>>> print(uuid.uuid1())
174186136787014
175285648414790
2aa8994f-588a-11ee-acef-9f6bdeced846
>>>
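The arithmetic above can be reproduced without patching the uuid module. This is a minimal sketch, assuming the random node is built by OR-ing bit 40 into a 48-bit random number as described; `random_node` and `MULTICAST_BIT` are hypothetical names, not the standard library's.

```python
import random
import uuid

MULTICAST_BIT = 1 << 40  # least significant bit of the first octet of the node


def random_node(rand48=None):
    """Return a 48-bit node with the multicast bit set (hypothetical helper)."""
    if rand48 is None:
        rand48 = random.getrandbits(48)
    return rand48 | MULTICAST_BIT


start = 0x9E6BDECED846
after = random_node(start)
print(start)             # integer value before the modification
print(after)             # integer value after the modification
print(f"{after:012x}")   # hex form of the modified node

# The node field of the UUID printed above carries the modified value.
sample = uuid.UUID("2aa8994f-588a-11ee-acef-9f6bdeced846")
print(sample.node == after)
```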

I whipped up a proposal below. The text addition to Appendix C aims to make this bit modification clearer. Then I can cite it in the earlier text as a helper. Edit: added a citation to RFC7042 like the IESG wanted. Edit 2: forgot some modifications.

informative:
  RFC7042: RFC7042

## UUIDs That Do Not Identify the Host {#unidentifiable}
[..truncated..]
This bit is the unicast/multicast bit, which will never be set in IEEE 802
addresses obtained from network cards.  Hence, there can never be a
conflict between UUIDs generated by machines with and without network
cards. 
An example of this modification can be observed in {{test_vectors}} appendix.
For more information about IEEE 802 address and the unicast/multicast or local/global bits please review {{RFC7042}}.

# Test Vectors {#test_vectors}
[..truncated..]
Both UUIDv1 and UUIDv6 utilize the same values in clock_seq, and node. 
All of which have been generated with random data. 
For the randomized node, the least significant bit of the first octet is set to a value of 1 as per {{unidentifiable}}. 
Thus the starting value 0x9E6BDECED846 was changed to 0x9F6BDECED846. 
Figure {{randomizedNodeModify}} details the bit position and conversion of this bit where X is the value of the bit being modified from 0 to 1.

~~~~
Octet:      0        1        2        3        4        5 
Modify Bit: -------X -------- -------- -------- -------- --------
Start:      10011110 01101011 11011110 11001110 11011000 01000110
After:      10011111 01101011 11011110 11001110 11011000 01000110
~~~~
{: id='randomizedNodeModify' title='Example Bit Modification of Randomized Node'}

## Example of a UUIDv1 Value {#uuidv1_example}
[..truncated..]
node       48   0x9F6BDECED846

## Example of a UUIDv6 Value {#uuidv6_example}
[..truncated..]
node        48   0x9F6BDECED846

Update: starting octet to 0
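For anyone who wants to regenerate the Start/After rows of the bit table in the proposal, here is a small sketch. `octets_binary` is a hypothetical helper; it assumes the node is a 48-bit integer written out big-endian, with octet 0 first.

```python
def octets_binary(node: int) -> str:
    """Format a 48-bit node as six space-separated binary octets (hypothetical helper)."""
    return " ".join(f"{octet:08b}" for octet in node.to_bytes(6, "big"))


start = 0x9E6BDECED846
after = start | (1 << 40)  # set the least significant bit of octet 0

print("Start:", octets_binary(start))
print("After:", octets_binary(after))
```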

LiosK commented 1 year ago

Personally, I feel the most confusing thing in Section 6.9 is the word "low" in:

Implementations obtain a 47 bit cryptographic-quality random number as per Section 6.8 and use it as the low 47 bits of the node ID.

It's from the original RFC 4122, but still I don't really understand it.

The C code in RFC 4122 suggests that 0x9E6BDECED846 should be 0x9F6BDECED846 for v1. But at the same time, I'm afraid that emphasizing the unicast/multicast bit too much in the test vector might mislead readers into thinking that the unicast/multicast bit must be set to 1 for v6, too. My understanding is that the primary option for v6 is a fully random node ID that doesn't care about the unicast/multicast bit.

Is it possible to use 0x9E6BDECED846 for v6 and 0x9F6BDECED846 for v1? Saying like: "Both UUIDv1 and UUIDv6 utilize the same random number, 0x9E6BDECED846, for node, but for UUIDv1 the least significant bit of the first octet is set to a value of 1 to illustrate Section 6.9."
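To make the suggestion concrete, here is a sketch of how the two node values would be derived from the same random number. The variable names are hypothetical, and this reflects only the reading proposed in this comment, not settled draft text.

```python
rand48 = 0x9E6BDECED846  # shared 48-bit random number from the test vector

# v6 under this proposal: use the random number as-is.
node_v6 = rand48

# v1 under this proposal: set the least significant bit of the first octet.
node_v1 = rand48 | (1 << 40)

print(f"v6 node: 0x{node_v6:012X}")
print(f"v1 node: 0x{node_v1:012X}")
```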

Nits: The octet number in the table should start from zero to follow the convention in the document or from 10 to indicate the actual location in the 128-bit space.

kyzer-davis commented 1 year ago

Personally, I feel the most confusing thing in Section 6.9 is the word "low" in:

Yeah, and Python just generates 48 bits and then sets the one that matters... I think I can edit that text to say the same. This is in line with how it is done in v4, e.g., generate 128 bits and change the 6 bits that matter.

v1/v6 follow same rules on Node ID so they both adhere to 6.9 for random. More specifically that bit.

As for over-emphasizing, I get it. I can skip that diagram. I was hoping to help clarify exactly where but if the text is clear to others (minus some minor changes proposed here.) I can simply skip all that fluff and just fix the value.

As for 0 vs 1. I updated the proposed text to start at 0.

kyzer-davis commented 1 year ago

Okay, I fixed this in https://github.com/ietf-wg-uuidrev/rfc4122bis/pull/152/commits/db976a54fe261176502f8ed39891b21afc1d432c

Note: I opted to drop my ASCII bit table. I think the bit modification is sufficiently covered now (and fixes my error in the test vector) and cites the RFC the IESG wanted.

LiosK commented 1 year ago

v1/v6 follow same rules on Node ID so they both adhere to 6.9 for random. More specifically that bit.

Though I don't have a strong opinion on the test vectors, using 0x9E6BDECED846 for v6 will help clarify Section 5.6. It reads v6 SHOULD use full 48-bit random refreshed every time for node but MAY follow Section 5.1 (and 6.9 accordingly). So v6 MAY follow Section 6.9 but SHOULD depart from it, right?

kyzer-davis commented 1 year ago

@LiosK, that interpretation is correct as the text currently reads in Section 5.6 of draft-11 (and the current draft-12).

ietf-wg-uuidrev / rfc4122bis

Fix Randomized Node value's mcast bit in Appendix #151