ietf-wg-uuidrev / rfc4122bis

revision to RFC4122

Alternative proposal for Hashspace ID Values #143

Closed kyzer-davis closed 11 months ago

kyzer-davis commented 1 year ago

From Paul Wouters

Why are hashspace IDs chosen to look like random uuids? Eg why not encode
"SHA2_224" (hex 534841325F323234) as 53484132-5F32-3234-0000-000000000000
or 00000000-0000-0000-5348-41325F323234 (plus or minus variant/version bits)
so that it becomes far more clear this is not an ordinary random uuid?
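
A minimal sketch of the encoding Paul describes, assuming the label is ASCII-encoded and zero-padded to 128 bits. The helper name `label_to_uuid` and its `pad_left` flag are made up for illustration, and the version/variant bits are deliberately left untouched:

```python
import uuid


def label_to_uuid(label: str, pad_left: bool = False) -> uuid.UUID:
    """Embed the ASCII bytes of a label in a 128-bit UUID, zero-padded."""
    raw = label.encode("ascii")
    if len(raw) > 16:
        raise ValueError("label does not fit into 128 bits")
    # Pad with zero bytes on the chosen side; version/variant bits are NOT set here.
    padded = raw.rjust(16, b"\x00") if pad_left else raw.ljust(16, b"\x00")
    return uuid.UUID(bytes=padded)


print(label_to_uuid("SHA2_224"))                 # label in the high bits
print(label_to_uuid("SHA2_224", pad_left=True))  # label in the low bits
```

Python prints the hex digits in lowercase; apart from that, the two values correspond to the two layouts quoted above.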

Current, All random

SHA2_224     = "59031ca3-fbdb-47fb-9f6c-0f30e2e83145"
SHA2_256     = "3fb32780-953c-4464-9cfd-e85dbbe9843d"
SHA2_384     = "e6800581-f333-484b-8778-601ff2b58da8"
SHA2_512     = "0fde22f2-e7ba-4fd1-9753-9c2ea88fa3f9"
SHA2_512_224 = "003c2038-c4fe-4b95-a672-0c26c1b79542"
SHA2_512_256 = "9475ad00-3769-4c07-9642-5e7383732306"
SHA3_224     = "9768761f-ac5a-419e-a180-7ca239e8025a"
SHA3_256     = "2034d66b-4047-4553-8f80-70e593176877"
SHA3_384     = "872fb339-2636-4bdd-bda6-b6dc2a82b1b3"
SHA3_512     = "a4920a5d-a8a6-426c-8d14-a6cafbe64c7b"
SHAKE_128    = "7ea218f6-629a-425f-9f88-7439d63296bb"
SHAKE_256    = "2e7fc6a4-2919-4edc-b0ba-7d7062ce4f0a"

kyzer-davis commented 1 year ago

Alternative from me: Pick a random UUID to start, increment the lowest bits by 1 for each.

The logic follows the namespace IDs: a specific UUIDv1, 6ba7b810-9dad-11d1-80b4-00c04fd430c8, comes first and 6ba7b811-9dad-11d1-80b4-00c04fd430c8 is next; the leading 6ba7b810, 6ba7b811, ... increments for each namespace, running through 0-4 in that position.

Increment the UUID in the least significant position:

SHA2_224     = "59031ca3-fbdb-47fb-9f6c-000000000000"
SHA2_256     = "59031ca3-fbdb-47fb-9f6c-000000000001"
SHA2_384     = "59031ca3-fbdb-47fb-9f6c-000000000002"
SHA2_512     = "59031ca3-fbdb-47fb-9f6c-000000000003"
SHA2_512_224 = "59031ca3-fbdb-47fb-9f6c-000000000004"
SHA2_512_256 = "59031ca3-fbdb-47fb-9f6c-000000000005"
SHA3_224     = "59031ca3-fbdb-47fb-9f6c-000000000006"
SHA3_256     = "59031ca3-fbdb-47fb-9f6c-000000000007"
SHA3_384     = "59031ca3-fbdb-47fb-9f6c-000000000008"
SHA3_512     = "59031ca3-fbdb-47fb-9f6c-000000000009"
SHAKE_128    = "59031ca3-fbdb-47fb-9f6c-00000000000A"
SHAKE_256    = "59031ca3-fbdb-47fb-9f6c-00000000000B"
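
A minimal sketch of how such a list could be generated, assuming the increment is applied to the 128-bit integer value of the base UUID. The base value and label order are copied from the list above; the variable names are only illustrative:

```python
import uuid

# Base UUID and label order are taken from the list above.
BASE = uuid.UUID("59031ca3-fbdb-47fb-9f6c-000000000000")
LABELS = [
    "SHA2_224", "SHA2_256", "SHA2_384", "SHA2_512",
    "SHA2_512_224", "SHA2_512_256",
    "SHA3_224", "SHA3_256", "SHA3_384", "SHA3_512",
    "SHAKE_128", "SHAKE_256",
]

# Add the list index to the integer value of the base UUID.
hashspace_ids = {label: uuid.UUID(int=BASE.int + i) for i, label in enumerate(LABELS)}

for label, value in hashspace_ids.items():
    print(f"{label:<12} = {value}")
```

Python prints hex digits in lowercase, so the last two entries end in `a` and `b` rather than `A` and `B`.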

kyzer-davis commented 1 year ago

Paul's proposal (text to hex) is tough because the current hashspace ID labels are at minimum 16 and at most 24 characters long after hex encoding (see the list at the end). The version/variant bits also need to be set, which leaves no ideal position to slot these values into:

xxxxxxxx-xxxx-Mzzz-Nyyy-yyyyyyyyyyyy

Text to Hex

SHA2_224     = 534841325f323234
SHA2_256     = 534841325f323536
SHA2_384     = 534841325f333834
SHA2_512     = 534841325f353132
SHA2_512_224 = 534841325f3531325f323234
SHA2_512_256 = 534841325f3531325f323536
SHA3_224     = 534841335f323234
SHA3_256     = 534841335f323536
SHA3_384     = 534841335f333834
SHA3_512     = 534841335f353132
SHAKE_128    = 5348414b455f313238
SHAKE_256    = 5348414b455f323536

Edit: I could change the labels, e.g. remove the underscores, but it does not scale nicely unless they all come to 12-15 characters after encoding; otherwise one must navigate around the version/variant hex digits.
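
The size argument can be checked with a short sketch. It assumes ASCII encoding of the labels and, as a simplification, treats the whole variant digit `N` as reserved (the variant normally occupies only its top bits). The names `LAYOUT`, `hex_lengths`, and `free_runs` are illustrative:

```python
import re

LAYOUT = "xxxxxxxx-xxxx-Mzzz-Nyyy-yyyyyyyyyyyy"
LABELS = [
    "SHA2_224", "SHA2_256", "SHA2_384", "SHA2_512",
    "SHA2_512_224", "SHA2_512_256",
    "SHA3_224", "SHA3_256", "SHA3_384", "SHA3_512",
    "SHAKE_128", "SHAKE_256",
]

# Number of hex digits each label needs after text-to-hex encoding.
hex_lengths = {label: len(label.encode("ascii").hex()) for label in LABELS}

# Contiguous runs of hex digits not interrupted by the version (M) or variant (N) digit.
free_runs = [len(run) for run in re.split("[MN]", LAYOUT.replace("-", ""))]

print("hex digits per label:", min(hex_lengths.values()), "to", max(hex_lengths.values()))
print("free runs around M and N:", free_runs)
```

Under these assumptions the longest uninterrupted run is 15 hex digits, so none of the current labels fits without straddling the version or variant digit.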

danielmarschall commented 1 year ago

For my online service I have done the following approach:

HashSpaceUuid<AlgoName> := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", AlgoName).

which results in the following UUIDs:

-- Algorithms from this draft (revision 00-11)
-- The payload of the UUIDv5 is the algorithms name from PHP hash_algos(), as well as "shake128" and "shake256"
SHA224                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha224")      = "59031ca3-fbdb-47fb-9f6c-0f30e2e83145".
SHA256                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha256")      = "3fb32780-953c-4464-9cfd-e85dbbe9843d".
SHA384                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha384")      = "e6800581-f333-484b-8778-601ff2b58da8".
SHA512                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha512")      = "0fde22f2-e7ba-4fd1-9753-9c2ea88fa3f9".
SHA512/224             := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha512/224")  = "003c2038-c4fe-4b95-a672-0c26c1b79542".
SHA512/256             := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha512/256")  = "9475ad00-3769-4c07-9642-5e7383732306".
SHA3/224               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha3-224")    = "9768761f-ac5a-419e-a180-7ca239e8025a".
SHA3/256               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha3-256")    = "2034d66b-4047-4553-8f80-70e593176877".
SHA3/384               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha3-384")    = "872fb339-2636-4bdd-bda6-b6dc2a82b1b3".
SHA3/512               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha3-512")    = "a4920a5d-a8a6-426c-8d14-a6cafbe64c7b".
SHAKE128               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "shake128")    = "7ea218f6-629a-425f-9f88-7439d63296bb".
SHAKE256               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "shake256")    = "2e7fc6a4-2919-4edc-b0ba-7d7062ce4f0a".

-- Other algorithms
-- Excluded are algorithms whose output is too short to fit into an UUIDv8
-- The payload of the UUIDv5 is the algorithms name from PHP hash_algos()
GOST                   := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "gost")        = "be782e40-b9e8-59c4-8500-31a6cfb91a75".
GOST-CryptoProParamSet := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "gost-crypto") = "9c1d4a70-75ec-5c6a-84e2-09b400fe8f21".
HAVAL-3-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval128,3")  = "176e81e1-9fc8-50f3-b569-08f264e5ae58".
HAVAL-3-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval160,3")  = "8d160752-d034-54e0-ac73-930ec60580c2".
HAVAL-3-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval192,3")  = "1f53cfc9-a36c-5a36-b27c-6dc88074ca38".
HAVAL-3-224            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval224,3")  = "56ef61fc-16de-55f4-bd3f-f44856d3d436".
HAVAL-3-256            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval256,3")  = "66111477-b9e1-54cc-a38f-b73b228964cc".
HAVAL-4-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval128,4")  = "55d554f5-7c2e-5e08-a8e3-cd6ecbac1e32".
HAVAL-4-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval160,4")  = "5d3d8b32-d57d-54b4-9a01-342e8fa9df5b".
HAVAL-4-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval192,4")  = "c7c66b4d-4299-5489-aa29-991b7bd4aa52".
HAVAL-4-224            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval224,4")  = "5c86a1f5-6b47-576e-a900-087897bf83a7".
HAVAL-4-256            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval256,4")  = "5851bbb5-56a3-55b3-8775-371739d251ca".
HAVAL-5-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval128,5")  = "9d40aac4-d8e5-5846-ad91-dc3429294d0d".
HAVAL-5-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval160,5")  = "764e5acb-88c4-5b24-b3bd-cca0941de88a".
HAVAL-5-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval192,5")  = "4455573f-5ff9-5cac-aadc-7ecf5c0c7ad1".
HAVAL-5-224            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval224,5")  = "0336bbe3-f703-5184-b52d-4e9d8163ddcd".
HAVAL-5-256            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval256,5")  = "5f4a8511-9e92-5d62-b94d-0b910a1e3d9a".
MD2                    := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "md2")         = "6ca7dd19-4755-5c6a-8b3f-3056ef6bebf6".
MD4                    := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "md4")         = "15329616-0af7-535a-b4b6-41c4eba21457".
Murmur3c               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "murmur3c")    = "b941a86c-9e70-5044-9496-da00eec9b934".
Murmur3f               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "murmur3f")    = "4a2262be-0dec-587f-843a-eb50c707d779".
RIPEMD128              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "ripemd128")   = "efd0677a-e9f4-5337-8764-51be1c353d4a".
RIPEMD160              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "ripemd160")   = "b54b1a0a-ce07-5d4b-9d03-96d57da2bf29".
RIPEMD256              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "ripemd256")   = "e288aa2a-5260-5aaf-825a-f40ce0514d19".
RIPEMD320              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "ripemd320")   = "2919713b-ae42-58a3-916a-039989f07300".
SNEFRU                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "snefru")      = "d38f5891-c553-5d58-88de-199cbf48291e".
SNEFRU256              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "snefru256")   = "30b628e4-4587-5f06-ae1b-be9d0cba1187".
Tiger-3-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger128,3")  = "6f5ba86a-a362-50f1-bc6a-62787ee998b8".
Tiger-4-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger128,4")  = "13ff3c12-da5d-5437-89d9-a76e44abd0c8".
Tiger-3-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger160,3")  = "50d3d8af-6a6c-5ea3-bfad-450424668dee".
Tiger-4-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger160,4")  = "31d39089-28e0-584b-bf72-7cf33c2caeea".
Tiger-3-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger192,3")  = "63cfdad3-a720-55e2-83a4-b8c762ad4012".
Tiger-4-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger192,4")  = "5b07ff46-f679-5d3c-97e1-10b800be9246".
WHIRLPOOL              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "whirlpool")   = "74fd261c-1f13-5015-81cc-fdc9a7354ae5".
XXH128                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "xxh128")      = "1a66c377-af3d-5f24-b9fb-d6c067d8b588".

What do you think about it?
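
For anyone who wants to reproduce the list, here is a minimal sketch using Python's standard `uuid` module. The namespace UUID is copied from the comment above; the helper name `hashspace_uuid` is made up for illustration, and no expected values are asserted here:

```python
import uuid

# Namespace UUID quoted from the comment above.
HASHSPACE_NS = uuid.UUID("1ee317e2-1853-64b2-8fe9-3c4a92df8582")


def hashspace_uuid(algo_name: str) -> uuid.UUID:
    """UUIDv5 (SHA-1 based) of the algorithm name within HASHSPACE_NS."""
    return uuid.uuid5(HASHSPACE_NS, algo_name)


for name in ("gost", "md4", "whirlpool"):
    value = hashspace_uuid(name)
    print(f"{name:<10} = {value} (version {value.version})")
```

Every value produced this way carries version digit 5. The twelve IDs listed for the draft algorithms show version digit 4, so they look like the draft's existing random values rather than UUIDv5 outputs; that may be worth double-checking.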

fabiolimace commented 1 year ago

I know that, not being a hash function expert, I shouldn't even question the advantage and simplicity of incrementing a base UUID. However, in terms of avalanche effect, which is better: changing only 1 bit or ~changing 128 bits~ changing more bits?

I also know that it is not a requirement that an ID produced with a hashspace has a very high probability of not clashing with IDs produced with a different hashspace. SHA-x algorithms already guarantee that changing 1 bit in the input produces a drastically different output, so the probability of a clash must be too low to be worth taking into account. However, if it is possible to maximize this effect by changing as many input bits as possible, wouldn't that be more desirable?

--

EDIT: I crossed out the "changing 128 bits" phrase because changing 128 bits means inverting all bits, which is the result of an XOR operation. It seems more appropriate to change more than 1 bit, but not all.
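
A rough way to look at the avalanche question is to hash the same name under two hashspace IDs that differ in a single bit and count how many output bits change. This is only a sketch: it assumes the hashspace ID bytes are simply prepended to the name before hashing with SHA-256, and the name `example.com` is a placeholder:

```python
import hashlib
import uuid

a = uuid.UUID("59031ca3-fbdb-47fb-9f6c-000000000000")
b = uuid.UUID(int=a.int ^ 1)  # differs from `a` in exactly one bit
name = b"example.com"         # hypothetical name input


def digest_bits(hashspace: uuid.UUID) -> int:
    # Assumption: the hashspace ID bytes are prepended to the name.
    return int.from_bytes(hashlib.sha256(hashspace.bytes + name).digest(), "big")


input_distance = bin(a.int ^ b.int).count("1")
output_distance = bin(digest_bits(a) ^ digest_bits(b)).count("1")
print(input_distance, "input bit changed ->", output_distance, "of 256 output bits differ")
```

For a well-behaved hash function one would expect roughly half of the output bits to differ, regardless of how many input bits were changed.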

fabiolimace commented 1 year ago

For my online service I have done the following approach:

HashSpaceUuid := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", AlgoName).

Sounds logical to me and easy to describe as you only need to define the namespace UUID that will be used with the UUIDv5 function to generate the hashspaces for each algorithm name.

However, it is necessary to use the "canonical name" of each algorithm, which implies text encoding (UTF-8, ASCII etc), case sensitivity (uppercase, lowercase), use (or not) of "non-word" characters (dash, space), etc.
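
The canonical-name concern can be illustrated with a small sketch. The spellings below are examples only, and Python's `uuid.uuid5` encodes a string name as UTF-8:

```python
import uuid

NS = uuid.UUID("1ee317e2-1853-64b2-8fe9-3c4a92df8582")
spellings = ["SHA256", "SHA-256", "sha256", "sha-256", "SHA2_256"]

# Each spelling is hashed as-is, so each one yields its own hashspace ID.
ids = {spelling: uuid.uuid5(NS, spelling) for spelling in spellings}

for spelling, value in ids.items():
    print(f"{spelling:<9} -> {value}")
print("distinct IDs:", len(set(ids.values())))
```

Five spellings of the same algorithm give five unrelated IDs, which is why the exact name, case, and encoding would have to be pinned down.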

danielmarschall commented 1 year ago

Yes, the canonical name is indeed my concern, too. ~(I wonder, is there an RFC that defines naming schemes for algorithms?)~

Proposal 1

I have the following idea:

Here is a proposal (I have calculated the UUIDv5, but please double-check them):

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based UUIDs.

   The following UUIDs were created by using a UUIDv5 with namespace ID
   "1ee317e2-1853-64b2-8fe9-3c4a92df8582" and the algorithm name in
   upper-case and with underscores as data.  This mechanism of generating a
   hashspace ID is OPTIONAL.  Any UUID can be used as a hashspace UUID.

   SHA2_224     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_224")     = "5385c476-6ffc-578a-908f-91b5cd2eac03"
   SHA2_256     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_256")     = "f660b1c5-f2c9-5f3a-981f-8652227fc329"
   SHA2_384     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_384")     = "43794fb1-7e34-558f-a8a5-5b4b8f8470d5"
   SHA2_512     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_512")     = "250cb2ab-c480-5f24-83fd-16ea8b0b9e36"
   SHA2_512_224 = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_512_224") = "70bade2b-c68a-5894-b31d-7c3581b6c647"
   SHA2_512_256 = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_512_256") = "a05efbcf-0a2a-5aab-9c62-8c94d05e0760"
   SHA3_224     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA3_224")     = "2862da96-f3c7-586a-8cc3-b1f424cdf040"
   SHA3_256     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA3_256")     = "72727812-3cea-56bd-a57f-ed3445acca4f"
   SHA3_384     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA3_384")     = "279978a1-86d1-56e6-bce2-019f5eaa3437"
   SHA3_512     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA3_512")     = "33e6927a-d382-5dbd-b415-402610340bcd"
   SHAKE_128    = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHAKE_128")    = "8835c536-6ab4-55bc-be61-7029cdcbd1db"
   SHAKE_256    = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHAKE_256")    = "311a1f8e-0a71-554a-bcdb-436c4e9f55e8"

fabiolimace commented 1 year ago

Here is a proposal (I have calculated the UUIDv5, but please double-check them):

I prefer to keep the previously defined UUIDv4-based hashspaces, but I think this UUIDv5 mechanism is a better way to define pseudo-random or "random-looking" hashspaces, which can be easily reproduced to define new hashspaces for cryptographic hash functions that could not be included in the document.

I just don't know which are the current canonical names for the SHA-2 family. For example, Wikipedia and Java use SHA-256 (with a dash), but not SHA2_256 (with a 2 and an underline).

P.S.: can we use this document as a reference?: https://csrc.nist.gov/files/pubs/fips/180-4/upd1/final/docs/fips180-4-draft-aug2014.pdf

danielmarschall commented 1 year ago

Proposal 2

There is another method which does not rely on canonical names (or even English language) at all.

A lot of hash algorithms are identified by OIDs. Some of them are located in this arc: http://oid-info.com/get/2.16.840.1.101.3.4.2

We could use a UUIDv5 with namespace OID (6ba7b812-9dad-11d1-80b4-00c04fd430c8) and the hash algorithm OID as payload.

Here is my proposal:

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based UUIDs.

   The following UUIDs were created by using a UUIDv5 with the OID namespace ID
   ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the OID identifying the
   hash algorithm.  This mechanism of generating a hashspace ID is OPTIONAL.
   Any UUID can be used as a hashspace UUID.

   SHA-224      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4")  = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.1")  = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.2")  = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.3")  = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.5")  = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.6")  = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.7")  = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.8")  = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.9")  = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.10") = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.11") = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.12") = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"
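
To double-check the values, the list can be regenerated with Python's standard `uuid` module, which exposes the OID namespace ID as `uuid.NAMESPACE_OID`. This is a minimal sketch; the dictionary names are illustrative and the OIDs are copied from the list above:

```python
import uuid

NS_OID = uuid.NAMESPACE_OID  # 6ba7b812-9dad-11d1-80b4-00c04fd430c8

OIDS = {
    "SHA-224": "2.16.840.1.101.3.4.2.4",
    "SHA-256": "2.16.840.1.101.3.4.2.1",
    "SHA-384": "2.16.840.1.101.3.4.2.2",
    "SHA-512": "2.16.840.1.101.3.4.2.3",
    "SHA-512/224": "2.16.840.1.101.3.4.2.5",
    "SHA-512/256": "2.16.840.1.101.3.4.2.6",
    "SHA3-224": "2.16.840.1.101.3.4.2.7",
    "SHA3-256": "2.16.840.1.101.3.4.2.8",
    "SHA3-384": "2.16.840.1.101.3.4.2.9",
    "SHA3-512": "2.16.840.1.101.3.4.2.10",
    "SHAKE128": "2.16.840.1.101.3.4.2.11",
    "SHAKE256": "2.16.840.1.101.3.4.2.12",
}

# The OID in dotted notation (no leading dot) is the UUIDv5 name input.
hashspace_ids = {name: uuid.uuid5(NS_OID, oid) for name, oid in OIDS.items()}

for name, value in hashspace_ids.items():
    print(f"{name:<12} = {value}")
```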

(Edit after my initial post: Changed the algorithm names as defined by FIPS180-4 and FIPS202)

(Note to self:) Here is a list of Algorithms/OIDs I have found:

GOST = 1.2.643.2.2.30.0
GOST-CryptoProParamSet = 1.2.643.2.2.30.1
GOST3410-2012-256 = 1.2.643.7.1.1.3.2
GOST3410-2012-512 = 1.2.643.7.1.1.3.3
HAVAL-3-128 = 1.3.6.1.4.1.18105.2.1.1.1
HAVAL-3-160 = 1.3.6.1.4.1.18105.2.1.1.2
HAVAL-3-192 = 1.3.6.1.4.1.18105.2.1.1.3
HAVAL-3-224 = 1.3.6.1.4.1.18105.2.1.1.4
HAVAL-3-256 = 1.3.6.1.4.1.18105.2.1.1.5
HAVAL-4-128 = 1.3.6.1.4.1.18105.2.1.1.6
HAVAL-4-160 = 1.3.6.1.4.1.18105.2.1.1.7
HAVAL-4-192 = 1.3.6.1.4.1.18105.2.1.1.8
HAVAL-4-224 = 1.3.6.1.4.1.18105.2.1.1.9
HAVAL-4-256 = 1.3.6.1.4.1.18105.2.1.1.10
HAVAL-5-128 = 1.3.6.1.4.1.18105.2.1.1.11
HAVAL-5-160 = 1.3.6.1.4.1.18105.2.1.1.12
HAVAL-5-192 = 1.3.6.1.4.1.18105.2.1.1.13
HAVAL-5-224 = 1.3.6.1.4.1.18105.2.1.1.14
HAVAL-5-256 = 1.3.6.1.4.1.18105.2.1.1.15
ISO/IEC 10118-2 "Hash Function 1" = 1.0.10118.2.0.1
ISO/IEC 10118-2 "Hash Function 2" = 1.0.10118.2.0.2
ISO/IEC 10118-2 "Hash Function 3" = 1.0.10118.2.0.3
ISO/IEC 10118-2 "Hash Function 4" = 1.0.10118.2.0.4
MD2 = 1.2.840.113549.2.2
MD4 = 1.2.840.113549.2.4
MD5  = 1.2.840.113549.2.5  (use UUIDv3)
MURMUR3C = ???
MURMUR3F = ???
Modular Arithmetic Secure Hash 1 (MASH-1) algorithm = 1.0.10118.4.0.65
Modular Arithmetic Secure Hash 2 (MASH-2) algorithm = 1.0.10118.4.0.66
RIPEMD128 = 1.3.36.3.2.2 or 1.0.10118.3.0.50
RIPEMD160 = 1.3.36.3.2.1 or 1.0.10118.3.0.49
RIPEMD256 = 1.3.36.3.2.3
RIPEMD320 = ???
SHA-224 = 2.16.840.1.101.3.4.2.4
SHA-256 = 2.16.840.1.101.3.4.2.1
SHA-384 = 2.16.840.1.101.3.4.2.2
SHA-512 = 2.16.840.1.101.3.4.2.3
SHA-512/224 = 2.16.840.1.101.3.4.2.5
SHA-512/256 = 2.16.840.1.101.3.4.2.6
SHA0 = 1.3.14.3.2.18
SHA1 = 1.3.14.3.2.26  (use UUIDv5)
SHA3-224 = 2.16.840.1.101.3.4.2.7
SHA3-256 = 2.16.840.1.101.3.4.2.8
SHA3-384 = 2.16.840.1.101.3.4.2.9
SHA3-512 = 2.16.840.1.101.3.4.2.10
SHAKE-128 = 2.16.840.1.101.3.4.2.11
SHAKE-256 = 2.16.840.1.101.3.4.2.12
SM3 (ISO/IEC 10118-3) = 1.0.10118.3.0.65
SNEFRU = ???
SNEFRU256 = ???
Streebog 256 = 1.0.10118.3.0.60
Streebog 512 = 1.0.10118.3.0.59
TIGER-3-128 = ???
TIGER-3-160 = ???
TIGER-3-192 = ??? (1.3.6.1.4.1.11591.12.2 specifies 192 bits, but rounds are unknown)
TIGER-4-128 = ???
TIGER-4-160 = ???
TIGER-4-192 = ???
WHIRLPOOL = 1.0.10118.3.0.55
XXH128 = ???

fabiolimace commented 1 year ago

@danielmarschall

I think it's way better.

Why not use the URN notation in lowercase mode only, e.g. urn:oid:2.16.840.1.101.3.4.2.4?

danielmarschall commented 1 year ago

@fabiolimace Are you confused about my notation UUIDv5(OID 2.16.840.1.101.3.4.2.6)? With this notation I meant taking the OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and using the OID "2.16.840.1.101.3.4.2.6" as payload.

fabiolimace commented 1 year ago

Sorry I meant the string "urn:oid:2.16.840.1.101.3.4.2.4" as the name input for the UUIDv5 function.

This:

   SHA2_224     = UUIDv5(urn:oid:2.16.840.1.101.3.4.2.4)  = 85eed581-369c-5931-a7fe-0d8158e83871

Not this:

   SHA2_224     = UUIDv5(OID 2.16.840.1.101.3.4.2.4)  = e0f20710-25d9-54ab-8325-ccf2d456ad0b

But I'm not sure if it's important.

danielmarschall commented 1 year ago

UUIDv5 requires two parameters: Namespace ID and Payload. So the notation UUIDv5(urn:oid:2.16.840.1.101.3.4.2.4) is incomplete.

The full notation of UUIDv5(OID 2.16.840.1.101.3.4.2.4) would be UUIDv5("6ba7b812-9dad-11d1-80b4-00c04fd430c8", "2.16.840.1.101.3.4.2.4"), but then the line becomes too long.

Edit: I have changed my proposal to UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4"). This should make it clearer what I mean.
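
A small sketch of the two-parameter form. It assumes, purely for comparison, that the `urn:oid:` string would also be hashed within the OID namespace; the thread does not say which namespace that variant used:

```python
import uuid

oid = "2.16.840.1.101.3.4.2.4"

# Namespace ID first, name (payload) second.
bare = uuid.uuid5(uuid.NAMESPACE_OID, oid)
# Assumption: same namespace, but with the URN prefix as part of the name.
as_urn = uuid.uuid5(uuid.NAMESPACE_OID, "urn:oid:" + oid)

print("bare OID:", bare)
print("urn:oid :", as_urn)
```

The two name inputs produce different UUIDs, so the exact notation would have to be fixed in the text either way.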

fabiolimace commented 1 year ago

The full notation of UUIDv5(OID 2.16.840.1.101.3.4.2.4) would be UUIDv5("6ba7b812-9dad-11d1-80b4-00c04fd430c8", "2.16.840.1.101.3.4.2.4"), but then the line becomes too long.

Yes, I noticed that the namespace parameter was implicit.

fabiolimace commented 1 year ago

The following UUIDs were created by using a UUIDv5 with the OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the OID identifying the hash algorithm. SHA2_224 = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4") = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"

I completely agree now. ❤️

--

P.S. It also breaks my implementation of UUIDv8 using SHA-256. 😢

danielmarschall commented 1 year ago

Yes, it breaks some implementations, including mine. But after all, Internet Drafts are supposed to change. :-) I really hope my proposal gets accepted, because I think it is perfect to use OIDs. They are unambiguous and so everyone can define their own hashspace IDs.

kyzer-davis commented 1 year ago

@danielmarschall, Your proposal of UUIDv5(NS_OID, "Hash_OID_NO_LEADING_DOT") works for the NIST ones that have OIDs. Are we to assume every cryptographic hashing function will have an OID?

Checking against your list earlier:

MD2               = "1.2.840.113549.2.2"
MD4               = "1.2.840.113549.2.4"
MD5               = "1.2.840.113549.2.5" (But probably just use v3)
TIGER/192         = "1.3.6.1.4.1.11591.12.2"
RIPEMD160         = "1.0.10118.3.0.49" or "1.3.36.3.2.1"
RIPEMD128         = "1.0.10118.3.0.50" or "1.3.36.3.2.2"
RIPEMD256         = "1.3.36.3.2.3"
WHIRLPOOL         = "1.0.10118.3.0.55"?
GOST3410-2012-256 = "1.2.643.7.1.1.3.2"
GOST3410-2012-512 = "1.2.643.7.1.1.3.3"
HAVAL-3-128       = "1.3.6.1.4.1.18105.2.1.1.1"
HAVAL-3-160       = "1.3.6.1.4.1.18105.2.1.1.2"
HAVAL-3-192       = "1.3.6.1.4.1.18105.2.1.1.3"
HAVAL-3-224       = "1.3.6.1.4.1.18105.2.1.1.4"
HAVAL-3-256       = "1.3.6.1.4.1.18105.2.1.1.5"
HAVAL-4-128       = "1.3.6.1.4.1.18105.2.1.1.6"
HAVAL-4-160       = "1.3.6.1.4.1.18105.2.1.1.7"
HAVAL-4-192       = "1.3.6.1.4.1.18105.2.1.1.8"
HAVAL-4-224       = "1.3.6.1.4.1.18105.2.1.1.9"
HAVAL-4-256       = "1.3.6.1.4.1.18105.2.1.1.10"
HAVAL-5-128       = "1.3.6.1.4.1.18105.2.1.1.11"
HAVAL-5-160       = "1.3.6.1.4.1.18105.2.1.1.12"
HAVAL-5-192       = "1.3.6.1.4.1.18105.2.1.1.13"
HAVAL-5-224       = "1.3.6.1.4.1.18105.2.1.1.14"
HAVAL-5-256       = "1.3.6.1.4.1.18105.2.1.1.15"
SNEFRU            = ???

RIPEMD may have two and SNEFRU does not have one that I can find? How would we handle something like that?


@fabiolimace

I just don't know which are the current canonical names for the SHA-2 family. For example, Wikipedia and Java use SHA-256 (with a dash), but not SHA2_256 (with a 2 and an underline).

I can change them to the names in the NIST documents easily enough. I added the "2" so they were somewhat in line with SHA3 from a formatting perspective, and I swapped the "/" character for an underscore. Underscores were used because they match the underscores used in the namespace items.

But I am not partial. I can change them to the following, as defined by FIPS180-4 and FIPS202:

SHA-224      = "...whatever we choose..."
SHA-256      = "...whatever we choose..."
SHA-384      = "...whatever we choose..."
SHA-512      = "...whatever we choose..."
SHA-512/224  = "...whatever we choose..."
SHA-512/256  = "...whatever we choose..."
SHA3-224     = "...whatever we choose..."
SHA3-256     = "...whatever we choose..."
SHA3-384     = "...whatever we choose..."
SHA3-512     = "...whatever we choose..."
SHAKE128     = "...whatever we choose..."
SHAKE256     = "...whatever we choose..."

danielmarschall commented 1 year ago

The new names according to FIPS180-4 and FIPS202 look good to me. I think "SHA-512/256" reads much better than "SHA2_512_256".


About algorithms with multiple OIDs: I would try to find the "official" ones, but I know that task can be hard and the result might be ambiguous.

About algorithms without a known OID: I think these could be out of scope. Since the mechanism is optional, people would need to define their own UUIDs (e.g. UUIDv1 or UUIDv4) for these hash algorithms.

I am not sure if my proposal 1 (which used algorithm names in a custom namespace, e.g. UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA-512/224")) would be better. @kyzer-davis What is your opinion on my proposal 1? Many people, and even implementations like PHP, use hash algorithm names like "GOST", but there are so many GOST algorithms that we do not know which one is implemented, so the risk is that someone does this: UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "GOST")

fabiolimace commented 1 year ago

I found a list of OIDs here (extracted from github.com/openssl): https://version.cs.vt.edu/techstaff/linux-audit/-/blob/861ecd5cf6005a1bb1a16d840713f56c425f1039/ansible2/ansible/module_utils/crypto.py#L405

I also found OIDs for some GOST (GOvernment STandard, RU) digests in this document: RFC-9215: Using GOST R 34.10-2012 and GOST R 34.11-2012 Algorithms with the Internet X.509 Public Key Infrastructure.

The ASN.1 OID used to identify the GOST R 34.11-2012 hash function with a 256-bit hash code is:

id-tc26-gost3411-12-256 OBJECT IDENTIFIER ::= { iso(1) member-body(2) ru(643) rosstandart(7) tc26(1) algorithms(1) digest(2) gost3411-12-256(2)}

The ASN.1 OID used to identify the GOST R 34.11-2012 hash function with a 512-bit hash code is:

id-tc26-gost3411-12-512 OBJECT IDENTIFIER ::= { iso(1) member-body(2) ru(643) rosstandart(7) tc26(1) algorithms(1) digest(2) gost3411-12-512(3)}

EDIT: GOST OIDs are already in kyzer's list.

danielmarschall commented 1 year ago

It might be a bit off-topic, but I am very confused about the implementations in PHP.

~- There is the algorithm "gost", which is "GOST R 34.11-94" (OID = 1.2.643.2.2.9); I have verified it with test vectors. Wikipedia and other sources imply that the hash algorithm name "GOST" describes the algorithm "GOST R 34.11-94". So, would UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "GOST") be unambiguous to the majority of users?~

~- And there is the "gost-crypto" hash algorithm, where I do not understand what it does and how it can be identified (neither by algorithm name nor by OID). My software solution gave it the hashspace ID UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "gost-crypto"), but I guess this is nonsense and ambiguous.~

~So, algorithm names are tricky..~

(Edit: Found the solution)

fabiolimace commented 1 year ago

From reading Wikipedia, md_gost94 appears to be obsolete like MD5 or SHA-1, and md_gost12_256/md_gost12_512 the counterparts of SHA-2 or SHA-3.

GOST = government standard R 34.11-94
Streebog = government standard R 34.11-2012

Is that correct?

kyzer-davis commented 1 year ago

@danielmarschall, personally I like proposal 2 with the OIDs because they are "well formatted"; that is, they are a set of "numbers and dots".

Proposal 1 has the challenge that SHA256, sha256, sha-256, SHA-256 all produce different hashes and proposal 2 removes that.

Proposal 2 has the challenges I listed but the points may be moot as many of the items we are discussing are algos nobody will likely ever use...

kydavis@ubuntu-web-server:~$ echo -n "SHA256" | sha256sum
b3abe5d8c69b38733ad57ea75e83bcae42bbbbac75e3a5445862ed2f8a2cd677  -

kydavis@ubuntu-web-server:~$ echo -n "SHA-256" | sha256sum
bbd07c4fc02c99b97124febf42c7b63b5011c0df28d409fbb486b5a9d2e615ea  -

kydavis@ubuntu-web-server:~$ echo -n "sha256" | sha256sum
5d5b09f6dcb2d53a5fffc60c4ac0d55fabdf556069d6631545f42aa6e3500f2e  -

kydavis@ubuntu-web-server:~$ echo -n "sha-256" | sha256sum
3128f8ac2988e171a53782b144b98a5c2ee723489c8b220cece002916fbc71e2  -

danielmarschall commented 1 year ago

points may be moot as many of the items we are discussing are algos nobody will likely ever use...

@kyzer-davis Are you referring to the small discussion(s) about HAVAL and GOST and my long OID list above? Don't worry, they were just part of my personal evaluation process to find out whether proposal 1 or proposal 2 is better with regard to the non-NIST algorithms. You mentioned missing and ambiguous OIDs, so I was wondering whether this is a serious issue or not. I don't propose that GOST, HAVAL, Tiger, ... get added to the RFC.

To avoid confusion in this large thread, here is my proposed text (Proposal 2):

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based
   UUIDs.

   The following UUIDs were created by using a UUIDv5 with the
   OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the
   OID identifying the hash algorithm.  This mechanism of generating a
   hashspace ID is OPTIONAL.  Any UUID can be used as a hashspace ID.

   SHA-224      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4")  = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.1")  = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.2")  = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.3")  = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.5")  = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.6")  = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.7")  = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.8")  = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.9")  = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.10") = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.11") = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.12") = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

Since the lines are too long for RFC, here is a variant with line breaks: (Unfortunately, the line breaks are very ugly)

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based
   UUIDs.

   The following UUIDs were created by using a UUIDv5 with the
   OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the
   OID identifying the hash algorithm.  This mechanism of generating a
   hashspace ID is OPTIONAL.  Any UUID can be used as a hashspace ID.

   SHA-224      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4")
                = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.1")
                = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.2")
                = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.3")
                = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.5")
                = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.6")
                = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.7")
                = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.8")
                = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.9")
                = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.10")
                = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.11")
                = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.12")
                = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

Another format that does not use the UUIDv5() pseudo-method and NS_OID constant (line breaks are still ugly):

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based
   UUIDs.

   The following UUIDs were created by using a UUIDv5 with the
   OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the
   OID identifying the hash algorithm.  This mechanism of generating a
   hashspace ID is OPTIONAL.  Any UUID can be used as a hashspace ID.

   SHA-224 (2.16.840.1.101.3.4.2.4)
                = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256 (2.16.840.1.101.3.4.2.1)
                = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384 (2.16.840.1.101.3.4.2.2)
                = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512 (2.16.840.1.101.3.4.2.3)
                = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224 (2.16.840.1.101.3.4.2.5)
                = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256 (2.16.840.1.101.3.4.2.6)
                = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224 (2.16.840.1.101.3.4.2.7)
                = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256 (2.16.840.1.101.3.4.2.8)
                = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384 (2.16.840.1.101.3.4.2.9)
                = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512 (2.16.840.1.101.3.4.2.10)
                = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128 (2.16.840.1.101.3.4.2.11)
                = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256 (2.16.840.1.101.3.4.2.12)
                = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

@kyzer-davis If you agree, can you please add one of these to a pull request? Thank you very much!

fabiolimace commented 1 year ago

Another way to demonstrate the hashspaces is to show a predefined list followed by the pseudocode used to generate the list. That way it is (almost) impossible to have doubts about how the list was generated. Separating the list from the steps to generate it takes less "cognitive effort", in my opinion.

Predefined list of hashspaces:

   SHA-224     = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256     = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384     = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512     = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224 = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256 = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224    = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256    = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384    = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512    = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128    = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256    = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

Pseudocode to derive hashspaces from message digest OIDs:

   # array of message digest OIDs
   OID["SHA-224"]     = "2.16.840.1.101.3.4.2.4"
   OID["SHA-256"]     = "2.16.840.1.101.3.4.2.1"
   OID["SHA-384"]     = "2.16.840.1.101.3.4.2.2"
   OID["SHA-512"]     = "2.16.840.1.101.3.4.2.3"
   OID["SHA-512/224"] = "2.16.840.1.101.3.4.2.5"
   OID["SHA-512/256"] = "2.16.840.1.101.3.4.2.6"
   OID["SHA3-224"]    = "2.16.840.1.101.3.4.2.7"
   OID["SHA3-256"]    = "2.16.840.1.101.3.4.2.8"
   OID["SHA3-384"]    = "2.16.840.1.101.3.4.2.9"
   OID["SHA3-512"]    = "2.16.840.1.101.3.4.2.10"
   OID["SHAKE128"]    = "2.16.840.1.101.3.4.2.11"
   OID["SHAKE256"]    = "2.16.840.1.101.3.4.2.12"

   # function to derive hashspaces from message digest OIDs
   function hashspace(algo) { return UUIDv5(NAMESPACE_OID, OID[algo]) }

Note: the pseudocode is based on AWK syntax. Implementers can simply copy the pseudocode and change it to suit the target language syntax. If I were the implementer, I would appreciate it.
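For readers who prefer something they can run, here is a minimal Python sketch of the same derivation. It assumes the standard library's `uuid.uuid5` and `uuid.NAMESPACE_OID` (the latter is the OID namespace ID "6ba7b812-9dad-11d1-80b4-00c04fd430c8"). The names `MESSAGE_DIGEST_OIDS` and `hashspace` are illustrative only and do not come from the draft.

```python
import uuid

# Message digest OIDs, copied from the pseudocode above.
MESSAGE_DIGEST_OIDS = {
    "SHA-224": "2.16.840.1.101.3.4.2.4",
    "SHA-256": "2.16.840.1.101.3.4.2.1",
    "SHA-384": "2.16.840.1.101.3.4.2.2",
    "SHA-512": "2.16.840.1.101.3.4.2.3",
    "SHA-512/224": "2.16.840.1.101.3.4.2.5",
    "SHA-512/256": "2.16.840.1.101.3.4.2.6",
    "SHA3-224": "2.16.840.1.101.3.4.2.7",
    "SHA3-256": "2.16.840.1.101.3.4.2.8",
    "SHA3-384": "2.16.840.1.101.3.4.2.9",
    "SHA3-512": "2.16.840.1.101.3.4.2.10",
    "SHAKE128": "2.16.840.1.101.3.4.2.11",
    "SHAKE256": "2.16.840.1.101.3.4.2.12",
}


def hashspace(algo: str) -> uuid.UUID:
    """Derive a hashspace ID as UUIDv5(NAMESPACE_OID, OID of the algorithm)."""
    return uuid.uuid5(uuid.NAMESPACE_OID, MESSAGE_DIGEST_OIDS[algo])


if __name__ == "__main__":
    # Print the derived list in the same layout as the predefined list.
    for algo in MESSAGE_DIGEST_OIDS:
        print(f"{algo:<11} = \"{hashspace(algo)}\"")
```

Running the script prints one line per algorithm, which can be compared against the predefined list above.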

kyzer-davis commented 1 year ago

Got it @fabiolimace and @danielmarschall. I will go with the OID method for obtaining the Hashspace ID. aka "Proposal 2"

Don't worry about formatting, I will get that figured out. Could end up as some ascii, some table, etc.

PR will likely happen next week.

Finally, depending on how the discussion over in #144 shakes out, one could possibly add a new hashspace ID to the IANA registry without needing a full-on spec to do so. It just needs to be defined the way we say in this doc and then added to that table. Name, OID, ID, and Doc for Hash Algo would be the columns in my mind. This would help when some next-gen crypto comes out and somebody wants to define the hashspace for it. Much easier via an email template than a full-on RFC. (Same goes for some legacy algo if somebody wanted to use it: update the registry and now anybody can leverage it.)

LiosK commented 1 year ago

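I'm somewhat concerned about this OID + UUIDv5 approach because:

  • With this approach, any future hash function will automatically get its hash space ID when it gets an OID. Technically, such a new hash space ID will have to get ratified through a formal process, but this approach does create expectations that such a new ID will be ratified. In this way future spec authors might lose control over the hash space ID definitions.
  • This approach creates a use case of v5 from now on, whereas v5 and SHA1 are in no way recommended for future use.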

I think v4 based IDs are simpler and safer.

fabiolimace commented 1 year ago

I think v4 based IDs are simpler and safer.

This is a question I've been trying to answer myself for a while: how good is a 160- or 256-bit truncated hash compared to a 128-bit random number?

I've tried a few times, but I always fail miserably because I don't have the statistical knowledge to give an answer.

I always end up, in my naive attempts, trusting in the principle of Saint Thomas: seeing is believing. However, I can't see any difference with my eyes.

However, I believe that hash-based UUIDs are still very useful for associating a binary or textual value with a relatively short ID in a permanent and (almost) univocal way.

EDIT: I crossed out the text because I realized I misunderstood the sentence. Please ignore. (but the question still remains)

LiosK commented 1 year ago

The crossed-out question is a different topic, but I think it is a very good question, one to which I do not have an answer either. Please take a look at several posts relating to FIPS stuff following my original post about the hash space approach.

danielmarschall commented 1 year ago

I'm somewhat concerned about this OID + UUIDv5 approach because:

  • With this approach, any future hash function will automatically get its hash space ID when it gets an OID. Technically, such a new hash space ID will have to get ratified through a formal process, but this approach does create expectations that such a new ID will be ratified. In this way future spec authors might lose control over the hash space ID definitions.
  • This approach creates a use case of v5 from now on, whereas v5 and SHA1 are in no way recommended for future use.

I think v4 based IDs are simpler and safer.

I can understand your concern that UUIDv5 is using a deprecated hash algorithm.

But I think it is very useful that the hash space is not just random, but connected with the algorithm. Let's imagine the case when someone wants to use a non-NIST hash algorithm, e.g. HAVAL-3-128.

Imagine IANA does not have that hash listed. By using random UUIDv4, someone needs to choose/generate a hash space ID, and IANA needs to add it. Maybe IANA even insists that an RFC is written that defines the hash space ID. But do you think every developer who wants to use a non-NIST hash will contact IANA or even write an RFC?

A lot of algorithms have OIDs. This is important for some technologies like X.509. Having the hash space (optionally) derived from the OID means that two developers can hash using HAVAL-3-128, and since HAVAL-3-128 has OID "1.3.6.1.4.1.18105.2.1.1.1", both implementations output the same UUID. Without writing an RFC, without contacting IANA. (And yes, I know that some hash algorithms have an ambiguous OID or no OID at all. But my research showed that the majority of algorithms have exactly one OID.)

LiosK commented 1 year ago

In my opinion, such a new hash function must be registered through a formal process (by a separate RFC or IANA registry, I don't know) unless the new UUID RFC specifies the algorithm to derive a hash space ID in a normative manner. Otherwise, the de facto hash space ID crafted by future implementers will be left in an uncertain state. So far, the name-based v8 is just an example of v8 implementation techniques, and we will have no time to put this in the normative section. With this in mind, we shouldn't create any expectations related to the future hash space IDs. UUIDv4-based hash space IDs do require a formal process to ratify new hash functions, and accordingly give future spec authors full control over the UUID specification, so that they can recommend one hashing algorithm and discourage another.

cbandy commented 1 year ago

My naive and scattered thoughts:

Would it be better to identify these hashspaces using v7?

LiosK commented 1 year ago

v7 works, but so does v4, I think.

EDIT: v4 is better because of its randomness. Hash space IDs are passed to another hash function so should be very different from each other.
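To make "passed to another hash function" concrete, here is a rough Python sketch of how a hashspace ID might enter a name-based UUIDv8. The construction is an assumption on my part, not a quote of the draft: hash the concatenation hashspace ID || namespace ID || name, keep the first 128 bits, then overwrite the version and variant bits. The function name `uuid8_name_based` is hypothetical. The two hashspace constants are the SHA-256 and SHA-512 values listed earlier in this thread.

```python
import hashlib
import uuid

# Hashspace IDs taken from the predefined list earlier in the thread.
HASHSPACE_SHA256 = uuid.UUID("29f7f1c6-6258-5be3-b9f0-2adc24eb96c6")
HASHSPACE_SHA512 = uuid.UUID("a5dd0c9d-04b0-5f15-9332-c0ff97053dda")


def uuid8_name_based(hashspace_id: uuid.UUID, namespace: uuid.UUID,
                     name: str, hash_name: str = "sha256") -> uuid.UUID:
    """Assumed layout: hash(hashspace || namespace || name), truncated to 128 bits."""
    data = hashspace_id.bytes + namespace.bytes + name.encode("utf-8")
    digest = hashlib.new(hash_name, data).digest()
    raw = bytearray(digest[:16])       # keep the leftmost 128 bits
    raw[6] = (raw[6] & 0x0F) | 0x80    # set the version field to 8
    raw[8] = (raw[8] & 0x3F) | 0x80    # set the variant field to binary 10
    return uuid.UUID(bytes=bytes(raw))


if __name__ == "__main__":
    a = uuid8_name_based(HASHSPACE_SHA256, uuid.NAMESPACE_DNS, "www.example.com")
    b = uuid8_name_based(HASHSPACE_SHA512, uuid.NAMESPACE_DNS, "www.example.com",
                         hash_name="sha512")
    print(a)
    print(b)
```

Under this assumed layout the hashspace ID is just a 16-byte prefix of the hash input, so any two distinct IDs already lead to unrelated digests.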

chorman0773 commented 12 months ago

I saw the OID proposal above, and I'd like to second that.

This would also allow third parties to define new Hashspace UUIDs, if they have an OID they control (and can hand out sub-OIDs from), which they can get from the IANA. It would also allow users of v9 to substitute a v5 UUID in out-of-band transport with simply the OID for the algorithm itself. The main risk of doing this without a centralized registry, in my opinion, is that one algorithm might end up with two different OIDs in different contexts. If this route is taken, there should be guidance to avoid such anti-collisions (one algorithm mapped to several IDs).
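As a rough illustration of that risk, the Python sketch below models a tiny registry that refuses to bind a second OID to an algorithm that already has one. The class `HashspaceRegistry` is hypothetical, and so is the second OID "1.3.6.1.4.1.99999.1", which stands in for a competing assignment. The HAVAL-3-128 OID is the one quoted earlier in the thread.

```python
import uuid


class HashspaceRegistry:
    """Hypothetical registry: one OID, and so one hashspace ID, per algorithm."""

    def __init__(self) -> None:
        self._oid_by_algo: dict[str, str] = {}

    def register(self, algo: str, oid: str) -> uuid.UUID:
        existing = self._oid_by_algo.get(algo)
        if existing is not None and existing != oid:
            # Two OIDs for one algorithm would yield two hashspace IDs.
            raise ValueError(f"{algo} is already registered with OID {existing}")
        self._oid_by_algo[algo] = oid
        return uuid.uuid5(uuid.NAMESPACE_OID, oid)


if __name__ == "__main__":
    registry = HashspaceRegistry()
    print(registry.register("HAVAL-3-128", "1.3.6.1.4.1.18105.2.1.1.1"))
    try:
        registry.register("HAVAL-3-128", "1.3.6.1.4.1.99999.1")  # hypothetical OID
    except ValueError as err:
        print("rejected:", err)
```

Without some shared table like this, nothing stops the two OIDs from silently producing two different hashspace IDs for the same algorithm.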

LiosK commented 12 months ago

Since it's v8, any third party can generate a UUID and use it in their application as a hashspace ID for any hash function. Perhaps, we should expand the following statement in Section 6.5 to clarify that any user-defined UUID value may be used as a hashspace ID within an application context. This point is not sufficiently clear in the current draft, despite #132.

These MAY leverage newer hashing protocols such as ... or even protocols that have not been defined yet.

Within an implementation, the implementer can do whatever they want, but a standard has to focus on the coordination of such implementations. What if SHA-4 has a parameter that is not expressed in the OID? What if a widespread implementation applies SHA-5 differently than expected? These circumstances may put future interoperability at risk under the OID-based hashspace scheme. Plus, observing such a situation, future RFC authors might even avoid ratifying an OID-based hashspace ID, because officially specifying the meaning of a widely used hashspace ID can break existing implementations.

kyzer-davis commented 12 months ago

Getting caught up on these longer threads after being out unexpectedly. I see there have been lots of discussions... As such, I will hold off on changes for the moment. We can aim for this as a topic on the interim call the chairs have requested.

mcr commented 11 months ago

If you do not have a datatracker.ietf.org login, please get one, as you'll need it for the virtual interim. That's the only barrier to participation. Slides uploaded to datatracker would also be appreciated.

kyzer-davis commented 11 months ago

@mcr "Slides uploaded to datatracker would also be appreciated." yeah, I will get some to the chairs this week!

mcr commented 11 months ago

@mcr "Slides uploaded to datatracker would also be appreciated." yeah, I will get some to the chairs this week!

if @danielmarschall or others still feel they want v9, then they also need to explain the proposal in a slide or two.

kyzer-davis commented 11 months ago

As per #147, Hashspace IDs were removed, which would resolve this discuss item if merged. Please see the proposal in #147 and leave feedback on that topic there.