odin-lang / Odin

Odin Programming Language
https://odin-lang.org
BSD 3-Clause "New" or "Revised" License
6.1k stars 550 forks source link

Add `core:encoding/uuid` #3792

Closed Feoramund closed 4 days ago

Feoramund commented 1 week ago

This is the odin-uuid package I wrote a while back, updated to use the new context-based random generator, along with some minor documentation touch-ups. It implements RFC 4122 compliant unique identifiers with generators for versions 3, 4, and 5.

Kelimion commented 1 week ago

Incidentally, I think we can relatively trivially extend the supported UUID versions to also include v1 and v6, which per RFC 9562 allow replacing the MAC address with random bytes for privacy. v6 is v1 with the time bits reversed.

So a generate_v1 :: proc(mac: Maybe([6]u8)) -> Identifier? On platforms that have it, the caller can then use net.enumerate_interfaces to get a real MAC if that interests them, without having to complicate the UUID code.

Not that the PR should be held up for those. They can be added after merger.

Feoramund commented 1 week ago

One upside of making the Identifer be only a distinct type is that I was able to make the namespaces @(rodata) now, which is nice.

I'll have to give this RFC a read. I wasn't aware of it before.

Kelimion commented 1 week ago

One upside of making the Identifer be only a distinct type is that I was able to make the namespaces @(rodata) now, which is nice.

Indeed!

I'll have to give this RFC a read. I wasn't aware of it before.

I'm not surprised. It was published only last month.

Kelimion commented 1 week ago

Merge it, or do you want to tinker with it some more in light of the RFC update?

Feoramund commented 1 week ago

Merge it, or do you want to tinker with it some more in light of the RFC update?

I'm working on v6 at the moment, then I'll add another test or two and clean up. Might get it all wrapped up in the next few hours, depending. I may implement v8 just as a version/variant stamp on top of a user-provided u128/array slice for good measure.

If you could check over the v1 proc for me, that'd be great. The endianness had me confused for a bit, because I know they're supposed to be stored in big-endian, but the time fields are flipped and also separated, so I just resorted to doing it byte-by-byte for sanity's sake.

Kelimion commented 1 week ago

If you could check over the v1 proc for me, that'd be great. The endianness had me confused for a bit, because I know they're supposed to be stored in big-endian, but the time fields are flipped and also separated, so I just resorted to doing it byte-by-byte for sanity's sake.

Sure thing. I'll give it a thorough look tomorrow.

Feoramund commented 1 week ago

There's been some good suggestions brought up. I should be able to handle them today.

Kelimion commented 1 week ago

There's been some good suggestions brought up. I should be able to handle them today.

The consensus we came to within the team was to adopt the same strategy as core:crypto, and to add a core:encoding/uuid/legacy for the variants using MD5 and SHA1, themselves from core:crypto/legacy.

Then there's no need for a compile-time warning of any sort. We'd just have a large banner at the top of the file, same as in md5.odin.

e.g.

package encoding_uuid_legacy

import "core:encoding/uuid"
import "core:crypto/legacy/md5"

Identifier :: uuid.Identifier
(etc)

generate_v3_bytes :: proc(
    namespace: Identifier,
    name: []byte,
) -> (
    result: Identifier,
)

The v4 UUID doesn't need to go under legacy, but as @laytan observed, it probably should use runtime.random_generator_query_info and verify the random generator currently on context is .Cryptographic.

Feoramund commented 1 week ago

If we're restricting usage to cryptographic generators, I think that should be fine so long as I include an example of how to do that, since the new context rand API is still quite fresh.

I read in the newer RFC that they suggest you could make your own v8 with an up to date hashing algorithm as an alternative to v3 and v5. We could provide one of those too, with the stipulation that v8 is the "make up your own" ID and that ours isn't an official implementation. generate_v8_sha2 or such. I'd need to take a look at the crypto API again. I forget if it's possible to specify the algorithm by enum but last I looked, I think it was. If that's the case, generate_v8_hash could work, with an enum argument.

uuid/legacy strikes me as better than generate_insecure_v3, too.

Feoramund commented 1 week ago

@Kelimion I believe I've handled all the ideas brought up so far, other than where to put the package. I found myself accidentally typing core:uuid sometimes. Is that a better place for it?

Kelimion commented 1 week ago

@Kelimion I believe I've handled all the ideas brought up so far, other than where to put the package. I found myself accidentally typing core:uuid sometimes. Is that a better place for it?

I feel like that would set a bad precedent. The package is too narrowly focused for top-level billing. The top-level packages are either categories, or fundamental. I think of uuid as more of a core:encoding/uuid thing than something fundamental. No offence. :-)

Without a limiting principle like that we'd end up with 121 top-level packages within weeks, and that's just no way to organize anything.

Feoramund commented 1 week ago

That ought to do it.

Kelimion commented 1 week ago

That ought to do it.

The thing is that it's _nsec because it's "opaque" and it allows changing how it's represented in the future. Unlikely for that to happen at this point, but that's to deter users of the core library from poking at it directly. If we merge this PR, then the UUID package is part of the core library packages - not a downstream user - that can do so legitimately.

Feoramund commented 1 week ago

Unlikely for that to happen at this point, but that's to deter users of the core library from poking at it directly. If we merge this PR, then the UUID package is part of the core library packages - not a downstream user - that can do so legitimately.

That all makes sense to me. I tend to err on the side of caution with these sort of things in the aim of not introducing more of a burden to future maintenance.

Kelimion commented 1 week ago

Unlikely for that to happen at this point, but that's to deter users of the core library from poking at it directly. If we merge this PR, then the UUID package is part of the core library packages - not a downstream user - that can do so legitimately.

That all makes sense to me. I tend to err on the side of caution with these sort of things in the aim of not introducing more of a burden to future maintenance.

We can introduce an even lighter time.from_nanoseconds :: proc(nsec: i64) -> Time that's a barely disguised cast that'll inline to the same for people to use for "outside" epoch conversions and Time construction that'll be safe to use in the event of Time updates, but I'd consider that outside of the scope of this PR.

Kelimion commented 1 week ago

I've just added time.from_nanoseconds, so if the internal representation ever changes, that constructor will still hold up.

Feoramund commented 1 week ago

Not a bad idea at all. I'd have to rebase to get access to it, which will overwrite all the past commits. I think GitHub correctly tracks the differences in force-push rebasing, but I'm not 100% sure. I can do that here and now in this PR if you'd like.

Kelimion commented 1 week ago

Not a bad idea at all. I'd have to rebase to get access to it, which will overwrite all the past commits. I think GitHub correctly tracks the differences in force-push rebasing, but I'm not 100% sure. I can do that here and now in this PR if you'd like.

It should do, especially because it's a simple re-graft on top of the same branch and you're introducing an entirely new package. Up to you. Can also update to use the new call after merge if you want.

Waiting for Bill to give the new package his blessing and merge it. It has mine.

Feoramund commented 1 week ago

There we go. Looks like the PR survived intact.