std::data() and std::span support

BurningEnlightenment commented 5 years ago

C++17 introduced std::data() as a means to access the underlying memory block of contiguous container like entities in a generic way. Furthermore C++20 will introduce std::span which has a converting template constructor requiring std::data(cont) to be well formed (side note: we do satisfy the other requirements already). This would let me (and probably others) remove some special handling of uuid when doing IO.

Supporting it is as easy as adding the following two public member function to the uuid class:

BOOST_CONSTEXPR value_type* data() BOOST_NOEXCEPT { return data; }
BOOST_CONSTEXPR value_type const* data() const BOOST_NOEXCEPT { return data; }

jeking3 commented 5 years ago

Sounds reasonable to me.

mclow commented 5 years ago

I have no problem with a const version, but giving non-const access to the underlying data seems like it would open up for trouble. On the other hand, uuid provides non-const iterators, so I guess that's more or less the same.

Lastique commented 5 years ago

As a data point std::uuid proposal (see here and here) is moving towards making UUID binary representation opaque. This is a good change compared to Boost.UUID as it offers more room for optimization. Granted, this proposal is not accepted yet (to my knowledge), but it might still be beneficial to use that knowledge.

Although boost::uuid provides iterators, they are not the problem. They are pointers now, but they could potentially be something else, which would hide the underlying representation. The real problem is that boost::uuid::data is public. I suppose, you could argue that supporting std::data and std::span don't make things worse, but I still think that it would be better for Boost.UUID to move towards concealing its internal representation. Maybe deprecate public boost::uuid::data at some point and make it private after the deprecation period. std::data and std::span would make that effort impossible, unless you give literally no guarantees about the contents they return, which would not be useful.

So, in short, I'm not a Boost.UUID maintainer, but I would prefer Boost.UUID evolve in a different direction, which would make std::data and std::span not possible or practical.

BurningEnlightenment commented 5 years ago

@Lastique at least the current stduuid implementation offers a as_bytes() -> gsl::span<> method which exposes the internal representation. However, what optimizations will be enabled by an opaque representation?

Maybe deprecate public boost::uuid::data at some point

actually this would be also required for std::data because of the name clash.

Lastique commented 5 years ago

at least the current stduuid implementation offers a as_bytes() -> gsl::span<> method which exposes the internal representation.

I think, that was discussed in std-proposals list. I'm hesitant to speak for the participants of that discussion, but I got the impression that hiding internal representation was considered the right way to go, which would make as_bytes impossible in the current form.

However, what optimizations will be enabled by an opaque representation?

Most importantly, it will remove the need to deal with endianness. For example, if you look at the operator< implementation, it could have been more efficient if native endianness was allowed. Potentially, it would simplify conversions between uuid and strings when optimizing with SIMD. It will also allow to use types other than bytes for storage (e.g. natively supported larger integer types) and enforce alignment of the storage for better performance.

pdimov commented 7 months ago

I don't think we'll ever be able to offer data() because we already have a public member data and there's no clean upgrade path from here to there.

Lastique commented 7 months ago

We could deprecate the public data member and eventually make it private and rename. Users are able to access the UUID bytes with iterators.

One downside of this transition is that boost::uuid will no longer be POD, but it will still be a trivial type. Not much of a loss, IMO, considering that other desired changes would also require it (e.g. make the default constructor zero initialize the UUID rather than leaving it uninitialized).

pdimov commented 7 months ago

Changing existing uses of data to use iterators wouldn't be a mechanical replacement and will be error-prone. It's better to provide a replacement that works as-is.

I suppose we could add

using octets_type = uint8_t[16];

octets_type& octets() noexcept;
octets_type const& octets() const noexcept;

and then u.data can be mechanically replaced with u.octets(), minimizing the potential of surprises.

pdimov commented 7 months ago

Users are able to access the UUID bytes with iterators.

I'm not sure if we want to encourage that any further; P0959 (correctly IMO) points out that uuid is a value, rather than a container, so iterator-based access is arguably a mistake.

as_bytes seems like the correct interface here, but unfortunately, std::span is C++20.

Although if we move away from an iterator interface, we shouldn't need data() either. :-)

Most importantly, it will remove the need to deal with endianness.

No, I don't think this would be wise. UUIDs are well defined portable entities with standard (also portable) string representations, so the operations depending on the native endianness would be horrible design (and isn't really required, as platforms generally have bswap and movbe/movle instructions nowadays.)

Lastique commented 7 months ago

Well, even if UUID is considered as a distinct object, the APIs where it is interpreted as an array of bytes are not uncommon. boost::uuid should provide a way to access UUID bytes, if not through iterators then through some other API. data() is a sensible API for that. I mentioned iterators earlier because that's what we provide now, and those could serve as a stopgap while we transition to data().

the operations depending on the native endianness would be horrible design (and isn't really required, as platforms generally have bswap and movbe/movle instructions nowadays.)

One example of such operation that was mentioned is operator<, which could be more efficient with native-endian representation. bswap and movbe/movle don't help as they don't support vectors. (There is pshufb that could be used to swap bytes within a vector, but (a) it would require an extra vector constant and a memory access, which is why I didn't use it in the current implementation, and (b) I don't think architectures other than x86 have an efficient equivalent, although I may be missing it.)

pdimov commented 7 months ago

Yes I know that operator< can be made more efficient on x86 at the expense of yielding an unspecified result. I just don't think it should be. operator< should give the portable, documented lexical order.

An efficient less operation, if present, should be provided under a different name, although I'm unsure of its utility.

pdimov commented 7 months ago

(b) I don't think architectures other than x86 have an efficient equivalent

I know next to nothing about ARM, but from a cursory look, one should be able to synthetize a byteswap for __uint128_t in a vector register with a sequence of vrev and vtrn.

pdimov commented 7 months ago

Although I doubt that the result will be more efficient than the scalar code (https://godbolt.org/z/5KoPPoP8n).

Lastique commented 7 months ago

Yes I know that operator< can be made more efficient on x86 at the expense of yielding an unspecified result. I just don't think it should be. operator< should give the portable, documented lexical order.

No, my point was that the result could be portable and the operator< could be efficient as well, if the representation was allowed to be platform-specific. And it doesn't preclude us from exposing a portable, big-endian byte sequence as well - whether through iterators or something else.

pdimov commented 7 months ago

I suppose you could store the bytes in little-endian order, yes, which would make data() / octets() / as_bytes() access impossible and make the iterators reverse_iterator<uint8_t*>.

That's an interesting option, but I'm not sure whether it's worth it because it penalizes everything else in order to benefit operator<.

Lastique commented 7 months ago

Although, I admit, data() doesn't fit as that API for accessing portable big-endian byte sequence.

Lastique commented 7 months ago

I suppose you could store the bytes in little-endian order, yes, which would make data() / octets() / as_bytes() access impossible and make the iterators reverse_iterator<uint8_t*>.

That's an interesting option, but I'm not sure whether it's worth it because it penalizes everything else in order to benefit operator<.

Yes.

In my experience, I use UUIDs as keys quite often, and serialize them to bytes less often, so the tradeoff makes sense to me. And even then, where I serialize UUIDs I know I will be reading them on the same kind of machine, so I wouldn't even need the big-endian byte sequence, platform-specific representation would be fine. But this last part is specific to my use case, I suppose.

Lastique commented 7 months ago

I think, we could have a member function like std::array<std::uint8_t, 16u> portable_bytes() const noexcept that would return the big-endian byte sequence. Generating this array could be optimized as well - it's either byte swap instructions or simple copy of the internal representation.

We could also have a similar initializing constructor doing the reverse.

pdimov commented 7 months ago

In my experience, I use UUIDs as keys quite often, and serialize them to bytes less often, so the tradeoff makes sense to me.

That's true for most people, but today, no sane person should/would use std::set<uuid> when boost::unordered_flat_set<uuid> exists. The performance difference would surely be enormous.

pdimov commented 7 months ago

It occurs to me that another advantage of having opaque representation and value-based accessors would be to enable better constexpr support, as reinterpret_cast is not allowed there so one needs to play games with is_constant_evaluated in the implementations of the comparisons.

pdimov commented 7 months ago

Something else I noticed - having data be uint8_t[16] causes MSVC to activate stack protection when there are local uuid variables, which occasionally penalizes simple functions (e.g. the straightforward implementation of is_nil as return *this == uuid{};.

BurningEnlightenment commented 7 months ago

Something else I noticed - having data be uint8_t[16] causes MSVC to activate stack protection when there are local uuid variables.

IIRC stack protection generally activates for POD types (as defined by MS ABI) larger than 8 bytes, i.e. uuid would be affected regardless of internal layout. Am I misremembering something?

In retrospect, I see this as a misguided feature request on my part. I've come to the conclusion that accessing the internal representation for small-ish value types offers very little added value compared to explicit serialization functions.

I think, we could have a member function like std::array<std::uint8_t, 16u> portable_bytes() const noexcept that would return the big-endian byte sequence.

I suppose a complementary void serialize_to(std::span<std::byte, 16U>) const noexcept would be appreciated by people trying to serialize uuids which are part of larger structures.

pdimov commented 7 months ago

No, stack protection is only activated by arrays on the stack, not POD types.

I think that having uint8_t const* data() const noexcept is still preferable to having a public member data, so we need to figure out a way to get there, somehow. (Especially if we keep iterators, since the elements are at present contiguous, it's much better to have data() than not so that generic code can take the contiguous path.)

The alternative is I suppose removing both the public data member and iterator access, but this will be too disruptive to existing users.

One option is to have a variable data of a "magic" type that has both a conversion to uint8_t (&)[16] and operator(), so that data() can be used in a backward-compatible way. I'm going to try that out one of these days.

BurningEnlightenment commented 7 months ago

No, stack protection is only activated by arrays on the stack, not POD types.

In the specific case of is_nil it seems to me that the stack protection correlates with the usage of SSE intrinsics in operator==: https://godbolt.org/z/EoW715M8M

So I went to the MSVC docs and they state (s. learn.microsoft.com):

A buffer overrun security check is performed on a GS buffer. A GS buffer can be one of these:

[...]

A data structure whose size is more than 8 bytes and contains no pointers.

I think that having uint8_t const* data() const noexcept is still preferable to having a public member data.

To clarify my previous statement: I do agree with this part. However, I think the alternative (remove container interface) would be cleaner if a deprecation and upgrade path over multiple releases was feasible. OTOH I do appreciate backwards compatibility :/

pdimov commented 7 months ago

I think the code I was trying was https://godbolt.org/z/nbjGKvrMW, and while "msvc x64 latest" does manage to optimize out the security cookie here, "x86 latest" (https://godbolt.org/z/xE7Y3nPG8) and "x64 19.29" (https://godbolt.org/z/oMT8qd6T1) do not.

I suppose we shouldn't worry too much about that, then.

(SSE2 doesn't seem to cause any trouble either: https://godbolt.org/z/P4nsz5Gns, https://godbolt.org/z/Pvbsdc6GP.)

After giving this a bit of thought, it seems to me that what we need to be driving towards is something like

struct uuid
{
    using octets_type = std::array<std::uint8_t, 16>;

    static uuid from_octets( octets_type const& ); // or maybe by value, rsi:rdi
    static uuid from_uint64( std::uint64_t high, std::uint64_t low );
    static uuid from_uint128( __uint128_t ); // if available

    octets_type as_octets() const noexcept;
    std::pair<std::uint64_t, std::uint64_t> as_uint64() const noexcept;
    __uint128_t as_uint128() const noexcept; // if available
};

We'll annoy the people who want to poke directly into the octets (u.data[3] |= 0xF0; or something), but I don't think this is much of a loss.

However, one disadvantage of the above is that you won't be able to serialize an array of uuid with a single call, which is going to be a significant performance hit provided that anyone needs to do that, which I'm not sure about.

pdimov commented 7 months ago

One example of the above optimization that comes to mind is N3980 which defines a function hash_append that feeds a byte representation of the object being hashed to a hash algorithm. In uuid's case, it can directly feed the entire object to hash_append, and if it's marked as (or inferred to be) "contiguously hashable", an array of uuid will automatically activate the optimization and be processed in a single hash_append call.

Lastique commented 7 months ago

I do not think from_octets(octets_type const&) is a good interface. This method should accept a pair of pointers/iterators or iterator_range instead of an array so that the user doesn't have to do the extra copy when he deserializes an UUID from a larger stream of bytes. And the from_* methods should be explicit constructors instead, I think. You normally add these named factory methods when there are multiple constructors that are ambiguous, which is not the case here.

I also don't think constructing or producing 64/128-bit integers is good, as an UUID is not an integer or pair of integers. I think, such operations would be too confusing. (Also, std::pair is a poor choice anyway, as it necessitates <utility> and provides poor first/second names for its members.)

pdimov commented 7 months ago

The ability to construct a thing from other things does not imply that the first thing is the same as the other things. For example, you propose explicit uuid( uint8_t const* first, uint8_t const* last ) (which I don't quite like because UB when last-first != 16), but uuid is not a pair of pointers.

first/second is not an issue because the intended use is auto [high, low] = u.as_uint64();.

pdimov commented 7 months ago

Interestingly, that's probably the reason for my intuitive preference for from_ functions over constructors - in addition to being symmetric with the as_ functions, they also make it clear that uuid is not one of those things, it can just be created from them.

BurningEnlightenment commented 7 months ago

Another thought: There should be an explicit warning about std::bit_cast if the internal representation diverges from the canonical one.

(which I don't quite like because UB when last-first != 16), but uuid is not a pair of pointers.

C++20 users would probably appreciate an explicit uuid(std::span<std::uint8_t, 16> octets) regardless of the overloads available for earlier language revisions.

Lastique commented 7 months ago

The ability to construct a thing from other things does not imply that the first thing is the same as the other things. For example, you propose explicit uuid( uint8_t const* first, uint8_t const* last ), but uuid is not a pair of pointers.

UUID is a sequence of bytes, and the constructor would reflect this pretty well. It's the next best thing we have after std::span and iterator_range.

first/second is not an issue because the intended use is auto [high, low] = u.as_uint64();.

This use case will also work if you return a struct { uint64_t high, low; }; but it will provide better names if one wants to use the struct directly and removes the need for <utility>.

C++20 users would probably appreciate an explicit uuid(std::span<std::uint8_t, 16> octets)

Yes, an overload with std::span would be desirable. But the same effect should be achievable with older C++ versions, hence my request for a pair of iterators or iterator_range.

pdimov commented 7 months ago

As far as standard headers go, <utility> is very cheap and avoiding its inclusion is generally pointless. But we could define our own struct, of course.

boostorg / uuid

std::data() and std::span support #96