Closed danakj closed 9 months ago
We already have implicit constructors for these types then the incoming type will not lose any data, so the as<T>()
cast would be strictly for cases where data is lost. We also have T::from_unchecked(unsafe_fn, x)
to do this without checking that truncation is happening.
So I think maybe the choice to be made here is if that should be considered an unsafe operation, or it's fine to lose data as long as it's explicit. Since no UB occurs, this probably means it's "dangerous" from a "your software is likely to have bugs" POV but not "unsafe" from a "the compiler will ruin your life and you will struggle to debug this in production" POV.
In this case, how to go from i64 x
to i32 y
:
i32 y = i32::from_unchecked(x); // Unchecked usually comes with `unsafe_fn`...
i32 y = x.as<i32>();
i32 y = i32::from_lossy(x);
i32 y = x.cast<i32>();
noting that we already have Choice::as<Tag>()
to get the active Tag
member out of Choice.
An important sharp edge that's left unanswered right now is conversion between floats and integers.
We have no i32::from(f32) or similar, no i32::from_unchecked(f32), no f32::from(i32), etc.
We should look at what Rust defines for these conversions, because at best it's messy and there's tradeoffs to be made.
f32 implements From<u8>
and From<u16>
and From<i8>
and From<i16>
as these conversions are lossless.
https://doc.rust-lang.org/std/primitive.f32.html#impl-From%3Ci16%3E-for-f32
https://doc.rust-lang.org/std/primitive.f32.html#impl-From%3Cu16%3E-for-f32
https://doc.rust-lang.org/std/primitive.f32.html#impl-From%3Ci8%3E-for-f32
https://doc.rust-lang.org/std/primitive.f32.html#impl-From%3Cu8%3E-for-f32
And it supports as
which allows lossy truncation-casting between primitive types.
https://doc.rust-lang.org/stable/reference/expressions/operator-expr.html#type-cast-expressions
Casting from integers to floats:
Casting from an integer to float will produce the closest possible float *
if necessary, rounding is according to roundTiesToEven mode ***
on overflow, infinity (of the same sign as the input) is produced
if integer-to-float casts with this rounding mode and overflow behavior are not supported natively by the hardware, these casts will likely be slower than expected.
*** as defined in IEEE 754-2008 §4.3.1: pick the nearest floating point number, preferring the one with an even least significant digit if exactly halfway between two floating point numbers.
Converting floats to integers:
Casting from a float to an integer will round the float towards zero
- NaN will return 0
- Values larger than the maximum integer value, including INFINITY, will saturate to the maximum value of the integer type.
- Values smaller than the minimum integer value, including NEG_INFINITY, will saturate to the minimum value of the integer type.
C++ does not specify what happens when you convert from a primitive float to integer that can't be represented.
Currently the integer types are From
my_i64 as u8
achieves in Rust.In Rust integers, From is only implemented when it can be done losslessly and without panic. We have the ctors for that already because it is important for conversion from primitives and literals, since they won't be type deduced to our numeric types.
Regarding integers, we should be informed by the PUN effort in chromium. When types are currently specified incorrectly, due to implicit conversions of primitive types, the fixes involved explicit casting (bit casting/truncation) but it was easy to do it wrong and produce bugs, while also being hard to convince people to just use checked_cast when u want to preserve the value and not the bit pattern.
Another thought is how might rust have done this if they did not add the as keyword. Through a trait but they dont use From/Into for truncating conversions. Maybe a BitFrom trait (avoiding the type deduction of BitInto?) would be appropriate and possibly useful in other contexts too which makes it sound nice.
u8::bit_from(-40_i64) // preserves bits u8::try_from(-40_i64) // preserves value
Or maybe AsBits
FromBits could be the receiving side of AsBits then still with from_bits(x)
FromBits is requires to support casting from primitives, as they can't be extended with AsBits.
Then, if we have FromBits, we could provide sus::as_bits
Ah, but if we want to cast back to primitives too, then we need sus::as_bits<unsigned long>(an_i32)
or some such with an extension point to provide as_bits for things that are not themselves FromBits. This is a similar point to what I've been considering a while with Into, if into<Y>(x)
should support things beyond From<decltype(x), Y>
.
Actually, if we take it all outside of the class then one concept is sufficient for both directions. It could plausibly be named FromBits or AsBits, though I think the latter reads a bit more correctly with the template parameter. Either way the type being constructed is the type parameter.
// With from_bits.
unsigned int a = 2;
auto b = sus::from_bits<i16>(a);
// With as_bits.
unsigned int a = 2;
auto b = sus::as_bits<i16>(a);
The concept would be satisfied by a template specialization, like with sus::ops::Try
and TryImpl
, but with an as_bits() function more like into() but (intentionally) without type deduction:
template <class To, class From>
struct AsBitsImpl;
template <class To, class From>
concept AsBits = requires (const From& from) {
{ AsBitsImpl<To>::from_bits(from) } -> std::same_as<To>;
};
template <class To, class From>
To as_bits(const From& from) { return AsBitsImpl<To>::from_bits(from); }
Which allows restricting on types that be bit-converted into X:
i32 add_10(AsBits<i32> auto x) {
return sus::as_bits<i32>(x) + 10;
}
Staring at naming some more and wanting to follow good guidelines (https://rust-lang.github.io/api-guidelines/naming.html) for consistency, AsBits should be ToBits, and it should be sus::to_bits
Instead of a strict bit preservation between int and float, it would be better to follow the rules of https://github.com/chromium/subspace/issues/221#issuecomment-1529201960 assuming they are ~compatible with static_cast but with less UB.
Have defined float->integer casting to match them, with well defined behaviour for NAN and OOB values. For in-bound values it has the same defined behaviour as static_cast did.
integer->float looks.. harder =) for another day.
Also.. if it's not strict bit preservation then I want a better name than ToBits probably. But to_bits is nice and short, something that represents the lossyness of it all.
Current best idea: ToNear<T, F>
and sus::to_near<T>(x);
.
It's super non-trivial to match rust's behaviour in conversions from int to float. C++ leaves it as implementation defined behaviour for which direction to round toward a representable floating point value, while Rust follows IEEE specs better and has a defined strategy of roundTiesToEven
.
But also it's probably problematic to do something other than what static_cast does anyway because a) it would be slower and b) it would make converting code from static_cast to something safer (no UB) suddenly get risky.
So to_bits() is now roughly defined as doing a static_cast but with defined behaviour for edge cases instead of UB.
This is all done now in https://github.com/chromium/subspace/pull/289 except we need a better name than ToBits
and sus::to_bits<T>(x)
since it's a general cast/conversion not a bitwise-cast.
Current candidates: | Concept | Function | Notes |
---|---|---|---|
ToNear |
to_near<T>(x) |
More accurate for float than int conversions | |
ToLossy |
to_lossy<T>(x) |
Strong code review smell. Too strong? as isn't so uncommon in Rust |
|
ToIsh |
to_ish<T>(x) |
Bad use of capital i which looks like lowercase L? But good memeing | |
CoerceTo |
coerce_to<T>(x) |
does it imply lossiness? | |
Coerce |
coerce<T>(x) |
shorter than coerce_to<T>(x) but less clear? |
|
ToApprox |
to_approx<T>(x) |
Wordier version of "near"? | |
Transmogrify |
mog<T>(x) |
mog is not so discoverable, but good Calvin & Hobbes vibes |
Not listed above is sus::cast<T>()
which I think is what we should name what is currently sus::mog<T>()
. It's going to ease the learning curve a lot to use the same name.
Other conversions that we're missing:
https://doc.rust-lang.org/reference/expressions/operator-expr.html#type-cast-expressions
The
as
keyword in Rust will convert between types without panicing, and can introduce truncation.We could provide
3_i32.as<u32>()
orsus::as<u32>(3_i32)
(the latter feels weird).This is a lot like static_cast in C++ which is also a notorious cause of bugs... do we want to provide this in place of
i32::from_unchecked(unsafe_fn, x)
which already does a static_cast?