Open JordanMaples opened 5 years ago
gsl::narrow had the explicit design goal to be a "safe cast" either it succeeds without changing the value or it throws an exception (I.e. it had a wide contract). As such, throwing an exception is the right thing to do (similar to e.g. std::vector::at).
If you want to have a cast with a narrow contract (terminating when the input value can't be represented by the output type), narrow_cast would be the appropriate function (although I don't know if it currently uses GSL_REQUIRE in Microsoft's reference implementation).
If people weren't so afraid of regular libraries instead of header only ones, throwing an exception could be placed as an implementation detail into the cpp file and pulling in some heavy headers wouldn't be a problem. .
Just as very few developers use std::vector::at
, so they should generally avoid gsl::narrow
. The reason is that we want to stop confusing bugs with exceptional circumstances. It's very hard to imagine a situation where a narrowing conversion which loses information is not a bug.
In the terminology used in section 4.3.1 of P0709R3, an out-of-range error is likely to be a "Programming bug" and not a "Recoverable error".
Edit: I should have said: "It's very hard to imagine a situation where a narrowing conversion which overflows is not a bug." A narrowing conversion which results only in underflow is more often not a bug.
If you want to have a cast with a narrow contract (terminating when the input value can't be represented by the output type), narrow_cast would be the appropriate function (although I don't know if it currently uses GSL_REQUIRE in Microsoft's reference implementation).
Currently narrow_cast<>()
is explicitly specified to tolerate information loss:
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#note-204
double d = 7.9;
... The guidelines support library offers a
narrow_cast
operation for specifying that narrowing is acceptable and anarrow
("narrow if") that throws an exception if a narrowing would throw away information:i = narrow_cast<int>(d); // OK (you asked for it): narrowing: i becomes 7 i = narrow<int>(d); // OK: throws narrowing_error
This is also what Microsoft's GSL implements.
Perhaps it would make sense to add a third narrowing option that is specified to fail fast, tentatively named narrow_or_fail<>()
?
Please can we just remove reference to narrow
and narrow_cast
from the CCG? They are confusing and send the wrong message:
narrow_cast<int>(7.9)
and narrow_cast<int8_t>(200)
simply aren't in the same category of information loss, andSo if we get rid of narrow<>()
and narrow_cast<>()
(by removing them from the Core Guidelines and deprecating them in the GSL implementations), what do we replace them with?
I see the following desires behind these casts:
U* first = ...;
U* last = ...;
auto v = std::vector<T>(narrow_cast<std::size_t>(last - first));
narrow_cast<Y>(x) == x
, as has been discussed in #1206.Orthogonal to that, there are different types of numeric casts to be considered:
double
to float
almost always loses precision.cast<int>(1.2)
), or when an integer is not representable (cast<float>(123456789)
)?And I'm confident people more familiar with floating-point types could add more cases.
I'm not sure if the Core Guidelines should cover casts involving floats at all. Perhaps it's best to focus on integer casts? (Edit: much of floating-point casting is already handled by std::round()
and std::lround()
.) In that case, how about the following:
integral_cast<T>()
that is specified to fast-fail on information loss (addressing desire 3). (In practice, the cast would use Expects()
to guard against loss of information, so checks can be disabled or turned into __assume()
statements if desired.)try_integral_cast<T>()
which either returns optional<T>
or a struct
like to_chars_result
(addressing desire 2).integral_cast_or_throw<T>()
so as not to encourage undue use of exceptions. Users can build their own integral_cast_or_throw<>()
with try_integral_cast<>()
if they must.unchecked_integral_cast<>()
to satisfy desire 1, but admittedly I'd rather use a function-style cast in such cases.Regarding bugs vs recoverable errors: If my software has a bug that I don't know about and my end user encounters, how do I debug it? If narrow_cast throws an exception I can log an error or generate a crash dump or beg forgiveness. If it calls std::terminate my options are a lot more limited. In either case it's a bug and in either case my application crashes in an unpleasant way but with an exception I may not be able to recover but at least I may be able to respond.
Regarding casting between floating point types: I don't see any reason to treat loss of precision as different from sort-of-overflow. Every real value maps to a bit pattern in a float and (I think) every bit pattern maps to an infinite number of real values (excluding the NaNs but including the infinities). I don't see a fundamental difference between rounding 1.300000000000000044408920985 to 1.2999999523162841796875f and rounding 3e+58 to inf. In both cases the operation maps an infinite set of numbers to a different infinite set of numbers and in both cases the operation is irreversible.
Regarding bugs vs recoverable errors: [...]
This is a much more general issue. The Core Guidelines currently specify that the program be terminated if preconditions expressed with Expects()
are violated (cf. https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#gslassert-assertions); however, all implementations of the GSL support a configuration switch that makes Expects()
throw exception instead.
As far as narrowing casts are concerned, perhaps we can simply agree that casting a value that cannot be represented in the destination type is a precondition violation which can be enforced with the contract-emulating Expects()
macro.
Editors' call: Our view is that narrow
is like to_integer(string)
-- if the input contains an invalid value (e.g., to_integer("hello")
) then we should throw a run-time error. This means that narrow
should not use an Expects
, it is not a precondition violation.
Also, GSL.assert currently requires Expects
to terminate the program if the condition is false. This is intended to mean std::terminate
which permits a terminate_handler
to do things like log some final information before restarting the process to try again. There is nothing in the Guidelines today about Expects
throwing on contract violations.
@mbeutel +1 for integral_cast
. Not sure about the name but that behaviour is highly valuable.
Editors' call: The original question has been answered, but there is now a new question about integer casts and whether to make a special form of narrow/narrow_cast for them. Assigned to Bjarne and updated title to make the current topic more visible.
Why would we treat integral narrowing separately (differently?) from pointer and floating narrowing?
I think narrowing from, say, double
to float
and suffering failure on loss of precision is an extremely, um, narrow use case; I'd be hard pressed to come up with an example where this is desired.
However, narrow<int>(myfloat)
is arguably useful. So perhaps we shouldn't treat integers and floats differently?
Edit: I think the reason why mixing up integers and floats here feels odd is that they "simply aren't in the same category of information loss" as @johnmcfarlane puts it. Losing information when narrowing an integer simply constitutes an out-of-range error. But narrowing floats usually incurs loss of precision; and I'm not even sure whether exceeding the representable range of the destination type can be considered an out-of-range error if the target type can represent ∞.
Edit: Corrected underflow vs overflow nomenclature. Thanks johnmcfarlane
with regard to "the same category of information loss" that seems like a somewhat arbitrary categorization to me.
The current meaning of "narrow" for these checks seems to be that logically the value is precisely representable in both types and practically the value can round trip between the two types.
The proposed distinction is that there's a fundamental difference between the value in question falling between two representable values as in narrow_cast<int>(7.9)
versus the value falling outside the range of all representable values as in narrow_cast<int8_t>(200)
. Could the same argument not be made that there is a difference between positive and negative overflow? I suspect everyone would agree that there's no practical value in having distinct cast_without_positive_overflow<char>(int)
and cast_without_negative_overflow<char>(int)
but it's not obvious to me why this division is less relevant that the proposed one. You could have orthogonal cast_without_positive_or_negative_overflow
and cast_without_loss_of_precision
but what's the use case in which this division is useful?
As far as I can see float->float conversions cannot overflow (because of ∞), integer->integer conversions cannot lose precision, integer->float conversions cannot overflow but float->integer can lose precision or overflow. Consequently you'd need a third option cast_without_loss_of_precision_or_positive_or_negative_overflow
. This is what narrow_cast/narrow does now. Why separate out cast_without_positive_or_negative_overflow as a special subset of cast_without_loss_of_precision_or_positive_or_negative_overflow but not separate cast_without_loss_of_precision?
(Nitpick: loss of precision is underflow. You are perhaps using underflow to mean negative overflow and overflow to mean positive overflow.)
The categorization of narrowing into potentially overflow-inducing versus potentially underflow-inducing is far from arbitrary. The former always caries a risk of major error. The latter is often entirely acceptable.
And yes, a facility which casts from floating-point to integer types (and maybe vice versa) in which overflow is a contract violation (i.e. could be treated as UB or terminate as per GSL's Expects
configuration) would be just as desirable as a similar cast between integer types. So -1 for integral_cast
and +1 for cast_without_overflow
but with a different name, e.g. cast_in_range
.
As for providing a cast which is concerned only with underflow, I see no problem with that but it's unlikely to see as much use.
Good point. I agree that loss of precision is often fine while overflow is usually not so the distinction is meaningful.
To separate out these concerns it seems like a complete set would be:
narrow_cast
(no failures)narrow_cast_fail_on_overflow
(i.e. integral_cast
i.e. cast_without_overflow
)narrow_cast_fail_on_precision_loss
(or maybe you wouldn't bother with this one)narrow_cast_fail_on_overflow_or_precision_loss
(i.e. narrow
)converting integers to integers there's no precision loss so 1 and 3 are equivalent and 2 and 4 are equivalent (i.e. all you need is narrow_cast and narrow). converting integers to floats 1 and 2 are equivalent and 3 and 4 are equivalent (i.e. all you need is narrow_cast and narrow). converting floats to floats 1 and 2 are equivalent and 3 and 4 are equivalent (i.e. all you need is narrow_cast and narrow).
For converting floats to integers all 4 are different. Let's agree that the use case for 3 is pretty rare and ignore it. 2 is a method that truncates the value and fails if it's out of range. A good name for that is probably truncate (though it's contract is not that of std::trunc). This has the added benefit of being explicit about the direction of rounding and allows for alternatives like ceil/round. I have a suspicion (I've done only cursory testing) that this method would be equivalent to
gsl::narrow<someint>(std::trunc(somefloat))
If I'm correct about this then I think truncating the input before calling narrow would be a better way to express that you don't care about precision loss. This would render 2 unnecessary which would again mean you just need narrow_cast and narrow.
I'd recommend no mention of truncation or, indeed, any rounding mode. Truncation may be what occurs but not necessarily. The purpose of this API is not to control rounding but rather to express the intent to convert in a way which may lose precision.
The intent of this API has always been to help tools to warn users of potentially surprising precision loss with fewer false positives. The intent is not to control or specify how rounding occurs.
Just to make sure my meaning is clear, I'm not suggesting defining narrow_cast_fail_on_precision_loss
as gsl::narrow<someint>(std::trunc(somefloat))
. I'm asserting that a user of gsl::narrow
that cares about overflow error but does not care about precision loss could simply alter the input to gsl::narrow
with something like trunc (or the numerous alternatives that convert a float to a nearby integer value) which removes the precision loss before gsl::narrow
starts checking. narrow is not controlling rounding at all in this scenario.
float f = 127.1;
char a = gsl::narrow_cast<char>(f); //I don't want to check for loss of precision or overflow
char b = gsl::narrow<char>(f); // I want to check for loss of precision and overflow
char c = gsl::narrow<char>(std::trunc(f)); // I want to check for overflow but not loss of precision. I've decided that in this case the loss of precision should be truncation.
Advantages:
integral_cast<char>(127.1)
should be considered an overflow or not. 127.1 is larger than the largest representable value in the type (so should overflow) but after precision loss it fits ok (so shouldn't overflow).Disadvantages:
(Side note: https://en.cppreference.com/w/cpp/numeric/fenv/FE_round states "The current rounding mode does NOT affect the following: floating-point to integer implicit conversion and casts (always towards zero)" so if that's correct then it seems safe to assume narrow_cast from float to int will truncate but that's mostly tangential to what I'm trying to say).
Consider whether throwing is the correct way to get information about narrowing failure. See GSL PR: microsoft/GSL#761
For example: we are concerned about whether or not to pull header iostreams for more rich exceptions. The example information in that PR looks useful but if the narrow_cast did a fail fast the programmer would have the information w/o adding it in string form to the exception.