Clarification requested for Expects/Ensures contract violation behavior

isocpp / CppCoreGuidelines

The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++

http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines

Other

42.94k stars 5.44k forks source link

Clarification requested for Expects/Ensures contract violation behavior #1561

Open JordanMaples opened 4 years ago

JordanMaples commented 4 years ago

There have been discussions recently in #1512 and microsoft/GSL#831 regarding contract violation behavior. Would the editors mind clarifying the desired behavior for contract violation and updating GSL.assert as necessary?

The confusion stems from a potential contradiction between an Editor's call in issue #1512 and a note in the GSL.Assert section for Expects.

@hsutter's post from #1512:

Editors' call: GSL.assert currently requires Expects to terminate the program if the condition is false. This is intended to mean std::terminate which permits a terminate_handler to do things like log some final information before restarting the process to try again. There is nothing in the Guidelines today about Expects throwing on contract violations. Clarification: Expects should terminate, not abort, just to be clear.

Note from the Expects section of GSL.Assert:

// Expect in under control of some options (enforcement, error message, alternatives to terminate)

Microsoft's GSL implementation removed the alternative violation behaviors (throwing & unenforced) with PR microsoft/GSL#831. This change also had the side benefit of preventing potential ODR violations from misconfiguring contract violation behaviors.

hsutter commented 4 years ago

Editors call: We're in a halfway state where we had hoped Standard C++ would add contracts but the standard has not yet done so. We definitely do want gsl::span bounds check to unconditionally be enabled and unconditionally fail-fast on violation.

However, I.6 also recommends Expects for general use, some of which may want a continuation mode, we welcome example cases of such uses on this thread. If we need to support both we may need to have an Expects for general use that can be configured to continue, but also an Expects_always_fail_fast for gsl::span to use.

mbeutel commented 4 years ago

However, I.6 also recommends Expects for general use, some of which may want a continuation mode, we welcome example cases of such uses on this thread.

(Edit: you asked about a continuation mode, and I realize that throwing isn't continuation, but the throwing mode is perhaps the most prominent example of an alternative violation behavior, so I hope discussing it here is also appropriate.)

Let me first state the obvious: GSL_THROW_ON_CONTRACT_VIOLATION was very useful for testing precondition checks. Getting rid of this option forced MS-GSL to abandon new hotness Catch2 for old and busted GoogleTest because the former doesn't support death tests.

I also use throw-on-contract-violation when I write Python bindings for my functions for interactive use. A REPL environment blurs the distinction between programmer and user that is used to motivate fail-fast on precondition violation; having the REPL process terminate when passing wrong arguments is rather inconvenient.

If we need to support both we may need to have an Expects for general use that can be configured to continue, but also an Expects_always_fail_fast for gsl::span to use.

Giving gsl::span<> special treatment feels a little awkward. But perhaps the reasoning behind that can be generalized.

Hypothesis: I may want to use throw-on-contract-violation for the precondition checks in my own code, but I never want throw-on-contract-violation for 3rd-party code.

If we find that the hypothesis holds, perhaps it makes more sense to have Expects()/Ensures() recognize scopes:

#include <gsl/gsl_assert>

namespace my_lib {

#define GSL_CONTRACT_SCOPE MY_LIB

inline void my_func(int i)
{
    Expects(i >= 0); // can be configured to throw by defining `MY_LIB_THROW_ON_CONTRACT_VIOLATION`
    ...
}

#undef GSL_CONTRACT_SCOPE

} // namespace my_lib

(Cf. https://github.com/gsl-lite/gsl-lite/issues/213 which proposes scoped contract checks for gsl-lite.)

reddwarf69 commented 4 years ago

We definitely do want gsl::span bounds check to unconditionally be enabled and unconditionally fail-fast on violation.

Giving gsl::span<> special treatment feels a little awkward.

libstdc++'s std::span bounds check is not unconditionally enabled. It's actually disabled by default and only enabled if _GLIBCXX_ASSERTIONS is defined. It unconditionally fail-fast on violation, though. https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/std/span;h=f658adb04cf04259035cc95fe17ffbcdea0fbf45;hb=HEAD#l284

libc++'s std::span bounds check is not unconditionally enabled. It's actually disabled by default and only enabled if _LIBCPP_DEBUG is defined. Until 8.0.0 it had _LIBCPP_DEBUG_USE_EXCEPTIONS to let it throw exceptions on contract violation, but that doesn't exist any more. A non returning custom handler can still be set via the std::__libcpp_debug_function pointer (https://github.com/llvm/llvm-project/blob/master/libcxx/include/span#L304).

The Microsoft STL std::span enforcement seems to be configurable via _CONTAINER_DEBUG_LEVEL (https://github.com/microsoft/STL/blob/master/stl/inc/span#L604)

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0122r7.pdf says: "All accesses to the data encapsulated by a span are conceptually range-checked to ensure they remain within the bounds of the span. What actually happens as the result of a failure to meet span’s bounds-safety constraints at runtime is undefined behavior. However, it should be considered effectively fatal to a program’s ability to continue reliable execution. This is a critical aspect of span’s design, and allows users to rely on the guarantee that as long as a sequence is accessed via a correctly initialized span, then its bounds cannot be overrun.

As an example, in the current reference implementation, violating a range-check results by default in a call to terminate() but can also be configured via build-time mechanisms to continue execution (albeit with undefined behavior from that point on)."

So I agree giving gsl::span a special treatment is strange.

I don't have a strong argument for throwing on a precondition failure. Arguably Catch2 should simply support death tests, and AFAIK the author wants to add that support (from 2015: https://github.com/catchorg/Catch2/issues/553#issuecomment-164373144).

But I don't think the specification should mandate what GSL implementations do regarding enforcement. a) Leaving it unspecified is free, has no cost to anybody. b) The performance difference doesn't make a difference to me. But it may actually make a difference to somebody who may want the checks enabled when running tests, not in production. c) Without the option to disable the checks I'm going to continuously get resistance during reviews to any use of Expects because of potential performance issues. The end result is that some functions will lose the preconditions and problems that would have been catch during tests will now reach production.

pgroke-dt commented 4 years ago

We're developing a tool that among other things does APM (application performance monitoring). Which means it's running inside arbitrary customer processes. One thing we cannot do there is terminate() - we must not bring the customer's application down. So if GSL unconditionally uses terminate(), then we can't use it (or at least we'd have to use a modified version).

I think that defensive programming should be avoided as much as possible, but I'm afraid in our case it's not possible. Having some missing or corrupted traces is preferable by far to bringing down the application. Of course letting the execution continue beyond the failed assertion is also not a good option because it would mean risking corrupting arbitrary memory. So throwing was the best compromise for us - for release builds. There's no problem with terminating in debug builds.

BTW this is not only a problem with span. There's the same problem with stuff like not_null or any contract checks in general. We currently use our own contract checking macros which throw on violation, and so far this has worked well for us. Though we do have a huge number of unit tests, getting really close to 100% is almost impossible for a large projects that also get's changed often. So the argument that any contract violations should be found and fixed during development/testing and therefor terminating is OK doesn't really work for us.

ps: I think it's implied by what I wrote, but just to clear: having a configurable Expects would not help us as long as there's still an Expects_always_fail_fast that's used by span.

MikeGitb commented 4 years ago

@pgroke-dt: How do you actually verify that your tool doesn't bring down the process in some other way (e.g. by dereferencing a nullptr or letting an uncaught exception buble up to a noexcept function)? I assume by diligent testing, code reviews and static analysis tools. Why can't those methods not be used to ensure that there are no contract violations that could trigger the terminate? And why are you assuming, that a bug in your code (e.g. a out of bounds access) will only corrupt your traces and not the production data of the host application?

I mean, if you can't use the gsl then your can't use it - its not like there aren't other implementations out there or that it would pe particularly hard to implement the tools yourself. But an explicit call to std::terminate in a contract check is really not the only way a bug in your application could terminate a process or cause much worse damage.

MikeGitb commented 4 years ago

Personally I don't like that the question of how Ensures and Expects work gets always mingled with the behavior of gsl::span.

If it is a desired and specified property of gsl::span that it will not allow out of bounds access (i.e. gsl::span is supposed to be a "safe" alternative to std::span), then giving an invalid index to span::operator[] is not UB or out of contract any more than it is for std::vector::at() and hence should not be checked via contract checking mechanisms in the first place. Instead it should use a plain if(invalid(idx)){/* do something*/ }.

pgroke-dt commented 4 years ago

@MikeGitb

I assume by diligent testing, code reviews and static analysis tools.

Yes.

Why can't those methods not be used to ensure contract violations that would trigger the terminate? And why are you assuming, that a bug in your code (e.g. a out of bounds access) will only corrupt your traces and not the production data of the host application?

You usually have far more places where something could throw than "noexcept" boundaries. At least that's the case in our code. So making sure no exception propagates to a "noexcept" boundary is a lot easier than making sure every single line of code is 100% safe regarding contract violations. Especially error handling/corner cases are hard to cover in unit-tests. And while static analysis tools are good, they all produce false negatives and some of the unfortunately also produce a lot of false positives. Both can lead to mistakes.

When it comes to (not) dereferencing null pointers, we use our own functions/macros for checking those. Not everywhere, but where it makes sense. And as I already wrote, those throw exceptions.

Re. corrupting the production data of the host application: if we were to simply let execution continue after a failed assert, that would potentially be a big problem. We don't do that, we throw instead.

I also want to say I see less of a problem with span than with something like not_null -- especially the not_null version that has an implicit conversion ctor. The problem I see there is that developers will misunderstand it and actually feel kind-of "safe" when seeing the not_null opposed to seeing it as a red flag. Plus of course you don't see it at the call site at all. Having the ability to implicitly construct a not_null and have that implicit conversion lead to a std::terimnate is something that scares me. A lot.

shaneasd commented 4 years ago

I use not_null for function signatures to indicate the contract of the function (this parameter must not be null, the return value will not be null). However somewhere along the line I'll have a technically potentially null pointer that I'm pretty sure isn't null that I have to convert to not_null (implicitly or explicitly). I do so with the compiler flag set that makes it not throw. This is because my error reporting mechanism, which is intended to give useful diagnostics when the application terminates unexpectedly, relies on catching exceptions at the top level. If I dereference null I get a useful diagnostic. If I convert null to not_null I get a slightly more readable diagnostic. You could debate whether I should be relying on an exception from null deference since it's technically undefined behavior so technically it may not do it consistently but practically it works for my compiler right now.

Assuming that I am unable to overhaul my error reporting mechanism (though I think this is probably the ideal solution, pragmatism aside) and assuming that I want to continue to get useful diagnostics (I do), when I dereference null then if gsl doesn't support not_null throwing then it doesn't support my codebase.

As such I'm in an awkward situation where I'm either stuck on this version of gsl forever or (as @MikeGitb suggests) I have to write my own version of gsl::not_null. In the worst case this might mean writing my own version of gsl if I want to avoid naming conflicts (I could call mine something different but I may want to use tooling that relies on the name). This second option would violate http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#p13-use-support-libraries-as-appropriate in my opinion. Technically it doesn't because technically the library doesn't do what I need but I think the microsoft gsl implementation (and any other distributed gsl implementation) should be designed to be the one true gsl because having 10 libraries that all do essentially the same thing is bad as is having 1000 private implementations.

Naturally you have to draw the line somewhere but it feels like removing the ability to throw from not_null is too draconian. Putting aside my own anecdote, two pieces of evidence I raise are:

The microsoft gsl originally allowed throwing from not_null (I think it's been removed now). This suggests that someone working on gsl thought it was a good idea. I would take that to indicate that probably several people agreed and are now using it.
The microsoft gsl had to completely replace its unit testing framework in order to remove support for throwing from not_null. I think this is probably quite representative of the situation faced by many codebases with a change like this. The microsoft gsl is fairly small and fairly new. If even it is forced to change (beyond the obvious) to support this change then to me that's evidence that a lot of code bases will.

MikeGitb commented 4 years ago

I also want to say I see less of a problem with span than with something like not_null -- especially the not_null version that has an implicit conversion ctor.

I wholehartedly agree that that implicit conversion is problem. I think the original Idea was to have a static analysis tool that would stop you whenever it could not prove that gsl::not_null was constructed from a non-null ptr, in which vase you'd be forced to put a manual check in front of it.

foo(gsl::not_null<Bar*>)

foo(new Bar()); // Fine, new never returns nullptr
foo(get_potential_nullptr()); // static analysiss tool throws an error

auto* t = get_potential_nullptr();
if( t == nullptr) throw
foo(t); // fine

However, in practice thats probably rarely how this is being used.

pgroke-dt commented 4 years ago

@MikeGitb I don't know what the original plan was. I only saw a discussion about making the ctor explicit - and the decision was to keep the implicit not_null and add another class with an explicit ctor. IIRC the main reason was to allow people who already used the implicit ctor to update without having to change a zillion lines of code.

With our product it's similar: We're already using GSL and would like to update. Mostly because of the now-unsigned size type in span, because we had to add a lot of casts to silence compiler warnings with the old signed index type. Only now we can't, because of the terminate thing. Of course we can use our own fork/re-implementation, but ... well that's also not ideal. Especially if analyzers start to understand GSL constructs it would be preferable to use the "original" GSL.

We currently have GSL configured to throw on contract violation. Which is kind-of OK with the implicit not_null. But updating to 3.0.0 ... well, the unconditional terminate is really too scary to me.

johnmcfarlane commented 4 years ago

I am currently stuck on GSL 2.x because 3.0 violates the zero-overhead principle and does not cater for high-performance applications. A contract violation should not happen in bug free code. Yet the user is expected to pay for bugs they don't write with a branch on every contract check.

I've measured the difference and GSL 3.0 introduces a noticeable code bloat as these asserts can no longer be compiled out with the GSL_UNENFORCED_ON_CONTRACT_VIOLATION flag. Basically, GSL has lost its NDEBUG.

@shaneasd I would consider a fork of GSL which reintroduces GSL_UNENFORCED_ON_CONTRACT_VIOLATION (and GSL_THROW_ON_CONTRACT_VIOLATION if you really think that's a good idea, which I personally do not) for users who want to use GSL in conjunction with Clang-Tidy and the CCG but who don't want their performance to suffer as a result.

shaneasd commented 4 years ago

@johnmcfarlane I think we may be arguing slightly different things. I can definitely see a use case for a contract check which triggers in debug but not in release (or whatever other @defined dictates which behavior you want). I think bug free code is probably extremely rare so I don't think it should be the only use case catered to but I could see that in high performance code it's an assumption you have to make to maximize performance.

When you say the user are you referring to the end user of the software or the user of a library? I guess in either case their ideal would be that the software works without any bugs and thus requires no contract checks but the code I work with sadly does not achieve this (using cpp core guidelines and gsl is one way to try and stop it getting worse). As such for my use case error handling is an important feature.

With this in mind it seems like there are at least three kinds of users:

those that want a build with no contract checks at all because they don't want to pay the cost (@johnmcfarlane , if I understand correctly)
those that want a build which terminates on contract violation (thereby avoiding undesirable properties of exceptions)
those that have to use exceptions because their existing code (e.g. unit tests, error handing) assumes exceptions (Me and @pgroke-dt if I understand correctly).

Naturally a fork is an option but I don't need to support high performance applications in my fork and @johnmcfarlane doesn't need to support exceptions in his. So either we each go our separate ways which is a bad option or we create an alternative and "better" gsl which also seems like a bad option to me.

johnmcfarlane commented 4 years ago

@shaneasd

When you say the user are you referring to the end user of the software or the user of a library?

The library user.

I guess in either case their ideal would be that the software works without any bugs and thus requires no contract checks but the code I work with sadly does not achieve this (using cpp core guidelines and gsl is one way to try and stop it getting worse). As such for my use case error handling is an important feature.

(By error handling, I assume you mean bug catching here.)

I would caution against drawing a sharp distinction between 1) disabling checks in bug-free code and 2) enabling checks in buggy code. All code has bugs, none are welcome and trapping helps everybody eliminate them. But enabling contract checks doesn't somehow trap all the bugs.

And the run-time cost is noticeable. Here's an example of the efficiency hit you see moving from 2.1 to 3.0. The example's contrived, but it's representative of actual penalties you can expect to see in code using GSL. I saw my modest program grow by a KB or two when I switched up.

To be clear: I don't want to stop anyone from choosing to pay this cost.

Naturally a fork is an option but I don't need to support high performance applications in my fork and @johnmcfarlane doesn't need to support exceptions in his. So either we each go our separate ways which is a bad option or we create an alternative and "better" gsl which also seems like a bad option to me.

Given GSL 2.1 supported both our use cases, I don't see why separate forks would be necessary. And it's not something I want to do either. But you did suggest dropping GSL and I couldn't resist replying as I've been thinking the same thing recently! I would suggest to track GSL but with GSL_UNENFORCED_ON_CONTRACT_VIOLATION, GSL_THROW_ON_CONTRACT_VIOLATION and GSL_TERMINATE_ON_CONTRACT_VIOLATION (default) reintroduced.

shaneasd commented 4 years ago

Sounds like we're in agreement about the problem and possible solutions. Ideally I'd like the editors to state a definitive position before considering something so drastic as forking though. Currently I'm satisfied with the version I'm using (2.0.0 by the look of it) but eventually I'd want an upgrade path.

elshize commented 4 years ago

I don't have anything to bring to discussion regarding Expects/Ensures at the moment, but regarding span, wouldn't it make sense to have safe-by-default access but also have an "unsafe" part of API for performance reasons?

span[idx] // safe
span.unsafe_at(idx) // unsafe

On the other hand, one can just do the following for performance optimizations:

*(span.data() + idx)

Regarding throwing, is there a reason why std::span (or gsl::span for that matter) doesn't have a throwing at() member function just like std::vector does?

daviddetweiler commented 4 years ago

I agree with @elshize for clearly-marked "unsafe" members for span. Though we could all hope that compilers are smart enough to optimize-out inlined checks that are proveably unnecessary, the unfortunate reality is that code as simple as a pre-checked for loop is "too complicated," (at least for MSVC) for the optimizer to figure out the check won't ever trigger. Having the concept of being able to dip into unsafe code when needed would be a massive boon to high-performance cases, and a necessary workaround for when the optimizer makes bad decisions.

johnmcfarlane commented 4 years ago

@elshize @daviddetweiler a guiding design principle behind C++ is that no programmer should pay (in terms of run-time cost) for a feature that they do not use.

When used correctly, there is nothing unsafe about unchecked [] operation in gsl::span, std::vector or similar APIs (or even in C-style arrays). To make programmers pay for always-on error checking which will never be useful in their correct code violates the zero-cost principle. As for providing alternatives like at or unsafe_at, why should I have to write code that is harder to read and maintain just in case you write a bug?

Additionally, throwing an exception in response to a bug is just plain wrong. It's unfortunate that the GSL library runs with this poor design decision but lets not make the situation any worse or try to normalise it.

If you think there's a bug in your code which might result in out-of-bounds access, fix your code. If you're not sure, use static analysers, dynamic analysers, fuzzers and -- most importantly -- write tests. If you think that an index might be out of range in legitimate, but exceptional, circumstances, use gsl::at. If you think that an index might be out of range in legitimate, and common, circumstances, write the logic to handle that. If you think believe are bugs in your code but aren't prepared/able to fix them, perhaps leave instrumentation enabled (e.g. compile with a sanitizer enabled).

But please don't make others pay for the bugs they didn't write.

daviddetweiler commented 4 years ago

@johnmcfarlane As it turns out, my comment on MSVC was a bit out-of-date, and it's now smarter about it. gsl::at gets reliably optimized out for cases when the index is provably never out-of-range. My big gripe with gsl::span is, and continues to be, that it's impossible to sidestep poor optimization with it. When the optimizer works properly, the contract checks only get compiled in places where they're necessary (and this would be the exceptional circumstance example you gave). But of course, optimizers aren't always the brightest, and we should have the ability to bypass the checks, but I don't think it should be the default when the most common case is that the optimizer elides them in cases where they're provably false. It's not that I want to force people to write unsafe_* everywhere, it's that I think the option should be available and clearly visible in code that needs to beat the optimizer.

johnmcfarlane commented 4 years ago

@daviddetweiler firstly we should agree that we're focussing on bugs here: run-time conditions which absolutely cannot occur in correct code and for which the appropriate course of action is to fix the code.

Given that, a run-time check is a very useful developer tool for testing the correctness of the code. However, it's simply not possible for the optimiser to consistently remove such checks. Therefore they cannot be mandatory in software which observes the zero-overhead principle.

(and this would be the exceptional circumstance example you gave).

I'm referring to circumstances in which throwing of an exception would be appropriate. That excludes circumstances where a bug has been detected at run-time.

MikeGitb commented 4 years ago

The designers/authors/maintainers have decided that preventing out of bounds access should be a fundamental property of the gsl::span type. From Herb's first comment:

We definitely do want gsl::span bounds check to unconditionally be enabled and unconditionally fail-fast on violation.

And I think that is a perfectly valid design choice for a library that wants to make programming in c++ safe (r). If the design tradeoffs don't fit a particular usecase then it might be best to simply use a different type (e.g. std::span, gsllite span, home grown solution ...). Just as it is more efficient to use std::list in some scenarious and std::vector in others and neither is violating the zero overhead principle..

I said it before, my only real gripe about it is that the contract mechanism is used to implement (not just verify) part of the API specification. If gsl::span guarantees termination on out of bounds access, then it doesn't "Expect" or "Require" a valid index but instead has a wide contract that has defined behavior for any index, which is the very definition of not having a precondition.

That aside, add my +1 for unsafe_* accessors. They don't allow anything that wasn't possible before (*(data()+i)) but are easier to read and grep for and give the chance to insert a truly optional contract check.

elshize commented 4 years ago

I agree with @MikeGitb, it's clear to me that the goal is to have safe access by default.

As for providing alternatives like at or unsafe_at, why should I have to write code that is harder to read and maintain just in case you write a bug?

@johnmcfarlane From your comments, I figure it will be hard to convince you, but my view is that unsafe operations should be more verbose, one should be forced to do a little extra work and therefore be very mindful before doing something potentially unsafe. Mind you, this is all under assumption that gsl::span is designed to be safe. I also don't buy the argument to test and fuzz instead, or that "I don't make bugs" so why should I pay? Vulnerabilities in very well tested software are continuously discovered, and a big chunk of them are related to illegal memory access, including out of bounds writes. And in the end, as mentioned above, there are other implementations that are more suitable for unsafe, speed-optimized use, such as std::span.

@johnmcfarlane has a point that most of the random access operations by their nature won't be able to be optimized out, and that's the point. However, there are some obvious cases when you could do none or fewer checks, mostly for iterating. The most basic case is just iterating over the entire span:

for (auto elem: span) { ... }

This should be no more expensive than a regular array, since it will desugar to iterators.

If you need to only iterate over a subspan, then it should be done with:

for (auto elem: span.subspan(a, b)) { ... }

You pay only for one check, which makes sense because you can't be sure a and b are within the range. If you want to do it safely with an unsafe random accessor, you'd still have to check the bounds.

For more complicated access, when you know that you're in the bounds but don't do a sequential access, you should probably use the unsafe_at. Say, you have a permutation array and you want to access your elements in that order.

for (auto idx: permutation) {
    // safe because permutation elements are within the bounds
    debug_assert(idx < span.size());  // maybe even add assert?
    auto elem = span.unsafe_at(idx);
}

I said it before, my only real gripe about it is that the contract mechanism is used to implement (not just verify) part of the API specification. If gsl::span guarantees termination on out of bounds access, then it doesn't "Expect" or "Require" a valid index but instead has a wide contract that has defined behavior for any index, which is the very definition of not having a precondition.

@MikeGitb I think this is the main issue of interpreting what contracts are, and people don't seem to agree.

I'm of the opinion that a program should terminate on out of bound error. For cases when we can't allow the program to crash, and we somehow have a way to recover from that, we could use a throwing version, I think.

MikeGitb commented 4 years ago

@MikeGitb I think this is the main issue of interpreting what contracts are, and people don't seem to agree.

I'm of the opinion that a program should terminate on out of bound error. For cases when we can't allow the program to crash, and we somehow have a way to recover from that, we could use a throwing version, I think.

I don't have a problem with program termination if you try to access out of bounds. What I'm saying is that if that is the guaranteed behavior this should be implemented via if(i>=_size) { std::terminate } and not a Expects( i < _size )

I don't actually think there is a big question about what a contract is. e.g. the terms wide an narrow contracts are well established. The big discussion is around what should happen if a contract gets violated and if they should always be checked in the first place, which is precisely why I would not couple a behavior that is part of the API specification to a contract checking mechanism that has been subject to change/user configurable and is likely to change in the future.

elshize commented 4 years ago

I don't actually think there is a big question about what a contract is. e.g. the terms wide an narrow contracts are well established. The big discussion is around what should happen if a contract gets violated and if they should always be checked in the first place, which is precisely why I would not couple a behavior that is part of the API specification to a contract checking mechanism that has been subject to change/user configurable and is likely to change in the future.

This is actually what I was referring to, wasn't very clear. And I do agree with your take on the out-of-bounds check for span, we should just do a check instead of executing Expect imo. (The comment about terminating was just a general comment related to the previous conversation here, just wanted to express my opinion.)

gdr-at-ms commented 4 years ago

It is the express design and intent of gsl::span to provide bound safety by fast failing on out-of-bound access. For cases where people want a different behavior -- e.g. blissfully proceeding with out of bound access -- many implementations of std::span, including MSVC STL, GCC/libstdc++, Clang/libc++ provide those functionalities. GSL span is to provide that unambiguous bound check access. That also helps with code review, and auditing when root causing vulnerabilities. Reporting from the trenches after deploying to the wild, we are happy to share that the current behavior of gsl::span has helped thwart several vulnerability attempts in various components, precisely because of the fail fast on out-of-bound access..

johnmcfarlane commented 4 years ago

That would be fine except that is means that gsl::span is no longer suitable for some safety-critical and performance-critical applications. That in turn would also be fine except the CCG promotes general use of GSL, and gsl::span in particular. It follows that projects which are restricted to C++14 (much of the automotive industry for example) are put in an awkward situation because C++14 doesn't have std::span.

Also, it means that gsl::span and std::span are subtly -- but significantly -- different types. However, they share the same name and appear (at first glance) very much to follow the same intent. This sends a confusing message.

shaneasd commented 4 years ago

Is it reasonable to assume that gsl::span and std::span do the same thing? My first instinct was yes. However it occurs to me that there's a lot of precedent for the stl taking inspiration from utility libraries like gsl and there is also precedent (I'm thinking of boost::optional) for not being faithful to the original implementation (preferring something better, by some metric, over something consistent). As such I think perhaps it is unreasonable to assume two libraries will apply the same meaning to the same name.

In particular the solution in this case seems like it would be to rename one rather than change the meaning of one and renaming would be a pretty big breaking change.

BjarneStroustrup commented 4 years ago

If the option to throw disappear, I will have to stop recommending using that version for the reason mentioned below.

On 9/5/2020 12:13 PM, jheaff1 wrote:

I too am concerned about GSL 3.0.0 dropping the capability of throwing on contract violation. For safety-critical systems, terminating is simply not an option. It doesn’t matter how well a developer thinks they have tested their code, there is never 100% certainty that bugs exist in production software.

Therefore, being able to throw on a contract violation provides a safety net, allowing the application to stop critical processes and die gracefully or retry etc. I do not trust myself nor my fellow developers to write completely bug-free code so that we are 100% sure that contract violations can not occur.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/isocpp/CppCoreGuidelines/issues/1561#issuecomment-687584517, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACTNGNEI2A23PGFE2PQBD73SEIFNBANCNFSM4KNOSNDQ.

jheaff1 commented 4 years ago

I had deleted my comment (referenced by @BjarneStroustrup) as after I posted it I realised that one could use a custom std::terminate handler to gracefully stop processes in an application before then calling std::exit or std::abort.

However, after further reflection I think that a developer should indeed be able to choose how contract violations are handled (terminate vs throw). Usage of a custom std::terminate handler bypasses the invocation of destructors that have been written to stop critical processes gracefully.

Consider a simple calculator app. An Add method may have preconditions such that the addition of the two inputs should not exceed the numerical limit of a 64 bit signed integer. If the user were to add two numbers which violate the contract, should the calculator app terminate? I wouldn’t say so. I would argue that it should tell the user that they entered numbers that are too large.

Enforcing termination on contract violation seems too drastic to me; perhaps I misunderstand the intended usage of contracts.

jwakely commented 4 years ago

The calculator example seems like a very bad use of preconditions. The contract should allow larger numbers, check for overflow, and report an error. You say yourself that you want the user to be informed, so then it's not a precondition.

jheaff1 commented 4 years ago

Really? I have often seen in online tutorials/papers a square root method used as an example to demonstrate contracts, whereby a precondition is that the input is more than 0.

Again, consider the calculator example. If the user input -1 into a square root function, the app shouldn’t just terminate.

As far as the sqrt function is concerned, it should just state its preconditions. It is the choice the higher level application developer to decide whether contract violations should cause termination or not, which highlights my concerns around GSL only supporting termination on contract violation.

jwakely commented 4 years ago

If the function to calculate the result has such a precondition, the app needs to validate the inputs before calling that function. If you want to handle bad input and tell the user there's a problem, you need to write code to do that. Informing the user is a defined feature of your program. Precondition violations are undefined behaviour. Relying on a specific response to undefined behaviour to implement a guaranteed feature of your program is foolish. You would be building defined behaviour on top of undefined behaviour.

It is the choice the higher level application developer to decide whether contract violations should cause termination or not

No, it is the responsibility of the higher level application developer to meet the contracts of the functions they are using. If that requires defensive checks to ensure you don't pass zero to a sqrt function, you need to do those checks, not just call the function anyway and expect it to tell you there was a precondition violation.

MikeGitb commented 4 years ago

The main problem I have with turning contract violations into exceptions via a config macro is that introducing new exceptions into a 3rd party library is usually not something you want to do after the fact. Suddenly all kind of functions can no longer be noexcept, which in turn has effects on performance, control flow, test coverage, the result of meta functions (e.g. is_nothrow_copy_constructible) and in c++17 even the type of the function. So unless all libraries that (my application is using) have been designed and tested from the start to be compatible with both modes, I'd never expect my application to become more robust or safe if I just turned on that behavior. So for safety critical systems that would be a no starter to begin with.

What's making this worse is that this is almost always a global decision, unless individual libs/modules take great care to not put any of the contract checks into a header file (directly or indirectly) - and that isn't even considering the ODR violations you might or might not get away with in practice.

If 100% of the c++ code that uses the gsl is under my control, then I see the value in being able to tailor how contract violations are supposed to be reported to the needs of my application. However, in that case, I'd probably just fork the gsl library and rewrite the macros to suite my purpose anyway.

Finally, as jwakely wrote: A precondition violation is a bug in your program. A contract check is not a substitute for validating end-user input (would you use assert in a product shipping to the end-user to validate user input?) but a defense against a programming error. You might still not want to terminate your program everytime you detect a bug at runtime, but in a bug-free program, there should be no way whatsoever that a precondition gets violated no matter the user input (the user in those case may also be another 3rd party program sending some malformed data via IPC, network or the filesystem)

jwakely commented 4 years ago

P.S. I am not stating my position on terminate vs throw (even though I do have an opinion), I am simply saying that the calculator examples are very bad examples for this topic, maybe based on a misunderstanding of what preconditions are.

"I need to handle bad input, so I want to be able to violate preconditions" is extremely confused thinking.

jheaff1 commented 4 years ago

I see, thank you for the clarification. I must have misunderstood the intention of contracts.

If an application was so concerned about stopping critical processes upon std::terminate being invoked, it could do so in a custom std::terminate handler. Therefore, as per your point about violations being bugs rather than allowable errors, it makes sense for violations to call std::terminate rather than throw.

There is also @johnmcfarlane’s point regarding performance, which is also a concern for me. The GSL library can no longer be configured to compile-out assertions in production builds.

MikeGitb commented 4 years ago

There is also @johnmcfarlane’s point regarding performance, which is also a concern for me. The GSL library can no longer be configured to compile-out assertions in production builds.

I think I already wrote it somewhere: I think the problem here is that a guaranteed check (gsl::span not allowing out of bounds access) is implemented via contract checking macros. If the specification for gsl::span::operator[](size_t i) guarantees program termination (or an exception for that matter) for i>=size(), then that is (imho) part of the contract and passing an i>=size() is no longer a violation of the contract.

Now I can agree with the design decision that gsl::span::operator[](size_t i) should always validate it's inputs (just like std::vector::at()), but that should not mean that I'm not able to turn off any contract checks in general.

BjarneStroustrup commented 4 years ago

Yes. Preconditions are not a substitute for input validation. Rather, a precondition can be a check that you did the input validation correctly.

On 9/6/2020 12:58 PM, Jonathan Wakely wrote:

The calculator example seems like a very bad use of preconditions. The contract should allow larger numbers, check for overflow, and report an error. You say yourself that you want the user to be informed, so then it's not a precondition.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/isocpp/CppCoreGuidelines/issues/1561#issuecomment-687756375, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACTNGNEQUAB7X2TBQTXGUBDSENTMFANCNFSM4KNOSNDQ.

elshize commented 4 years ago

I fully agree with @MikeGitb and @BjarneStroustrup, if the intent is to have safe access with runtime bound check, then span shouldn't use contracts for that. A somewhat separate issue what to do on out-of-bounds access (throw or terminate), but shouldn't be dependent on the contracts.

Of course, one common thing in both cases (span and contracts) is that if we decide to unconditionally always terminate in either case, there's that problem of certain systems that really need to stay up no matter what.

As for span, I really like how Rust approaches it with slices, that is there's a terminating [], then get method that returns an optional, and finally unchecked_get that is memory unsafe (no checks are performed). I suppose in C++ the get method would be throwing at method. Although you might consider whether it would make sense to just have a throwing [] but the again some systems don't want to use exceptions and still fail on an unsafe access...

And as for the contracts, does anyone know if there's anything close to consensus as to what the standard contracts should be? As far as I understand, the reason they were pulled out of C++20 was that there was no agreement on whether it's enforcement should be done always or only as a debug assert, is that right? Or am I remembering incorrectly? I definitely heard both opinions but I'm not sure if that's in fact the topic of main contention.

BjarneStroustrup commented 4 years ago

There are a very wide spectrum of opinions on what contracts are for and how the notion is best supported. You can even find contradictory opinions from a single author as opinions change over decades.

That said, I'd like to retract my somewhat bombastic statement about preconditions and exceptions. The agreement among the CG editors is to wait to see what the contracts study group comes up with before doing a thorough revision of our treatment of contracts. Also, I should not post in a hurry while on vacation. Sorry.

Anyway, preconditions are not a good way to do input validation.

On 9/6/2020 1:22 PM, jheaff1 wrote:

Really? I have often seen in online tutorials/papers a square root method used as an example to demonstrate contracts, whereby a precondition is that the input is more than 0.

Again, consider the calculator example. If the user input -1 into a square root function, the app shouldn’t just terminate.

As far as the sqrt function is concerned, it should just state its preconditions. It is the choice the higher level application developer to decide whether contract violations should cause termination or not, which highlights my concerns around GSL only supporting termination on contract violation.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/isocpp/CppCoreGuidelines/issues/1561#issuecomment-687761815, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACTNGNABEF4MWC2UY7LC4XDSENWINANCNFSM4KNOSNDQ.

shaneasd commented 4 years ago

I'm a little confused. It seems like there's an inconsistency between the view expressed in http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Ri-expects and the view being presented here (particularly by @BjarneStroustrup ). That rule, as I read it, says that anything that is a precondition should be expressed as such by using Expects() but it sounds like there is not a consensus among the editors as to what Expects should actually do. There also seems to be a lack of clarity as to what exactly constitutes a precondition in some of the more obscure edge cases.

However it seems like there is useful advice to offer here as it seems like everyone agrees that some things (like input validation) should definitely not be treated as input validation.

Should the rule be less prescriptive and simply say something like "Clearly differentiate between preconditions of a method and the logic of the method"? Possible counter arguments I can think of:

Custom precondition macros would be too hard to write correctly. I guess this depends what you want the macro to do. I note that the microsoft gsl contains some neat tricks but they might be unnecessary for many users. I don't see any harm in the rule suggesting the microsoft solution as an option.
It's hard for to decide what a custom precondition macro should actually do. Clearly this is true since there's no agreement on what gsl Expects should do but given that it doesn't seem like gsl Expects has any advantage here.
Tools might be better able to understand gsl Expects than custom preconditiion macros. This argument may have merit but I'm not sure if there are any actual examples and I wonder if it could be mitigated by simply naming the custom macro EXPECTS.

Possibly this could be rolled into http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#i5-state-preconditions-if-any

Naturally the same applies to http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#i8-prefer-ensures-for-expressing-postconditions

Of course this doesn't address the question of what gsl::span should do on contract violation.

reddwarf69 commented 4 years ago

Should the rule be less prescriptive and simply say something like "Clearly differentiate between preconditions of a method and the logic of the method"?

I would agree https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Ri-expects is too prescriptive. Mainly because I don't think GSL's Expect is specially good. It was rejected for the standard because of its reliance on macros, but the Bloomberg assert macros John Lakos presented in https://www.youtube.com/watch?v=tz2khnjnUx8 are a superior alternative. I'm not saying the GSL should have something like Bloomberg assert macros (maybe it should, maybe not), but nobody should be thinking "should I use this superior option in violation of the C++ Core Guidelines?".

The GSL Expects suffers from:

It's not namespaced. Being a macro it should have been GSL_Expects() to avoid issues.
It forces you to handle all the pre-conditions in the same way: the ones that increase a function runtime by 1%, which you may happily use in release builds, and the ones that double the runtime which you may only want enabled when running the tests.

jwakely commented 4 years ago

@RedDwarf69 I agree, but the right way to do contracts in C++ is still an open question. Once there is more clarity I would expect GSL to be updated.

That said, I agree that a macro without a "namespace" prefix that isn't all UPPERCASE is ghastly.

eric-hughes-tiledb commented 2 years ago

I come to this topic acting as the implementer of a library that others will use. It's not my decision to make about whether termination is appropriate or not. In one case I know of it's not only not appropriate, it's against expressed policy for a library to terminate. My need here to allow the user of the library to make the decision about appropriate run-time behavior. Much of the discussion here assumes that the author of the code also has authority to decide what the policy is. That's simply not always the case for library authors.

To be explicit, the fixed behavior of GSL 3.0.0 in this regard makes it unsuitable for use as-is in our library. It does not allow configuration for termination behavior.

There's an original muddiness in thinking about implementations of program correctness that seems to be at play here. It was expressed well here:

That rule, as I read it, says that anything that is a precondition should be expressed as such by using Expects() but it sounds like there is not a consensus among the editors as to what Expects should actually do. There also seems to be a lack of clarity as to what exactly constitutes a precondition in some of the more obscure edge cases.

Preconditions were originally conceived of as analytic tool for using formal logic to reason about the behavior of programs. The formal logic was not just predicate logic, but included first-order logic quantifiers (for-all, there-exists) and sometimes other non-first-order quantifiers (number-of). Somewhere along the way, many got in the habit of forgetting that quantifiers could be in precondition expressions. I can only surmise the causes, but I have a sense that it's because predicates can be readily be evaluated in code and quantified statements cannot.

Preconditions that can be evaluated are only a subset of preconditions that can be expressed. It's a rather profound conceptual mistake to conflate the two. And that mistake is at the heart of CCG I.6. You simply can't express the full variety of preconditions used in analysis in an expression that evaluates to bool. This mistake also informs the contract proposal in the C++ standards committee, but not as badly; at least it has an "axiom" designation, but still only allows non-quantified expressions.

As currently stated, I consider CCG I.6 to be Bad Advice. If I "prefer Expects() for expressing preconditions" then there's preconditions that I won't be expressing at all because they don't fit into the language. It's possible to salvage I.6, though, by qualifying its utility: "Prefer Expects() to specify which preconditions might be validated at run-time". With this change, it would be clear that (1) whether predicates should be evaluated at all, and (2) what to do about them, that both should have configurable behavior according to an author's situation.

dingari commented 2 years ago

Just wanted to chime in. We've used gsl::span extensively in our embedded application. The contracts save a great deal of headache during development and debugging, but we always disabled them in release builds. It highly affects the amount of optimization possible to the compiler. Prior to 3.0 this was not a problem as the library provided flexibility to the user to control the contract checking behaviour.

For example, the presence of the Expects check in a simple case with a few calls to subspan<>() causes the compiler to produce ~40% more code, that runs almost twice as slow. This is a deal-breaker in a frequently executed ISR context.

Is there anything I can do to circumvent the Expects call without explicitly modifying the source code? Or will I just have to stick with std::span for high-performance code?

johnmcfarlane commented 2 years ago

Hi @dingari. Unless the situation has changed significantly in the past 2 years then yes, you either need to:

stick with GSL 2.x,
switch to std::span, or
pay for bugs you don't write.

If you are able (if your toolchain supports it and the effort involved is not too great), I'd personally recommend std::span. You can still enable implementation-specific instrumentation if you want the bounds-checking behaviour during development. Flags to do this with popular compilers include

-D_LIBCPP_DEBUG=1 in libc++,
-D_GLIBCXX_ASSERTIONS in libstdc++, or
/D_ITERATOR_DEBUG_LEVEL=1 in MSVC

though you may wish to read up on those flags to understand their full implications. They may trap many other bugs during testing, but they won't just add bloat to span!

jonnygrant commented 1 year ago

I've read the informative replies. Thank you @shaneasd for linking to the issue I raised.

As others have said, termination on a contract violation does feel unsafe.

Adding custom handler so that std::terminate() shutsdown gracefully I feel wouldn't work as well as throwing an exception like invalid_argument - so the calling code has a chance to handle it locally to where it occurred. In my experience, when a process shutsdown on a safety critical system, we have a watchdog process that restarts it - of course it may just crash again; because we don't know which module or driver it occurred in the first time. (Code using pointers would be a common cause of such issues.)

I always feel, code must check they pass as parameters are valid. Input validation is mandatory. However, the code that receives bad parameters must not crash if something sneaks through, it should gracefully throw something that can be handled. eg USB module failure on an embedded system (ie bad data, null pointer), we display "USB error" on the display, and disable the USB module until a technician arrives to check the logs. At least the heart rate monitor doesn't crash!

A 3rd party library with such std::terminate() calls can't be used on a safety critical system. It's the same with assert() calling abort() on an !NDEBUG build, or enabling _GLIBCXX_ASSERTIONS I believe. A throw would be useful. Of course an assert core dump gives a more accurate backtrace in a debug build than just leaving it to SEGV.

I was only looking at gsl::not_null, which has a runtime call to std::terminate() instead of a throw logic_error() or such.

Edit:

It feels safer to enable exceptions when constraints are not met on GSL, eg gsl::not_null, as the default behavior. I recall it was ALGOL that introduced the ability to disable runtime NULL checks to boost performance, so it seems logical to still allow disabling of exceptions, asserts, and std::terminate calls if that is what someone wants to do to save a few cycles.