Better exceptions for invalid ranges

mratsim commented 5 years ago

Showcased in https://github.com/status-im/nim-beacon-chain/pull/284, Stint exceptions should be better to improve unittesting in Eth2.

Currently for Overflow/Underflow it throws RangeError or AssertionError in the overflow tests of beacon chain:

Since those tests does not involve computation we should probably use RangeError for all (and keep OverflowError for computation).

arnetheduck commented 5 years ago

well, this is a prime example of why it's impossible to divide the world into catchable or uncatchable exceptions in general - an overflow when parsing a stint should naturally lead to an OverflowError, which by default is considered uncatchable.. so then you need to introduce CatchableOverflowError or something. doable, but oh so ugly. ping @zah - the core issue here is that a library author is not in a position to predict what the user of the library will consider uncatchable ahead of time - likewise the idea of what is catchable and not tends to change over time trending towards catchable..

zah commented 5 years ago

I think the distinction between catchable errors and defects is quite valuable. I'm not sure if that's what you mean when you argue against it, but if we abandon it because it's not perfect, this would really be a step backwards.

It would be more accurate to say that most of the time the distinction is quite well-pronounced and there are only few occasions where we have a slightly grayer area. And even this "prime example" is not all that gray imo.

An integer library should specify what happens in case of overflow. There are few possible choices:

Use saturating semantics
Use wrap-around
Use an infecting NaN value
Treat it as a defect (the user should have written the code with overflow checks, so she didn't follow the contract of the library and that's a bug)
Treat it as a recoverable error (the computation is interrupted and the user code is notified)

A rich library will allow you to choose one of these options. This can be done several in ways:

You can encode the behavior in the type that you use (e.g. have a SaturatingInt type)
You can encode it in the operation that you choose to use (e.g. have a saturatedPlus op)
You can encode it thread-local flags (e.g. have a function such as _controlfp_s that specifies what happens in case of overflow)

When the library offers just one of the options, you can either follow the contract being offered or you can extend the library by adding the extra options.

Errors such as IndexError are similar in a sense. The contract of the standard library is that you should always ensure that the index is correct before using it. A code written without such a check is defined to be "buggy" and has to be fixed. The standard library also offers a number of alternative operations that are safe to use with invalid indices.

zah commented 5 years ago

If I have to distill it to a single paragraph, it goes like this:

A library specifies a contract that you must follow. If you violate the contact, that's a defect. The contract may suggest some situations that you should recover from (that's the errors). If you don't like the contract, you can ask the author of the library to modify it. Some contracts are not perfect because they are too strict and require too much work from the user, other contracts are too loose and ask you to recover from situations that are unreasonable (then the user has to create a wrapper library to make the code more practical).

arnetheduck commented 5 years ago

which of those options should be encouraged though? which leads to the most favourable situation for the eco-system as a whole? generally, people will use whatever can be copy-pasted and conveniently accessed from the language first, then look to the other options. laziness first.

The argument makes sense for a single library from a single developer, but in aggregation the story is a bit different. Tt's up to the language and std lib to provide these convenient defaults, so that you do the right thing cheaply without thinking about it. An eco-system of libraries is much more viable at 80% "perfect" libraries than at 50%.

It is also inherently easier to construct a weaker api from a stronger one - for example, it's trivial to build a defective api from a infective one because you only need to look at the outcome - the other way around is tendentially more difficult as you must understand the inputs instead.

arnetheduck commented 5 years ago

Another way to put this is that the options are not equal, and neither is the relationship between library author and user - there's an asymmetry here that amplifies the benefits a good api - it's usually costly to enact change in a poor library.

zah commented 5 years ago

In the eco-system as a whole, the good stuff rises to the top. Eventually, libraries appear that have picked these right defaults and they get famous. Yes, there will be a lot of crap created along the way, but that's life.

Remember, here we are arguing whether the errors vs defects distinction has value at all. I think it does, because without it, the end game is not the same. The "good stuff" that will raise to the top will be slightly different and not as good as it could be.

mratsim commented 5 years ago

Well the main issue is that the starting point is not the same, bad defaults are already in the standard library and might inspire further badness see https://github.com/nim-lang/RFCs/issues/32 (crypto API).

status-im / nim-stint

Better exceptions for invalid ranges #79