`Natural` + friends is bad; `{.requires: cond.}` + friends should be used insetead and checked with `--checks:on`

timotheecour commented 3 years ago

in https://github.com/nim-lang/Nim/commit/8ee0771b5a10cfd083ba2eba806e5d9d9f72b234#commitcomment-38239130 araq argued "return types must not be Natural" (this came up recently here: https://github.com/nim-lang/fusion/pull/18#discussion_r506936370), and instead suggested using {.ensures: cond.} for that.

{.ensures: cond.} for return and {.requires: cond} for params is indeed more flexible than Natural (since you can express any condition), and it doesn't conflate checking with the type system (cf his example in https://github.com/nim-lang/Nim/commit/8ee0771b5a10cfd083ba2eba806e5d9d9f72b234#commitcomment-38254164).

However, ensures, requires is ATM only used by drnim (which incidentally seems broken see https://github.com/nim-lang/Nim/issues/15639), which makes the annotation a whole lot less useful.

proposal

introduce --ensureChecks, --requiresChecks, --invariantChecks (--assumeChecks wouldn't make sense) which would make regular nim test for those
--checks:on would enable those
these would be pushable eg {.push requiresChecks: on.} ... {.pop.}
optional: nim c main by default turns on --ensureChecks and --requiresChecks by default but not --invariantChecks (loop invariant runs on every loop iteration and would slow down considerably)

links

note

yet another example why Natural is usually a bad idea: it affects the type system; whereas .requires doesn't (a transparent abstraction).

proc main(a: openArray[Natural]) = discard # error
main(@[1,2])

Araq commented 3 years ago

So help us make DrNim production ready. Come on...

Araq commented 3 years ago

Too expensive, let's better make DrNim a reality.

timotheecour commented 3 years ago

I think we should re-open this RFC because range (Natural, Positive etc) just isn't good enough in practice and we need something better, and this RFC is one such option. We can discuss its flaws and improve it though.

range as return type causes issues: see https://github.com/nim-lang/Nim/commit/8ee0771b5a10cfd083ba2eba806e5d9d9f72b234#commitcomment-38239130
range as param also causes issues: this was recently demonstrated in https://github.com/nim-lang/Nim/pull/17149#issuecomment-783975479 because it changes sigmatch (a Natural matches less well than cint but a int matches better than cint), eg:
```
proc fn(a: cint): int = 1
when defined case1:
proc fn(a: Natural): int = 2
else:
proc fn(a: int): int = 2
echo fn(1)
```
```
nim r main # 2
nim r -d:case1 main # 1
```
range isn't flexible enough, you can't express more general constraints such as: x: int but non-zero
range is super buggy. The root cause is it changes the type system instead of simply causing some checks

Too expensive, let's better make DrNim a reality.

I don't see why, you'd be able to disable such checks like all the other checks, via --ensureChecks:off or globally with --checks:off.

eg, for os.sleep PR https://github.com/nim-lang/Nim/pull/17149, it would be simply:

proc sleep(t: int) {.requires t>=0.}

=> self documenting, and doesn't mess with the type system nor sigmatch

Araq commented 3 years ago

The major problem with this proposal is that it seems to do these checks at runtime. I'm nowadays quite convinced that these checks should be proven correct by the compiler but should not cause changes in runtime behaviour. Otherwise you end up with new failure modes that others expect to be catchable in practice ("omg, don't make my server crash")

timotheecour commented 3 years ago

The major problem with this proposal is that it seems to do these checks at runtime

take delete for example:

delete has (or should have after fixing https://github.com/nim-lang/Nim/issues/16544) a visible failure mode (i<0 raises because of Natural) and a hidden failure mode (it should raise inside the proc body when i>=x.len):

proc delete*[T](x: var seq[T], i: Natural)

with this RFC, you'd instead only have a visible failure mode, no hidden failure mode:

proc delete*[T](x: var seq[T], i: int) {.requires: i >= 0 and i < x.len.}

In practice, you often cannot prove at CT that the condition i >= 0 and i < x.len is met, even with whole program analysis, so if you only restrict yourself to CT-provable conditions, you'd be forced to:

either omit the requires condition (bad for self documentation, bad for further static analysis)

or for user to write redundant condition in preceding code, eg:

proc fun(s: var seq[int], i: int) =
assert i >= 0 and i < s.len # redundant code, but needed to make `requires` computable at CT
delete(s, i)

Note that this proposal still allows some checks to be optimized away at RT if compiler can determine that (based on other checks, assert's, range checks etc) the condition holds.

There are 3 states:

definitely fails condition: CT error (or at least warning + RT check)
definitely succeeds condition: RT check is a noop
cannot prove/disprove the condition succeeds: emit RT check

And when checks are disabled, this analysis isn't performed.

Benefits:

replace Natural, Positive etc with something that doesn't change the type system/sigmatch and is more flexible
still allow eliding some RT checks when compiler is built with z3/other static analysis
still useful from day 1 without having to wait for a perfect drnim; as compiler gets smarter, more RT checks can be turned into CT checks, but the checks are still in place
more conditions can then become explicit in proc declaration (instead of hidden in proc body which cannot be exploited) but don't affect the type system nor sigmatch rules

note

even when a RT check can't be omitted, it's still useful for further static analysis, eg:

proc delete*[T](x: var seq[T], i: int) {.requires: i >= 0 and i < x.len.}

proc fun(s: var seq[int], i: int) =
  delete(s, i) # RT check: i >= 0 and i < s.len
  # now the compiler can in theory simplify the next check as:
  delete(s, i+1) # RT check: i+1 < s.len
  # or in other cases, determine that a check would be guaranteed to succeed based on previous requries/assert's

Araq commented 3 years ago

or for user to write redundant condition in preceding code, eg:

That's what I propose yes. It would be mitigated by the fact that an illformed .requires clause only causes a warning, not an error. Code should be written in a style that is easy to prove correct, that is what type systems are made for. The goal is not to use ever more complex proof engines with unkown compiletime performance characteristics. There should be a subset of Nim involving let variables, min, max comparisons, + and - that is well defined and the foundation for .requires and .ensures.

ringabout commented 3 years ago

juancarlospaco commented 3 years ago

Can be made a lightweight DrNim, that does not depend on the Z3 thingy?, one that at least checks for positive and natural numbers is more than enough, maybe add a -d:nimZ3 to DrNim and allow a simplified logic without dependencies, should be possible to check if a number is not negative using Nim only.

I am NOT saying to trash the Z3, just make the thing work, so people use it and contribute back.

Otherwise is not a real alternative to Positive and Natural.

timotheecour commented 3 years ago

yet another example showing Natural is bad in API's:

proc main(a: openArray[Natural]) = discard # error
main(@[1,2])

(refs https://github.com/nim-lang/Nim/pull/15790#issuecomment-787118705) and would be best handled with .requries.

That's what I propose yes

@Araq I don't see how this would ever work. With this suggestion, if you have N clients of delete(a: JsonNode, b: seq[int]), you'd need potentially each client to (defensively) verify the pre-condition in the calling code, eg:

# with this RFC:
delete(a, getIndexes()) # the checks are done inside with `delete(a: JsonNode, b: seq[int]) {.requires: cond.}`

# with your proposal:
when compileOption("assertions"):
  let b = getIndexes()
  for bi in b: assert bi > 0
  delete(a, b)
else:
  delete(a, getIndexes())

this is bad for many reasons:

no-one will write code like that, and it's potentially less efficient
or if they do, they'll likely be too defensive (checking things that aren't needed to be) or not defensive enough

User code (and in particular users of API's) has less context than the compiler, the compiler instead should be the one deciding to optimize out tests, but that optimization is not even needed by this RFC and can come later gradually; all that's needed is to honor the .requires checks when --requiresChecks is set. It's arguably simpler, better and more likely to be used.

A key advantage being it doesn't affect the type system, unlike Natural + friends.

juancarlospaco commented 3 years ago

proc main(a: openArray[Natural]) = discard # What error ?
main(@[1.Natural, 2.Natural])

Even with the wrong code, the error exists BEFORE entering the body of the function.

Imagine DrNim works out-of-the-box, Natural and Positive are Deprecated and Removed.

If the input data is unknown, and only exists at run-time, how does DrNim checks that ?, because at least Natural and Positive does, I know you suppose to use {.requires: data > 0.}, but if that is not meet at run-time, you still need to handle that (?), you end up with a ton of doAssert data > 0, is Natural all over again, self-contradicting.

How well does DrNim runs on Embedded hardware?, Natural and Positive just fine, I know you can DrNim it on x86_64 then deploy, but that changes stuff like float support, arch, etc.

(I am asking, not confirming)

juancarlospaco commented 3 years ago

using {.ensures: cond.}

{.ensures: cond.} for return and {.requires: cond} for params

ensures, requires

introduce --ensureChecks, --requiresChecks, --invariantChecks (--assumeChecks )

turns on --ensureChecks and --requiresChecks

whereas .requires doesn't

Literally DBC.

See Ada, used for serious stuff.

We need a DBC DSL on stdlib, can be used with and without DrNim, DrNim "reads" the DSL to get the requires and ensures, Non-DrNim mode, generates run-time assertions, when -d:danger no assertions, DrNim can read DBC DSL anyway with or without -d:danger.

Imagine this is wrong, then Ada/Spark is used as toy programming language for thrown away scripts and not for serious purposes. :shrug:

Araq commented 3 years ago

Imagine DrNim works out-of-the-box, Natural and Positive are Deprecated and Removed.

Well, encoding invariants in types is still very useful. So in an ideal world, the definitions become:


type
  Natural = int {.invariant: value >= 0.}
  Positive = int {.invariant: value > 0.}

timotheecour commented 3 years ago

here's another motivational example, see https://github.com/nim-lang/Nim/pull/17625/files?diff=split&w=1#r606847437

before this RFC:

proc getPointer*(x: Any): pointer =
  ## Retrieves the pointer value out of `x`. `x` needs to be of kind
  ## `akString`, `akCString`, `akProc`, `akRef`, `akPtr`,
  ## `akPointer` or `akSequence`.
  assert x.rawType.kind in pointerLike
  result = cast[ppointer](x.value)[]

after this RFC:

proc getPointer*(x: Any): pointer {.requires: x.rawType.kind in pointerLike.} =
  ## Retrieves the pointer value out of `x`.
  result = cast[ppointer](x.value)[]

=> self documenting, DRY, and helps static analyzers

examples like this are pervasive.

(note that now that https://github.com/nim-lang/Nim/pull/17054 was merged such pragmas would be rendered in docs)

juancarlospaco commented 3 years ago

The problem is not the pragma, the problem is how to prove them valid ?.

timotheecour commented 3 years ago

with this RFC:

these {.requires: x.} turn into runtime checks when --checks:on is specified
likewise with {.ensures: x.} when the function returns
a static analyzer (drnim, nim compiler or other) uses those conditions to generate a CT warning or error when it can prove that a condition will not be satisfied, even with --checks:off
with --checks:on, it can elide a check when it can prove that such check will be satisfied

note that the runtime checks can be performed without any static analyzer implemented; the static analyzer can improve over time to detect more errors / elide more checks gradually.

Araq commented 3 years ago

I think it's too early to look into dynamic checks when we're that close to working static checks.

konsumlamm commented 3 years ago

I think it's too early to look into dynamic checks when we're that close to working static checks.

I don't see why we can't do both? The dynamic checks would only be enabled when --checks:on is enabled and the conditions can't be statically proven. It doesn't look like DrNim will be ready any time soon and not everything can be proven by a static analyzer, so dynamic checks are still useful, even with DrNim.

Araq commented 3 years ago

No, the analysis is off: If it cannot be checked statically, there should be a mechanism like:


template enforce(x) = 
  {.assume: x.}
  assert(x)

To turn what is beyond the compiler's capabilities into runtime checks explicitly.

nim-lang / RFCs