Closed stepancheg closed 3 years ago
How does idris_crash_total
guarantee program termination (in the sense of the totality checker)? I can just call idris_crash_total 12 "oh no"
and it will crash at runtime. I therefore find the name to be misleading.
Also, the claim that this pattern appears 10 times in the src folder of Idris2, while being correct, does not mean it is "used quite often": These 10 occurrences all are in src/Compiler/RefC/RefC.idr
, so they where probably introduced by the same contributor (or just copy pasted by other contributers) in order not to use a more accurate return type like Maybe
or Core
. This looks like a code smell and I'd suggest that rather these calls to idris_crash
be fixed instead of adding a new crashing function to the prelude.
Therefore, I suggest the following: Unless we come up with a proper and common use case for this (the one given in the original post is rather contrived), we should refrain from adding such a function to the prelude.
How does idris_crash_total guarantee program termination (in the sense of the totality checker)?
I don't really know how totality checker works, I just couldn't find an example where:
idris_crash_total
would allow non-total function as total1 = 0
Unless we come up with a proper and common use case for this
The use case is simple: allow writing programs with %default total
while avoiding unsafe assert_total
functions and allow runtime assertions for impossible but unproven to be impossible cases.
suggest that rather these calls to idris_crash be fixed
This is not always trivial to do.
Consider this example: there's a function:
negate : Int -> Int
One might want to write a runtime assertion that argument is not equal to MIN_INT
(because in this case, negation results in overflow).
Alternatives to runtime assertions are:
MIN_INT
— which is very hard to doMaybe
— and the result of a lot of noise in error handling which practically does make program safer or reliable or whateverSo practically the best thing to do in the implementation of negate
is runtime assertion.
I don't really know how totality checker works, I just couldn't find an example where:
A function that might throw an exception at runtime is per definition a "partial function". Marking such a function as being "total" is just plain wrong. The whole point of totality checking is a compile time guarantee that the function in question returns a value of the given type in a finite amount of time.
If, however, you have a branch in a pattern match that you know can never be reached (but the compiler does not have the means to verify), an explicit call to assert_total
is the way to go.
The use case is simple: allow writing programs with
%default total
while avoiding unsafeassert_total
functions and allow runtime assertions for impossible but unproven to be impossible cases.
How is idris_crash_total
safe in contrast to assert_total $ idris_crash
, given that under the hood the two functions behave exactly the same? In addition, could you please give a non-contrived example of this use case that cannot be implemented with a call to assert_total $ idris_crash
?
This is not always trivial to do.
In the examples in Compiler.RefC.RefC.idr
it is trivial: Change the functions' return type to Core something
and throw an exception via function throw
. This is a total function as the Core
monad is guaranteed to properly handle these exceptions at runtime. Therefore, the claim that this is a common use case that even appears in the compiler itself is not valid.
Consider this example: there's a function:
negate : Int -> Int
One might want to write a runtime assertion that argument is not equal to
MIN_INT
(because in this case, negation results in overflow).
An overflowing integer calculation is not a "partial" function. There is no need for a call to assert_total
in such a case. If you want to make sure that the given argument has been properly validated to adhere to some precondition, the following signature will do:
negate : (v : Int) -> {auto 0 prf : v > MIN_INT = True} -> Int
No noise in further function calls, with a guarantee that the proof is being erased at runtime.
So practically the best thing to do in the implementation of
negate
is runtime assertion.
No. The best thing is to write negate
in such a way that it enforces a runtime or compile-time in-bounds check being performed before invoking the function. This is what the signature above does.
And once again: If the function is implemented in such a way that it performs a runtime assertion, it is a partial function an must not be annotated as being total!
A function that might throw an exception at runtime is per definition a "partial function".
OK, technically you are correct, I used the word "total" incorrectly. But have a look at this function:
total
add : String -> String -> String
add x y = x ++ y
It is total right? But technically it is not: memory allocation may fail. And making ++
return Maybe String
would be very impractical (and still won't guarantee that program won't fail because of overcommitted memory for example). But we are fine with declaring this function "total" for our practical purpose.
How is idris_crash_total safe in contrast to assert_total $ idris_crash, given that under the hood the two functions behave exactly the same?
They are not exactly the same: idris_crash_total
accepts extra argument.
assert_total $ idris_crash
allows proving that 1 == 0
:
one_eq_zero : S Z = Z
one_eq_zero = assert_total $ idris_crash "clowntown"
It is not possible to do this hack with idris_crash_total
.
negate : (v : Int) -> {auto 0 prf : v > MIN_INT = True} -> Int
OK, it might be relatively easy in this case, but consider this example:
howManyBitsInBytes : (bytes : Int) -> {prf : bytes ?condition?is?not?trivial?here?} -> Int
howManyBitsInBytes bytes : checkedMul bytes 8 -- crash on integer overflow
With more complicated functions (like computeMemorySizeNeededToSerializeAStructure
) writing such proof is practically impossible (let alone unreasonable waste of engineer time).
And once again: If the function is implemented in such a way that it performs a runtime assertion, it is a partial function an must not be annotated as being total!
Let me put it this way. total
modifiers exists to make formal proof of program correctness. Like asserting that sort
function returns a sorted list.
Strict following "totality" definition will hurt practical applications of Idris: when writing real world programs it is often more convenient to just place runtime assertion than supply proper proof or do sophisticated error handling. And it won't gain anything.
And Idris focus (as I understand it) is on being practical language, not theorem prover.
The way I see it, users do %default total
, occasionally prove something in the program, and insert runtime assertion when proof is not possible or not needed, or just for simple invariant checks. Without dangers of using assert_total $ idris_crash
.
Thanks for the detailed explanations! There are many valid points in there and it is now clearer where you are headed to.
I'd still prefer not to have idris_crash_total
in the prelude for the following reasons:
assert_total
, so personally I think the minimal gain in type-safety by introducing another crashing function is just not worth it.idris_crash_total
might be of no use to them. But according to the following quote from Prelude.idr
it then should probably be in base or rather contrib: "A rule of thumb is that it (the prelude) should contain the basic functions required by almost any non-trivial program."Now, the bullet points above mostly reflect my personal opinion. Other Idris2 users might probably want to chime here in order to figure out whether a majority of the community has a need for a function like idris_crash_total
in order to justify its addition to the prelude.
Edit: Finally, why not just annotate the functions from your examples as being partial
because that's what they actually are?
It is actually not total, so its name is misleading
We can call it idris_crash_total_ish
.
Although, I'd probably prefer to shuffle it as:
idris_crash x message
to be a function which is declared total
— same as proposed idris_crash_total_ish
idris_crash_partial
— current idris_crash
why not just annotate the functions from your examples as being partial because that's what they actually are?
Because one small partial function poisons the whole program. Practically %default total
won't work.
most Idris2 programmers are not dealing with the kind of low level data manipulations you describe in your examples
I must say, most programmer don't bother with the fact that integer addition or multiplication may overflow. Which may lead to very interesting problems up to security violations and data corruption. I actually think that integer multiplication should be checked multiplication by default (i. e. crash on overflow). But still considered total.
Because one small partial function poisons the whole program. Practically
%default total
won't work.
Isn't that the whole point of having a totality checker: It allows you to keep track of partial and total functions? If you have a partial function in your program that cannot be converted to a total one, the yes, all functions calling this partial function should have the partial
taint, in my opinion.
If a module contains a mixture of partial
and total
functions, it is no shame to annotate the partial functions as such explicitly, which still allows you to keep %default total
at the top of the module.
I just don't think what you describe here should be considered a best practice in Idris2 programming and as such should not be encouraged by adding a corresponding partial function that's marked as being total to the prelude.
I must say, most programmer don't bother with the fact that integer addition or multiplication may overflow. Which may lead to very interesting problems up to security violations and data corruption. I actually think that integer multiplication should be checked multiplication by default (i. e. crash on overflow). But still considered total.
But is this true for the integer multiplication and addition we have in Idris2? They should be implemented by the backends in such a way as that they can't overflow. If, on the other hand you invoke some fast low level FFI primitives, which might show the data corruption you describe if unchecked, by all means, do check them, but don't call them total
unless you know exactly what you are doing (in which case you might just as well invoke assert_total
).
Isn't that the whole point of having a totality checker: It allows you to keep track of partial and total functions?
My understanding is no, the main purpose of totality check exists not to verify strict theoretical definition of totality, but for proving the invariants. Like when data is serialized to JSON, serialization library can assert (at compile time) it does not forget to place a matching closing brace. Note just covering is not enough, because covering functions allow you to prove anything:
covering
one_eq_zero : Z = S Z
one_eq_zero = one_eq_zero
That's why %default total
is useful.
If a module contains a mixture of partial and total functions, it is no shame to annotate the partial functions as such explicitly, which still allows you to keep %default total at the top of the module.
I tried to do that once (write a sufficiently large program without assert_total
and without idris_crash
). As I result I had to reshuffle covering/total modifiers several times when I developed that program. Like you do things total, then you introduce a small function which is not strictly total, and then you have to change each function signature from total to covering, then you fix it and rewrite everything back to total.
I would be OK with the redefinition of total
modifier as "program is total, it allows you to prove invariants, but it may fail at runtime when a user requested so".
And another note. I think it is probably OK to consider a crash a possible function output. Like each function result is implicitly logically declared as Either Crash Result
. That way idris_crash_total
will be strictly total (but current idris_crash
will not).
They should be implemented by the backends in such a way as that they can't overflow.
There's Int
type and Integer
type. Integer
has infinite precision and does not overflow, Int
is finite and obviously it can overflow. But Int
is faster so it is practically useful.
There's
Int
type andInteger
type.Integer
has infinite precision and does not overflow,Int
is finite and obviously it can overflow. ButInt
is faster so it is practically useful.
Sure, but Int
overflows according to its specification without crashing, so the corresponding functions can still be considered total.
I think we are stuck at this point: While I agree that the use cases you describe are all valid, I don't think they merit adding a new function to the prelude as they can be solved with the existing idris_crash
and assert_total
. You opinion is obviously a different one, which is perfectly fine. Let's see what others think of this.
Thanks, for the constructive discussion.
For me the outcome of this discussion is that it might be reasonable to consider a crash a valid total function output: each function is implicitly considered returning Either RuntimeCrash Result
. If we define it this way
There are multiple problems with this proposal:
assert_total $ idris_crash …
is an antipattern we don't want people to write this in the first placeidris_crash_total
is not total as mentioned by @stefan-hoeck total
keyword is to indeed prove properties about programs but all of this goes out the window if you can't confidently run the programs. (This is why assert_total $ idris_crash
is an anti-pattern in the first place)assert_total $ idris_crash …
is because they cannot return a value, requiring idris_crash_total
to return a value would mean that at every use-site, one could simply use the value rather than crash, completely defeating the purpose of crashing in the first place.At this point of the conversation I think it is useful to take a bird-eye view and extract the chief complaints:
assert_total $ idris_crash
to destroy our proofs/programsThat suggests to me multiple problems (none of which idris_crash_total
fixes):
partial
and total
even mean to begin with1 and 2 are very closely related, the issues raised about memory management and Int
manipulation are legitimate issues for which we provide no answer: It would be nice to expose those challenges in the type system such that we can predict and document edge-cases and runtime behaviour. However, given the current technology I do not believe we have a solution that isn't unergonomic and misleading.
As for 3, this might be solved by improving the ergonomics of the IDE Mode so that we have automated refactoring tools and better flow analysis. In the long term we could maybe add partial
and total
annotations to the type system (as multiplicities/grades) and make those annotation polymorphic such that we don't have to update any programs when things move around, but we're still far from this.
edit: @stepancheg you answered something quite interesting while I was writing this and that's we can consider a crash total if it has this signature Either RuntimeCrash Result
. We already do this using Void
, a function that crashes would be of type Void
(or be isomorphic to it) and so a crashing function can be represented with Either Void Result
. In general, functions cannot be implicitly treated as Either Crash Result
because your Result
might also be a Either Crash Result
itself and that would be completely unhelpful (a-la unchecked exceptions, or a-la null
vs Option
in Java)
@andrevidela I think I didn't explain my idea properly. Let me clarify a couple points:
*Either Crash Result
is a mental/logical model, not a syntax
Crash
typeVoid
. Crash means function terminates returning "crash" (logically).
idris_crash_total
is not total as mentioned by @stefan-hoeck
It is total with proposed model.
Let me preface this by saying that we're straying further and further from the original topic, which suggests to me that this idea require more thought that what was originally presented. That also means that the solution presented is unlikely to resolve the issue raised, unless I missed something crucial.
As for your clarification I think you need to remember that from a type-theoretic perspective there is no difference between crash and Void
they are both uninhabited, from which you can derive any absurd conclusion Void -> a
. Which is why idris_crash
returns a
, it's a shorthand for the following program:
crash : Void
crash = ?crash_the_program
absurd : Void -> a
absurd _ impossible
idris_crash : String -> a
idris_crash message = trace message $ absurd crash
So yes there is a crash type, you can declare it, it's Void
. Not every function can crash, inverting a boolean for example won't ever crash (specially if it's declared linearly but currently the interplay between linear and non-linear programs is still a WIP) but most primitive types have undocumented runtime behaviours that we cannot expose to the type system at the moment.
It is total with proposed model.
The proposed model is incompatible with the goal of the totality checker: to provide confidence in the programs written, both at compile-time and runtime. Anything that strays from this goal is either a bug or an artefact of the current technology. In this case it's the latter: We don't have any way of representing the assumptions about memory usage of primitive types and functions in the type system.
Unless the concerns I mentioned in my previous post and here are addressed I remain skeptical of the benefit of this particular proposal. This discussion does however point toward other areas of the compiler that could be improved. Typically the current exit
function returns IO ()
but in reality it could return Void
or forall a. a
safely.
Let me preface this by saying that we're straying further and further from the original topic, which suggests to me that this idea require more thought that what was originally presented.
Yep.
from a type-theoretic perspective there is no difference between crash and Void
I'm definitely not very well versed in type theory, but we don't have to think about crash that way.
Void
is a function which basically cannot be invoked at runtime, because it does not haltIn that sense declaring idris_crash
to return Void
is incorrect, because you can prove anything from Void
. And consequently returning IO ()
from exit
is correct.
The proposed model is incompatible with the goal of the totality checker: to provide confidence in the programs written, both at compile-time and runtime
I don't understand. The goal of totality checker is to be able to implement invariant checker (e. g. all integers in a set remain valid indices in an array while we transforming the set or the array). Runtime crashes do not prevent that feature of Idris. Can you please elaborate?
For the sake of argument, assume that idris_crash
returns its own error type Crash
, which is different from Void
. As you suggest, every Idris2 type a
is then implicitly isomorphic to Either Crash a
. If you adhere (even implicitly) to this model, you must be able to define the following function:
fromCrash : Crash -> a
fromCrash = Left
From this, you can immediately derive
ohno : Crash -> 1 = 0
ohno = fromCrash
since 1 = 0
is just another Idris2 type, which is - by your proposal - inhabited also by Crash
. Hence, from Crash
like from Void
you can prove anything.
Void is a function which basically cannot be invoked at runtime, because it does not halt
This might be where the confusion comes from, Void
is nothing but an uninhabited type. Crash
and Void
both would have the same definition:
data Void : Type where
--blank
data Crash : Type where
-- blank
A function that uses void as argument cannot be called at runtime because there are no values to give it. I cannot be called at runtime because there are no values to give it. Hence why idris lets you define
absurd : Void -> a
absurd _ impossible
because we're pattern matching on a value that does not exist
crash is something which always return in finite time for all arguments, and returns a special value, which is crash, and crash is implicitly propagated to the program entry
This is true, but the propagation is untracable. From the typechecker's perspective, once exit
or idris_crash
is reached, all bets are off and the rest of the program is bogus. Modelling recoverable errors would be done in a way that the typechecker can check, and that's when error monads, exception monads, Either, etc come in.
In other words:
exit
or idris_crash
was possible, the type would be IO ()
or something similarVoid
or a
As far as I know we're in the second situation with Idris.
The goal of totality checker is to be able to implement invariant checker (e. g. all integers in a set remain valid indices in an array while we transforming the set or the array). Runtime crashes do not prevent that feature of Idris. Can you please elaborate?
I can, Haskell is a good example of what you describe "well typed program do not go wrong", and indeed they do not, except for runtime errors. Adding a totality checker allows to remove a large class of runtime errors taht the typechecker does not catch (additionally they ensure that running function at compile type won't crash your compiler). From this I hope you can infer that the goal of the totality checker is to provide some confidence that your program won't behave in a way that is not advertised by the type. Adding a function that allows you to subvert this expectation is conterproductive.
The totality checker cannot be perfect (beucase of the halting problem), which is why we have things like assert_total
. Those are technological artefact for which we do not have a solution (same goes for primitive functions that are annotated with assert_total
), idris_crash_total
is not a consequence of technological limiatation inherent to computing and therefore does not warrant subverting those expecatation like assert_total
does.
sorry for the typos, I"m on the go, hope I didn't make any obvious mistakes
I don’t have much to contribute to the debate, but I thought it was an interesting footnote that the language Zig is currently tackling the space of things impossible to know until runtime (like the fact that a computer running the program might literally run out of disk space).
I’m not sure there’s a lot in Zig for Idris to borrow, but I find it interesting that there’s an example of an imperative language in many ways different than Idris that is also seeking to remove indeterminate program behavior. The approach taken by Zig is very much to make the program explicitly face errors instead of implicitly accept crashes — in other words, using a function that might fail at runtime introduces a possible error whether you handle it immediately or propagate it to other parts of the code.
I think the thing I find most useful about Zig existing is that it specifically pegs itself against Rust, which is a common example used for comparison when talking about safety at a low level of programming. Zigs existence shows that there is a not insignificant number of folks who like Rust but wish it was safer still. They wish it removed more crashes from existence and it feels appropriate to me that at least some Idris developers have that same appetite.
So, to end the tangent, I tend toward wanting to make as much of what we can’t know until runtime explicit in the program and as much of what we can know at compile time provably correct. Personal preference, but really what draws me to Idris.
@stefan-hoeck yes, you are correct Either Result Crash
analogy is wrong.
Perhaps more correct analogy might be this: each function returns a pair (Result, Maybe Crash)
.
@andrevidela
From the typechecker's perspective, once exit or idris_crash is reached, all bets are off and the rest of the program is bogus.
No idris_crash_total
signature is:
idris_crash_total : x -> String -> x
And from typechecker perspective it is indistinguishable from a function:
idris_crash_total v message = v
From this I hope you can infer that the goal of the totality checker is to provide some confidence that your program won't behave in a way that is not advertised by the type.
What type of errors do you think totality checker is meant to prevent?
My understanding is that totality checker itself is useless, and it is needed to implement dependent types, which are needed to prove invariants like ones I described above (e. g. inserting an item into a sorted list keeps the list sorted).
Id v = v
Is not the same as crash v = idris_crash
I don't have a repl handy but I'm fairly confident check : id = crash
would fail.
What type of errors do you think totality checker is meant to prevent?
head : List a -> a
head (x :: xs) = x
head [] = crash
finally id like to mention that you don't need a totality checker to have dependent types nor enforce invariants:
head : List a -> Maybe a
is documenting an invariant albeit not very clearly Edit: I forgot to mention something because it is so obvious that is escaped me but totality checking is also important to write proofs which become quite useless if they themselves can crash.
Id v = v Is not the same as crash v = idris_crash I don't have a repl handy but I'm fairly confident check : id = crash would fail.
Sorry, I didn't get that. What would fail? I could check. If there's a flaw in what I proposed, I would like to know.
head [] = crash
First, this is not full totality checker, it is just "covering" checker.
Here crash
use is unreasonable (Maybe a
is appropriate type), you cannot develop a practical language which will prevent shooting own foot when people want to. There's a million ways to do that. For example, returning just some value of appropriate type is a common pattern. It is not possible in head
function you described, but it is possible in the majority of functions (for example tail [] = []
— crash would be better than this).
Let me show an example. Consider the case of multiplication of integers. There are three options:
Error handling is not always acceptable: too much work, unergonomic API for the situation which will likely never occur.
So if you deny people to crash at runtime, they will just accept it will overflow to negative, and the bug will be much harder to debug than if we did runtime assertion. But the totality checker will be happy.
By the way, Rust did very well on error handling: it promotes using Result
/Option
types to return errors, and it's quite convenient, but makes crashes easy: there's assert!
macro, there's .unwrap()
shortcut. Because practically in the real world it is not possible to propagate all errors. For that particular case, checked multiplication can be done with a.checked_mul(b).unwrap()
.
scala, cpp templates
I don't understand what this is about. Although I think I know Scala or C++ relatively well.
You can enforce invariants with refinement types or with static analysis tools
If there's a tool which does it well, I would be happy to use it and abandon Idris. But so far Idris is doing better than others options I'm aware of.
totality checking is also important to write proofs
Yes, exactly.
which become quite useless if they themselves can crash
If proof crashes, then it is not a proof. idris_crash_total
obviously cannot be used for code evaluated at compile time. But it does not prevent proofs.
Sorry, I didn't get that. What would fail?
The typechecking would fail because you can't construct a proof that the two functions do the same thing:
id : a -> a
id v = v
crash : Void
crash = ?endOfProgram
fail : a -> String -> a
fail _ message = trace message $ absurd crash
test : Main.id True = Main.fail True "fail"
test = Refl -- can't unify `True` and `trace message $ absurd crash`
If this were to be allowed you would break functional extensionality which is a very desirable property of programs that we cannot prove so we simply have to try our best to ensure that two functions that advertise do the same thing actually do the same thing. Again this is a limitation of 21st century computer science, we can't express that in the type but that's something we want.
This is fine:
id : a -> a
id v = v
id' : a -> a
id' v = v
test2 : Main.id' True = Main.id' True
test2 = Refl
I'm not quite sure what the rest of your post is supposed to address in the current discussion. They seem to be inline with what we've been talking about, if I remember the Rust error model correctly they also use the Void
type for fatal errors, Swift does the same with Never
. If you have other questions I'm happy to answer
The typechecking would fail because you can't construct a proof that the two functions do the same thing:
OK, I think I left some misleading comments earlier.
idris_crash_total
function does not need to be public
. It should not behave exactly like id
. From typechecker point of view it is just a total opaque function (like cast
).
the Rust error model correctly they also use the Void type for fatal errors, Swift does the same with Never
They don't have totality, they don't do proofs, so they don't have problems with these types.
One observation about assert_total $ idris_crash
- they're all in the RefC code generator. This should be refactored so they're catchable errors, not crashes. (Ah, I see that's been noted above too.)
A function which can crash is inherently not total. Everything about the runtime, and everything about totality checking, is under the assumption that memory is available, there's no stack overflows, nobody switches off the computer, etc. idris_crash
is really there as a substitute for Haskell's error
except that it isn't recoverable via an IO exception, and gives a programmer a chance to give a more useful message when a partial function fails.
So my conclusion is that this doesn't belong in the library.
I haven’t read the conversation in full, but I’ll still add: any total program that needs an ability to crash, i.e. result in a trivial value (Unit) can be given such privilege, roughly speaking, by simply wrapping its type in a Maybe. That has a practical benefit of imprinting the information about possible failures into the type of the program itself. And any programmer, just by looking at the type, can understand and assume a lot about semantics of the program & have those assumptions backed up by the type system.
So if I see a program of type Maybe a
, I can be sure that running it will result in a value of type a
or “crash” ( = return a trivial value) in finite time
I think everything that needed to be said has been said. Therefore, if that's fine with everyone I'm going to close this to avoid encouraging more circular replies.
Add
idris_crash_total
function.The idea is this:
idris_crash
functions is not total, so every timeidris_crash
is used in total function, it need to be annotated withassert_total
. Which is not safe, because it accidentally make non-total function total.Consider this example:
This is obviously incorrect, but this pattern —
assert_total $ idris_crash
is used quite often, for example, 10 times insrc
folder of Idris2.The proposed solution is to add a function:
So the example above won't work with this function:
So
idris_crash_total
is safe: