leanprover / lean3

Lean Theorem Prover
http://leanprover.github.io/
Apache License 2.0
2.15k stars 216 forks source link

VM error #1518

Open gebner opened 7 years ago

gebner commented 7 years ago

The following fails with error: vm check failed: i < 2 (possibly due to incorrect axioms, or sorry):

def join (sep : string) : list string → string
| [x]     := x
| []      := ""
| (x::xs) := x ++ sep ++ join xs

namespace toml

inductive value : Type
| str : string → value
| bool : bool → value
| table : list (string × value) → value

namespace value

private def to_string_core : ∀ (n : ℕ), value → string
| _ (value.str s) := "\"" ++ s ++ "\""
| _ (value.bool tt) := "true"
| _ (value.bool ff) := "false"
| (n+1) (value.table cs) :=
  "{" ++ join ", " (do (k, v) ← cs, [k ++ " = " ++ to_string_core n v]) ++ "}"
| 0 _ := "<max recursion depth reached>"

protected def to_string : ∀ (v : value), string
| (table cs) := join "\n" $ do (h, c) ← cs,
  match c with
  | table ds :=
    ["[" ++ h ++ "]\n" ++
     join "" (do (k, v) ← ds,
       [k ++ " = " ++ to_string_core (sizeof v) v ++ "\n"])]
  | _ := ["<error> " ++ to_string_core (sizeof c) c]
  end
| v := to_string_core (sizeof v) v

instance : has_to_string value :=
⟨value.to_string⟩

#eval value.to_string (table [("foo", table [("bar", table [])])])

end value
end toml
leodemoura commented 7 years ago

@dselsam I believe the problem is the sizeof instances of nested inductive datatypes. They use the internal recursors generated by the inductive compiler. Example:

@[irreducible]
protected def _nest_2_2.toml.value._mut_.sizeof : Π (a : psum unit (psum unit unit)), _nest_2_2.toml.value._mut_ a → ℕ :=
λ (a : psum unit (psum unit unit)),
  _nest_2_2.toml.value._mut_.rec (λ (a : string), 1 + sizeof a) (λ (a : bool), 1 + sizeof a)
    (λ (a : _nest_2_2.toml.value._mut_ (psum.inr (psum.inr ()))) (ih : ℕ), 1 + ih)
    (λ (fst : string) (snd : _nest_2_2.toml.value._mut_ (psum.inl ())) (ih : ℕ), 1 + sizeof fst + ih)
    1
    (λ (a : _nest_2_2.toml.value._mut_ (psum.inr (psum.inl ())))
     (a : _nest_2_2.toml.value._mut_ (psum.inr (psum.inr ()))) (ih ih_1 : ℕ), 1 + ih + ih_1)

The case for table : list (string × value) → value will not work correctly because it expects the data to be the "auxiliary list type" introduced by the inductive compiler, the nil/cons for this type uses a different constructor indices (instead of 0 and). So, the code will not work as expected. Recall that in the compilation scheme we have discussed, users are not allowed to use these internal recursors. Here is another way to explain the problem:

I'm not sure how to fix this code. One easy (but weird) fix is to disallow sizeof in compiled code. That is, sizeof is used only for justifying well-founded recursion.

@gebner You can workaround the problem by not using sizeof.

BTW, @dselsam I found another minor problem with sizeof. Instances for subtype, char and string were missing (I fixed this). The VM does not have code for sizeof functions generated by the inductive compiler. Thus, the compiler has to unfold every application. Moreover, some of them are not small. Of course, if we disallow sizeof in compiled code, this is not a problem.

dselsam commented 7 years ago

One easy (but weird) fix is to disallow sizeof in compiled code. That is, sizeof is used only for justifying well-founded recursion.

[Wearing my (on-and-off) developer hat] I like this fix. With well-founded recursion, it will be easy enough for a user to define their own size function.

leodemoura commented 7 years ago

[Wearing my (on-and-off) developer hat] I like this fix. With well-founded recursion, it will be easy enough for a user to define their own size function.

Ok, I will go with this one. @gebner Does it sound reasonable to you?

gebner commented 7 years ago

One easy (but weird) fix is to disallow sizeof in compiled code. @gebner Does it sound reasonable to you?

It certainly fixes the problem for now, and having VM does not have code for toml.value.has_sizeof is a better error message. But it feels very unsatisfactory to have a "computable" function that can't be executed on the VM. For example you can't use #eval to try out sizeof on examples then:

#eval sizeof [1,2,3]

If I understand it correctly, sizeof is the only function that is defined using the hidden underlying recursors, everything else is then implemented using the new recursors or well-founded recursion? I think we should generate VM code for sizeof. It's not urgent, though, and I can insert it somewhere on my todo list.

leodemoura commented 7 years ago

I tried to fix this issue yesterday. The commit above prevents the VM failure from happening. We get the following error message instead:

1518.lean:28:17: error: equation compiler failed to generate bytecode for auxiliary declaration 'toml.value.to_string._match_2'
nested exception message:
failed to generate bytecode, code depends on internal definition 'toml.value.sizeof' which cannot be used in executable code

However, users may still experience the failure if they use the recursors generated by the inductive compiler directly. I cannot mark them as forbidden since the instances generated by the equation compiler are fine since they correctly use the pack/unpack operations. One hackish solution is to have an attribute to mark definitions/constants that cannot be used by users in the front end.

gebner commented 7 years ago

However, users may still experience the failure if they use the recursors generated by the inductive compiler directly. I cannot mark them as forbidden since the instances generated by the equation compiler are fine since they correctly use the pack/unpack operations.

Do you refer to the recursors for the internal unrolled inductive type (i.e., _nest_1_1.toml.value.rec)? Then the problem with the equations compiler should be solved as soon as we use the generated equations for VM compilation, since we'll no longer need recursors on the VM if I understand this correctly (at least for functions defined by the equations compiler).

leodemoura commented 7 years ago

Do you refer to the recursors for the internal unrolled inductive type (i.e., _nest_1_1.toml.value.rec)? Then the problem with the equations compiler should be solved as soon as we use the generated equations for VM compilation, since we'll no longer need recursors on the VM if I understand this correctly (at least for functions defined by the equations compiler).

We would still use the *.cases_on recursors. The new code generator would perform the following steps:

gebner commented 7 years ago

We would still use the *.cases_on recursors.

Yes, but we can use the user-facing cases_on recursor (which is compiled correctly) instead of the internal one. As far as I can tell, we only need to use the internal recursors because *.rec is way too weak to define recursive functions (it doesn't have induction hypotheses for the nested occurrences). This problem doesn't occur with cases_on.

leodemoura commented 7 years ago

Yes, but we can use the user-facing cases_on recursor (which is compiled correctly) instead of the internal one. As far as I can tell, we only need to use the internal recursors because *.rec is way too weak to define recursive functions (it doesn't have induction hypotheses for the nested occurrences). This problem doesn't occur with cases_on.

We would have to change the elim_match.lean module :( It doesn't use the user-facing cases_on, but the nasty internal one. The main issue is that we need to prove the equational lemmas. By using the nasty internal cases_on, we simplify this step. I only had to add support for the pack/unpack lemmas. If we use the user-facing cases_on, we will have to do two things: 1- Make sure elim_match correctly uses them. 2 a - Generate auxiliary equational lemmas for the user-facing cases_on (they do not hold definitionally). @dselsam and I considered this option, but it is a lot of work. OR 2 b - Unfold the user-facing cases_on when proving the equational lemmas and reuse what we already have. I haven't tried this option. I think it is feasible.