Closed ccshan closed 2 years ago
I think it's because String_inst0
never has its Nil
get constructed. So the numbering of Nil
and Cons
gets out of sync.
$ ./compiler.exe -c -e ken1.ppl
data _Unit_ = _unit_;
data Unit = unit;
data Terminal = A | B;
data String_inst0 = Nil_inst0 | Cons_inst0 Terminal String_inst0;
data String_inst1 = Nil_inst1 | Cons_inst1 Terminal String_inst1;
data Bool = False | True;
define prefix_inst0 : String_inst0 -> String_inst1 -> Unit = \ xs : String_inst0. \ ys : String_inst1. case xs of Nil_inst0 -> unit | Cons_inst0 x xs0 -> (case ys of Nil_inst1 -> unit | Cons_inst1 y ys0 -> (if x == y then prefix_inst0 xs0 ys0 else sample fail : Unit));
define inp_inst0 : String_inst1 = Cons_inst1 A (Cons_inst1 A (Cons_inst1 B Nil_inst0));
define gen_inst0 : String_inst0 = Cons_inst0 (sample uniform : Terminal) gen_inst0;
prefix_inst0 gen_inst0 inp_inst0
A more minimal example.
$ cat bug.ppl
data Nat = Zero | Succ Nat;
define f = \a. \b. False;
f Zero (Succ Zero)
$ ./compiler.exe -c -e bug.ppl
data _Unit_ = _unit_;
data Nat_inst0 = Zero_inst0 | Succ_inst0 Nat_inst0;
data Nat_inst1 = Zero_inst1 | Succ_inst1 Nat_inst1;
data Bool = False | True;
define f_inst0 : Nat_inst0 -> Nat_inst1 -> Bool = \ a : Nat_inst0. \ b : Nat_inst1. False;
f_inst0 Zero_inst0 (Succ_inst0 Zero_inst1)
The bug is in this line at Monomorphize.hs:196:
xis'' = fmap (\ tiss -> Map.fromList (zip (Set.toList tiss) [0..])) (semimap xis')
If a constructor never gets called, then it gets a different list of numbers from the datatype itself.
However, I'm not sure yet what the best fix is.
Okay, I think I've narrowed down the issue to this: xis''
is a dict like this:
{Nat: [tag1, tag3],
Succ: [tag3],
Zero: [tag1, tag3]}
Now when instantiating, Monomorphize renames all occurrences of Nat¹
to Nat_inst0
(because tag1
is at index 0 in the list) and Nat³
to Nat_inst1
. Similarly for Zero
. However, as Prof. Chiang noted, Succ
only ever gets called as Succ³
so that becomes Succ_inst0
. So with this process, it should create the datatypes:
data Nat_inst0 = Zero_inst0 | Succ_inst???;
data Nat_inst1 = Zero_inst1 | Succ_inst0;
Which is admittedly very confusing. Currently, the code assumes that the constructors and their datatypes always have the same instantiations, which is not necessarily true (i.e. when creating Nat_instX
, it naively renames the constructors Zero_instX
and Succ_instX
). So I'm going to override the constructors' instantiations with the same ones as their datatypes, so that the constructor _instX
s always match up with the datatype name _instX
s. I think this should fix the bug and generate the expected datatype and constructor names.
This bug is triggered by the following program (and not if
Cons B Nil
is replaced byCons B inp
):Running
compiler.exe -c
produces a program withwhich has the type error
Failed to unify _FoldString_inst1_ and String_inst0, in the definition _applyString_inst1_, in the expression Cons_inst1 B
. It seems that the definitionis missing a case for the
Nil
in the original definition ofinp
.