diprism / perpl

The PERPL Compiler
MIT License
10 stars 5 forks source link

O(n⁶) is no joke #61

Open ccshan opened 2 years ago

ccshan commented 2 years ago

The following program, which tries to use a synchronous CFG to compute the edit distance between two strings with swapping, takes a long time to compile (-c is fast but outputting the FGG hangs). Deleting the Left and Right cases helps, and making the target strings shorter also helps. In the end though, the FGG computation runs out of memory on my laptop unless the target strings are Nil, so I don't know that this issue belongs in this repository.

data String = Nil | Cons Terminal String;
data Action = Stop | Left | Right | Both | Swap;
data Terminal = A | B;

extern act: Action;

define gen: (String -> String) * (String -> String) =
  case act of
  | Stop  -> (\ys. ys, \zs. zs)
  | Left  -> let x = sample uniform in (\ys. Cons x ys, \zs.        zs)
  | Right -> let x = sample uniform in (\ys.        ys, \zs. Cons x zs)
  | Both  -> let x = sample uniform in (\ys. Cons x ys, \zs. Cons x zs)
  | Swap  -> let (l1, r1) = gen in
             let (l2, r2) = gen in
             (\ys. l1 (l2 ys), \zs. r2 (r1 zs));

define equal: String -> String -> Unit = \xs: String. \ys: String.
  case xs of
    | Nil -> (case ys of Nil -> () | Cons _ _ -> sample fail)
    | Cons x xs -> (case ys of
      | Nil -> sample fail
      | Cons y ys -> if x == y then equal xs ys else sample fail);

let (l, r) = gen in
let () = equal (l Nil) (Cons A (Cons A (Cons B (Cons B Nil)))) in
let () = equal (r Nil) (Cons B (Cons B (Cons A (Cons A Nil)))) in
()
ccshan commented 2 years ago

The compiler refunctionalizes the first String input to equal and defunctionalizes the second String input to equal, as desired. The resulting FGG has one big factor called v=((_UnfoldString_inst0_ -> _UnfoldString_inst0_) & (Unit), (_UnfoldString_inst2_ -> _UnfoldString_inst2_) & (Unit))

davidweichiang commented 2 years ago

This compiles in a reasonable amount of time now, but presumably #97 will be needed to get -z to run faster.

ccshan commented 1 year ago

Well, for comparing strings of length 2, the following corrected code now runs within 10 seconds (-d -m fixed-point):

data String = Nil | Cons Terminal String;
data Action = Stop | Left | Right | Both | Swap;
data Terminal = A | B;

define sample_act = amb (factor 0.4 in Stop)
                        (factor 0.1 in Left)
                        (factor 0.1 in Right)
                        (factor 0.3 in Both)
                        (factor 0.1 in Swap);
define sample_term = factor 0.5 in amb A B;

define gen: (String -> String, String -> String) =
  case sample_act of
  | Stop  -> (\ys. ys, \zs. zs)
  | Left  -> let x = sample_term in let (l, r) = gen in (\ys. Cons x (l ys),              r    )
  | Right -> let x = sample_term in let (l, r) = gen in (             l    , \zs. Cons x (r zs))
  | Both  -> let x = sample_term in let (l, r) = gen in (\ys. Cons x (l ys), \zs. Cons x (r zs))
  | Swap  -> let (l1, r1) = gen in
             let (l2, r2) = gen in
             (\ys. l1 (l2 ys), \zs. r2 (r1 zs));

define equal: String -> String -> () = \xs: String. \ys: String.
  case xs of
    | Nil -> (case ys of Nil -> () | Cons _ _ -> fail)
    | Cons x xs -> (case ys of
      | Nil -> fail
      | Cons y ys -> if x == y then equal xs ys else fail);

let (l, r) = gen in
let () = equal (l Nil) (Cons B (Cons A Nil)) in
let () = equal (r Nil) (Cons A (Cons B Nil)) in
()

Also -d -m newton can give the correct answer but generates many warnings like Attempt to unify (A(2)*B(17)*C(17)) and (D(2)*(E(17)*F(17))) indicates index type mismatch. I suspect I should remove the reassociate kludge and instead fix the reshapings that motivated reassociate more carefully.