diprism / perpl

The PERPL Compiler

Tests / examples #13

Closed davidweichiang closed 1 year ago

davidweichiang commented 2 years ago

I don't know what kind of testing framework is usually used, but it would be good to have a bunch of example programs and their corresponding FGGs.

davidweichiang commented 2 years ago

1d10319

davidweichiang commented 2 years ago

The only way to compute an expectation would be to compute the full distribution, so it can't be an expectation over natural numbers.

davidweichiang commented 2 years ago

I added an example for the question "What's the probability that HTH comes before HHT?" Weirdly, this has a different answer in general from the question "Which has the lower expected number of flips?"
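
For reference, here is a quick Monte Carlo sketch (Python, independent of the repo, assuming a fair coin) of the two quantities being compared; the race probability should come out near 1/3, and the expected waits near 10 and 8:

import random

random.seed(0)

def wait_for(pattern):
    # flips of a fair coin until `pattern` first appears
    seq = ""
    while not seq.endswith(pattern):
        seq += random.choice("HT")
    return len(seq)

def hth_first():
    # 1 if HTH appears before HHT, else 0
    seq = ""
    while True:
        seq += random.choice("HT")
        if seq.endswith("HTH"):
            return 1
        if seq.endswith("HHT"):
            return 0

N = 100_000
print(sum(hth_first() for _ in range(N)) / N)      # about 1/3
print(sum(wait_for("HTH") for _ in range(N)) / N)  # about 10
print(sum(wait_for("HHT") for _ in range(N)) / N)  # about 8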

ccshan commented 2 years ago

We can compute the expectation by processing the result Nat with the following function, which in effect weights an outcome n by n:

define weigh = \n.
  case n of | Zero -> sample fail
            | Succ n -> amb unit (weigh n);

ccshan commented 2 years ago

I made autodiff work with this patch, which probably breaks things:

diff --git a/bin/sum_product.py b/bin/sum_product.py
index 71a45fa..82def82 100755
--- a/bin/sum_product.py
+++ b/bin/sum_product.py
@@ -27,5 +27,19 @@ if __name__ == '__main__':

     for el, fac in fgg.interp.factors.items():
         fac.weights = torch.as_tensor(fac.weights, dtype=torch.get_default_dtype())
-    
-    print(json.dumps(fggs.formats.weights_to_json(fggs.sum_product(fgg, method=args.method))))
+
+    for name, weights in args.weights:
+        el = fgg.grammar.get_edge_label(name)
+        fgg.interp.factors[el].weights.requires_grad_()
+
+    sp = fggs.sum_product(fgg, method=args.method)
+    if args.weights:
+        sp.backward(torch.sparse_coo_tensor([[0] for _ in sp.size()],
+                                            [1], sp.size()).to_dense())
+    print(json.dumps(fggs.formats.weights_to_json(sp)))
+    print(json.dumps({
+        name: fggs.formats.weights_to_json(
+                  fgg.interp.factors[fgg.grammar.get_edge_label(name)]
+                  .weights.grad)
+        for name, weights in args.weights
+    }))

Now coin_unfair.ppl produces {"unfair": [2.3809525966644287, 2.3809525966644287]}, and the fact that the two numbers are equal means that indeed the derivative of the von Neumann example is zero.

If I remove .to_dense() above, then there's always an addition error that I can fix, and sometimes a multiplication error that I don't know how to fix (especially since printing the tensors involved seems to make the error go away)...

davidweichiang commented 2 years ago

The weigh trick sounds like it would work. Have you tried it?

We could also compute expectations using gradients (gradient of log Z with respect to log factor = expected count of factor).
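
As a toy sanity check of that identity (a sketch, not repo code; Z is a hand-built two-outcome partition function), the gradient of log Z with respect to log w matches the expected number of times the factor fires:

import torch

log_w = torch.tensor(0.0, requires_grad=True)  # log of the factor weight, so w = 1
w = log_w.exp()
Z = 3 * w + 2          # one outcome uses the factor once (weight 3*w), one never (weight 2)
torch.log(Z).backward()
print(log_w.grad.item())   # 0.6
print((3 * w / Z).item())  # expected count of the factor = 0.6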

davidweichiang commented 2 years ago

Why do you need to use sparse tensors?

davidweichiang commented 2 years ago

@ccshan Please try out https://github.com/diprism/fgg-implementation/pull/147 for computing both gradients and expectations.

ccshan commented 2 years ago

Why do you need to use sparse tensors?

Probably just premature optimization on my part...
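
For what it's worth, a dense equivalent of that one-hot seed (a sketch; sp below is just a stand-in for the sum-product result) avoids sparse tensors, and therefore sparse autograd, entirely:

import torch

sp = torch.rand(2, 3)        # stand-in for the sum-product result
seed = torch.zeros_like(sp)  # one-hot tensor of sp's shape, 1 at index (0, ..., 0)
seed[(0,) * sp.dim()] = 1
# sp.backward(seed)          # same seed the sparse version yields after .to_dense()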

The weigh trick sounds like it would work. Have you tried it?

Yes! I got 7 and 5 out of code_{hth,hht}.ppl

davidweichiang commented 2 years ago

I just pushed fixes and renamed to pattern1.ppl. (The right answers should be 10 and 8 -- it was an off-by-3 error!)

There's also a pattern2.ppl that uses a different method for expectations, and gives the same answer.

davidweichiang commented 2 years ago

Should we delete examples that are no longer interesting? Or maybe split code/ into tests/ and examples/?

davidweichiang commented 2 years ago

What mechanism(s) can we add to check that the tests actually produce the correct output?
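
One simple mechanism might be to store each example's expected output next to it and diff against it on every run. A minimal sketch, where the tests/ layout, the *.fgg.json naming, and the ./perplc invocation are all assumptions:

import glob, json, subprocess, sys

failures = 0
for ppl in sorted(glob.glob("tests/*.ppl")):
    expected_path = ppl[:-len(".ppl")] + ".fgg.json"  # assumed naming convention
    with open(expected_path) as f:
        expected = json.load(f)
    # assumed compiler invocation; adjust to however the compiler is actually run
    result = subprocess.run(["./perplc", ppl], capture_output=True, text=True, check=True)
    if json.loads(result.stdout) != expected:
        print("FAIL:", ppl)
        failures += 1
sys.exit(1 if failures else 0)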

davidweichiang commented 2 years ago

Maybe the third option won't be so bad to implement if we limit to linear equations? Is there a standard linear solver?
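
If the equations really are linear, i.e. of the form x = A x + b for constant A and b (an assumption about what that option would produce), then any off-the-shelf solver applies; for example, in PyTorch:

import torch

# hypothetical linear sum-product equations x = A x + b; solve (I - A) x = b
A = torch.tensor([[0.00, 0.50],
                  [0.25, 0.00]])
b = torch.tensor([0.5, 0.75])
x = torch.linalg.solve(torch.eye(2) - A, b)
print(x)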