namin / clpsmt-miniKanren

CLP(SMT) on top of miniKanren
MIT License
35 stars 8 forks source link

Confirm a bug about `declare-const` and `==` #10

Open chansey97 opened 3 years ago

chansey97 commented 3 years ago

Executing the following program crash:

(run 2 (q)
      (z/ `(declare-const ,q Int))
      (== q 6))
Exception in call-z3: error in z3 out.smt > out.txt

smt.out

(declare-const 6 Int)
(check-sat)
(exit)

It seems that z/reify-SM mistakenly grounds SMT variables.

PS. smt-assumptions no this problem.

I can fix this issue, but before that I must confirm it indeed a bug.

I am not sure this is a bug because

  1. (z/ `(declare-const ,q Int)) may expect users to use (z/assert (= ,q 6)) instead of (== q 6).
  2. It can pass most of test cases, e.g. full-interp-extended-tests, full-abstract-interp-extended-tests, etc. (surprising!).

Thanks.

namin commented 3 years ago

So instead of grounding SMT variables, you would assert their values?

chansey97 commented 3 years ago

It seems that z/reify-SM mistakenly grounds SMT variables.

I mean z/reify-SM should ignore these grounded declares. See below.

The initial motivation is that I'd like to write an add1o by clpsmt-miniKanren and runforwards and backwards.

(define add1o
  (lambda (n out)
    (z/assert `(= ,out (+ ,n 1)))))

(run 5 (q)
    (fresh (n out)
      (add1o n out)
      (== q `(,n ,out))))

(run 5 (q)
    (fresh (n)
      (add1o n 1)
      (== q n)))

(run 5 (q)
    (fresh (out)
      (add1o 1 out)
      (== q out)))

;; '((0 1) (1 2) (-1 0) (2 3) (-2 -1))
;; '(0)
;; '(2)

Nice, no problem!

Then I'd like to define add1o by declaring n and out as Int explicitly:

(define add1o
  (lambda (n out)
    (fresh (n^ out^)
           (z/ `(declare-const ,n^ Int))
           (z/ `(declare-const ,out^ Int))
           (== n^ n)
           (== out^ out)
           (z/assert `(= ,out^ (+ ,n^ 1))))))

However the following code complains:

(run 5 (q)
    (fresh (n)
      (add1o n 1)
      (== q n)))
Exception in call-z3: error in z3 out.smt > out.txt

out.smt

(declare-const 1 Int) <---- invalid declare syntax!
(declare-const _.0 Int)
(assert (= 1 (+ _.0 1)))
(check-sat)
(exit)

The root cause it that z/reify-SM mistakenly use these invalid declares.

There was no problem in the earlier version, because at that time you filtered out these invalid declares. But I don’t know why you commented it out later?

See this revision and I also do not understand the log message:

  • do not assume unification of smt vars
namin commented 3 years ago

I don't quite remember. From the commit message, it sounds like I expect users to use z/assert instead of =. I don't see why this line hurts. We can uncomment it, I think? Can you try and see if anything breaks? Thanks for the detective work!

chansey97 commented 3 years ago

If we uncomment it, the add1o test will be OK. (Although there is a better way to fix, I may create a new pull-request, anyway...).

However, this issue is not only whether a bug or not, but also involves a more deep thing...

As I said in this thread

users seemly need to learn some technique for working around this restriction...

Even if we don't fix this issue, we still can let add1o OK, but using a different pattern:

(define add1o
  (lambda (n out)
    (z/assert `(= ,out (+ ,n 1)))))

(run 5 (q)
     (fresh (n out)
            (z/ `(declare-const ,n Int))
            (z/ `(declare-const ,out Int))
            (add1o n out)
            (== q `(,n ,out))))

(run 5 (q)
     (fresh (n)
            (z/ `(declare-const ,n Int))
            (add1o n 1)
            (== q n)))

(run 5 (q)
     (fresh (out)
            (z/ `(declare-const ,out Int))
            (add1o 1 out)
            (== q out)))

;; '((0 1) (1 2) (-1 0) (2 3) (-2 -1))
;; '(0)
;; '(2)

That is hoping users place their declares at the top as much as possible (here, we lifted declare-const to the caller position).

So your log message may hope users use this pattern.

I'm not certain that would decrease relational property or lose expressiveness? (PS. the declare-const is original not very relational, because we can't reorder the variable's declaraing and using, but that's another story.)

The reason why the current version can pass many test cases might be the usage of this pattern + declaring integer by default.

chansey97 commented 3 years ago

BTW, Int is special in the current version.

What do you think about deleting this feature and rewriting all the test cases?

If it is not fully relational, it should raise errors as much as possible.

Is there any difficulty to write these test cases if we have no "integer by default"?

chansey97 commented 3 years ago

In the previous comment, I used an example:

(define add1o
  (lambda (n out)
    (fresh (n^ out^) ; NOTE: n^ out^ are logic variables
           (z/ `(declare-const ,n^ Int))
           (z/ `(declare-const ,out^ Int))
           (== n^ n)
           (== out^ out)
           (z/assert `(= ,out^ (+ ,n^ 1))))))

This example is verbose, because I thought z/declare-const must be applied to a logic variable. Here n and out may not be logic variables, so I have to fresh n^ and out^. But the current clpsmt-miniKanren assumes that z/declare-const can only be applied to be a fresh logic variable and that fresh logic variable will never be unified with a ground value by ==. This is why it crashes.

However, ideally, z/declare-const can be applied to not only logic variables but also ground values. It just declare a logic variable or a ground value in a specific domain.

Consider the following program in cKanren:

(run* (q)
      (infd 1 (range 0 9)))  ; NOTE: 1 is a ground value
;; (_.0) 

However, in the current clpsmt-miniKanren:

;; Apply z/declare-const to a ground value
(run 2 (q)
     (z/ `(declare-const 1 Int)))
Exception in call-z3: error in z3 out.smt > out.txt

;; Apply z/declare-const to a logic variable, then that logic variable will unify with a ground value
(run 2 (q)
      (z/ `(declare-const ,q Int))
      (== q 6))
Exception in call-z3: error in z3 out.smt > out.txt

"Fixing" this issue can make the behavior more like cKanren, except that (z/ `(declare-const ,X Int)) has no responsibility to check type.

I hope we can write the following code in the future:

(define add1o
  (lambda (n out) ; NOTE: n and out can be logic variables or ground values
    (fresh ()
           (z/ `(declare-const ,n Int))
           (z/ `(declare-const ,out Int))
           (z/assert `(= ,out (+ ,n 1))))))

I am still not sure whether it is a bug because we can z/declare fresh variables at the top as much as possible.

Frankly, I am now more inclined that it is not a bug. Indeed the semantics of z/declare in clpsmt-miniKanren and infd in cKanren are slightly different. I think we can keep the current status (if I see break, I will reopen).

namin commented 3 years ago

Thanks for your thoughts and examples. To proceed with a design, I think maybe it makes sense to distinguish between a low-level API, where one uses declare-const and assert and a higher level API that looks more like mk. What do you think? I think we need to redesign things a bit, and it would be good if it was driven by example.

chansey97 commented 3 years ago

@namin I think there is no serious problem in current design, but if users does not know some preset assumptions, they may be trapped.

For example, the relation (equalo x y out). No one know the parameter out is a SMT variable or not. If the equalo internally use out as a SMT variable, then the caller need z/decare it first. If not, they can apply equalo directly.

namin commented 3 years ago

But I don't think it needs to be declared first, because we can reorder the smt commands before passing them to the solver.

But there is still the problem that we need a declare.

If we don't have a default type, we need to infer the type, which I am not sure how we would do.

If we have a default type, then we can just use that for implicit declares -- I think that's what I did.

Maybe we can have the convention that when you use a variable, you declare it first if you know the type. So for example, equal could declare out.

If you don't know the type, then what do you do?

chansey97 commented 3 years ago

it would be good if it was driven by example.

I agree. I think I can summarize some error-prone examples.

But I don't think it needs to be declared first, because we can reorder the smt commands before passing them to the solver.

The following is the example I just met.

(define all-diffo
  (lambda (l)
    (conde
     ((== l '()))
     ((fresh (t v)
             (symbolo t)
             (== l `((,t ,v)))))
     ((fresh (t1 t2 v1 v2 xs)
             (symbolo t1)
             (symbolo t2)
             (== l `((,t1 ,v1) (,t2 ,v2) . ,xs))
             (conde
              ((== t1 t2)  ; two tags are equal
               (conde
                ((== t1 'bv3) ; tags are bitvec3
                 (z/assert `(not (= ,v1 ,v2))))
                ((=/= t1 'bv3)  ; tags are bitvec3
                 (=/= v1 v2))))
              ((=/= t1 t2)))  ; two tags are not equal
             (all-diffo `((,t1 ,v1) . ,xs))
             (all-diffo `((,t2 ,v2) . ,xs)))))))

The all-diffo ensures that all variables and values found within the flat list are different from each other. I use tags to maximize the use of SMT.

1st test:

(run 5 (q)
     (all-diffo `((bv3 bitvec-100) (bv3 bitvec-111) (symb apple))))
;; '(_.0)

It is no problem, because bitvec-100 and bitvec-100 are constant literal which compatible with (z/assert `(not (= ,v1 ,v2))).

2nd test:

(run 5 (q)
     (all-diffo `((symb apple) (bv3 bitvec-010) ,q)))
;; Exception in call-z3: error in z3 out.smt > out.txt

out.smt

(declare-fun _.0 () Int)
(assert (not (= #b010 _.0)))
(check-sat)
(exit)

The reason is that we do not z/ declare BitVec at all and clpsmt-miniKanren attempts to use "integer by default". We cannot z/declare q because we don't know q's type. So we must z/declare inside all-diffo:

(define all-diffo
  (lambda (l)
    (conde
     ((== l '()))
     ((fresh (t v)
             (symbolo t)
             (== l `((,t ,v)))))
     ((fresh (t1 t2 v1 v2 xs)
             (symbolo t1)
             (symbolo t2)
             (== l `((,t1 ,v1) (,t2 ,v2) . ,xs))
             (conde
              ((== t1 t2)  ; two tags are equal
               (conde
                ((== t1 'bv3) ; tags are bitvec
                 (z/ `(declare-const ,v1 (_ BitVec 3)))
                 (z/ `(declare-const ,v2 (_ BitVec 3)))
                 (z/assert `(not (= ,v1 ,v2))))
                ((=/= t1 'bv3)
                 (=/= v1 v2))))
              ((=/= t1 t2)))  ; two tags are not equal
             (all-diffo `((,t1 ,v1) . ,xs))
             (all-diffo `((,t2 ,v2) . ,xs)))))))

However this will cause the 1st test crash (v1 and v2 are logic variable unified with ground values, that breaks a preset assumption!). So we must z/declare these constant literals as SMT variables:

(run 5 (q)
     (fresh (a b)
            (z/ `(declare-const ,a (_ BitVec 3)))
            (z/ `(declare-const ,b (_ BitVec 3)))
            (z/assert `(= ,a bitvec-100))
            (z/assert `(= ,b bitvec-111))
            (all-diffo `((bv3 ,a) (bv3 ,b) (symb apple) ,q))
            ))

Then OK. clpsmt-miniKanren will internally remove these duplcate z/decalres.

So we get another preset assumption: all the SMT constant literals should be z/declared and z/assert as SMT variables.

namin commented 3 years ago

(v1 and v2 are logic variable unified with ground values, that breaks a preset assumption!)

I don't understand this. Can you elaborate?

I don't think we want the user to write and use all-diffo this way. Do you agree?

chansey97 commented 3 years ago

I don't understand this. Can you elaborate?

I mean the current clpsmt-miniKanren assumes that z/declare-const can only be applied to be a fresh logic variable and that fresh logic variable will never be unified with a ground value by ==.

I don't think we want the user to write and use all-diffo this way. Do you agree?

I agree.

We can rewrite it:

(define all-diffo
  (lambda (l)
    (conde
     ((== l '()))
     ((fresh (t v)
             (symbolo t)
             (== l `((,t ,v)))))
     ((fresh (x1 x2 xs)
             (== l `(,x1 ,x2 . ,xs))
             (equalo x1 x2 #f)
             (all-diffo `(,x1 . ,xs))
             (all-diffo `(,x2 . ,xs)))))))

The equalo still uses tags internally.

namin commented 3 years ago

I mean the current clpsmt-miniKanren assumes that z/declare-const can only be applied to be a fresh logic variable and that fresh logic variable will never be unified with a ground value by ==.

So in that case, do we just want to omit declare-const here? Like without the ; in the current code?

chansey97 commented 3 years ago

So in that case, do we just want to omit declare-const here? Like without the ; in the current code?

I don't quite understand.

As I mentioned above, the caller site may synthesize a q which packed in some tag we haven't known. So we must call declare-const inside all-diffo.

namin commented 3 years ago

I mean, would the change to omit declarations for non-variables be useful here?

chansey97 commented 3 years ago

I mean, would the change to omit declarations for non-variables be useful here?

Yes, then everything fine, e.g. uncomment this line.

But we can't say the current code is wrong (just more verbose for users...).

namin commented 3 years ago

Any downside to not uncommenting out this line?

Do you want to send me a pull request along with a regression test?

Thanks.

chansey97 commented 3 years ago

Any downside to not uncommenting out this line?

There are two downsides AFAIK for uncommenting out this line:

  1. There is small problem for Bool, because that line will filter symbol?, but true / false are already symbol? and can be a identifer in smt format. I can give you a new patch to fix. So it is not a real downside.

  2. z/ declare-const currently has no responsibility to check type, e.g. (z/ `(declare-const 'a Int)) will success. This also can be easily fixed I think though. In fact, we cannot encounter the wrong type, unless the programmer write the wrong code. (So I think there is no need to fix)

namin commented 3 years ago

I see.

Good catch for 1.

For 2, I guess we don’t have to consider type failures as query failures. Maybe we should?

chansey97 commented 3 years ago

Sorry, you asked "any downside to not uncommenting out this line?"

No downside AFAIK, except verbose as I said above.

not uncommenting means keep current.

namin commented 3 years ago

For 1, don’t we have a problem with symbol types as well?

namin commented 3 years ago

Sorry, you asked "any downside to not uncommenting out this line?"

No downside AFAIK, except verbose as I said above.

not uncommenting means keep current.

I meant uncommenting.

chansey97 commented 3 years ago

For 1, don’t we have a problem with symbol types as well?

Yes,

namin commented 3 years ago

For 1, don’t we have a problem with symbol types as well?

Yes,

Sounds like we need to filter before reification.

chansey97 commented 3 years ago

Sounds like we need to filter before reification.

Yes, just

(M (walk* (reverse M) S))
(M (remp (lambda (x) (and (declares? x) (not (var? (cadr x))))) M)) ;<--- add this
(S (reify-S M '()))

and fix a small bug of declares?.

chansey97 commented 3 years ago

For 2, I guess we don’t have to consider type failures as query failures. Maybe we should?

I think there should be no type failures, unless the programmer write the wrong code.

namin commented 3 years ago

For 2, I guess we don’t have to consider type failures as query failures. Maybe we should?

I think there should be no type failures, unless the programmer write the wrong code.

OK, but we shouldn't make it easy to write the wrong code. :)

namin commented 3 years ago

Sounds like we need to filter before reification.

Yes, just

(M (walk* (reverse M) S))
(M (remp (lambda (x) (and (declares? x) (not (var? (cadr x))))) M)) ;<--- add this
(S (reify-S M '()))

and fix a small bug of declares?.

Sounds good. Go for it.

chansey97 commented 3 years ago

Do you want to send me a pull request along with a regression test?

There is another downside I didn't think of yesterday.

There may be some performance issue:

  1. Because of that new line, each time we call z/reify-SM, it will traverse M one more time. (Comparing with SMT solvers this may not be a big problem though.)

  2. The current SMT hook implementation accumulates z/ declare-const blindly (which including a potential duplicate problem of declare-const). It might cause M longer and longer.

    For example,

    (run 5 (q)
        (z/ `(declare-const ,q (_ BitVec 3)))
        (z/ `(declare-const ,q (_ BitVec 3)))
        (z/assert `(= ,q bitvec-100))
        (λ (c) (printf "M ~a\n" (c->M c)) c)
    
        (z/ `(declare-const ,q (_ BitVec 3)))
        (z/ `(declare-const ,q (_ BitVec 3)))
        (z/assert `(= ,q bitvec-100))
        (λ (c) (printf "M ~a\n" (c->M c)) c))
    
    ;; M ((assert (= #(q) bitvec-100))
    ;;    (declare-const #(q) (_ BitVec 3))
    ;;    (declare-const #(q) (_ BitVec 3)))
    
    ;; M ((assert (= #(q) bitvec-100))
    ;;    (declare-const #(q) (_ BitVec 3))
    ;;    (declare-const #(q) (_ BitVec 3))
    ;;    (assert (= #(q) bitvec-100))
    ;;    (declare-const #(q) (_ BitVec 3))
    ;;    (declare-const #(q) (_ BitVec 3)))

    If we use this patch, those useless non-variables z/declares will met the similar duplicate problem. For example,

    (define (test1 x)
     (fresh ()
            (z/ `(declare-const ,x (_ BitVec 3)))
            (z/assert `(= ,x bitvec-100))
            (λ (c) (printf "test1 M ~a\n" (c->M c)) c)
            (test2 x)
            ))
    
    (define (test2 x)
     (fresh ()
            (z/ `(declare-const ,x (_ BitVec 3)))
            (z/assert `(= ,x bitvec-100))
            (λ (c) (printf "test2 M ~a\n" (c->M c)) c)
            ))
    
    (run 5 (q)
       ;; Now we can write the SMT constant literals directly and move `z/ declare-const` into relation
        (test1 'bitvec-100))
    
    ;; test1 M ((assert (= bitvec-100 bitvec-100))
    ;;          (declare-const bitvec-100 (_ BitVec 3)))
    
    ;; test2 M ((assert (= bitvec-100 bitvec-100))
    ;;          (declare-const bitvec-100 (_ BitVec 3))
    ;;          (assert (= bitvec-100 bitvec-100))
    ;;          (declare-const bitvec-100 (_ BitVec 3)))

    If we keep current, the code above will crash. That means we force users to z/ declare-const at the top. Then users may realize that they can remove the duplicate z/ declare-consts inside relations:

    (define (test1 x)
     (fresh ()
            (z/assert `(= ,x bitvec-100))
            (λ (c) (printf "test1 M ~a\n" (c->M c)) c)
            (test2 x)
            ))
    
    (define (test2 x)
     (fresh ()
            (z/assert `(= ,x bitvec-100))
            (λ (c) (printf "test2 M ~a\n" (c->M c)) c)
            ))
    
    (run 5 (q)
        ;; z/ declare-const at the top
        (z/ `(declare-const ,q (_ BitVec 3)))
        (z/assert `(= ,q bitvec-100))
        (test1 q))
    
    ;; test1 M ((assert (= #(q) bitvec-100))
    ;;          (assert (= #(q) bitvec-100))
    ;;          (declare-const #(q) (_ BitVec 3)))
    
    ;; test2 M ((assert (= #(q) bitvec-100))
    ;;          (assert (= #(q) bitvec-100))
    ;;          (assert (= #(q) bitvec-100))
    ;;          (declare-const #(q) (_ BitVec 3)))

    Of course, even using this patch, we can also write the code above, but there will be no "crash prompt".

    Frankly, I'm not certain that would significantly decline performance.

    I think we can just memo it, but keep the current status.

namin commented 3 years ago

I think those two issues are OK. Instead of memo, we can consolidate declarations when we add them.

chansey97 commented 3 years ago

Instead of memo

The "memo" here, I mean record this issue in TODO LIST temporarily.

we can consolidate declarations when we add them.

I agree. We can let z/declares no duplicate in M.

namin commented 3 years ago

A TODO file is a good idea. You can start a TODO.md file in the repo, with checkboxes like

Thanks.