Checking the behavior of translated code

aefree2 commented 3 months ago

I am trying to use Coq-of-Python to translate a very basic python program into coq.

from typing import List
def has_close_elements(numbers: List[float], threshold: float) -> bool:
  """Check if in given list of numbers, are any two numbers closer to each other than given threshold"""
  for idx, elem in enumerate(numbers):
    for idx2, elem2 in enumerate(numbers):
      if idx != idx2:
        distance = abs(elem - elem2)
        if distance < threshold:
          return True
        return False

However, I am having trouble verifying if the resulting coq code is a faithful translation of the original Python. I am struggling to get unit tests to check the behavior of the new Coq code working. assert candidate([1.0, 2.0, 3.9, 4.0, 5.0, 2.2], 0.3) == True

Coq-of-Python wants to translate assert statements into:

Definition check : Value.t -> Value.t -> M :=
 fun (args kwargs : Value.t) =>
  let- locals_stack := M.create_locals locals_stack args kwargs [ "has_close_elements" ] in
  ltac:(M.monadic (
  let _ := M.assert (| Compare.eq (|
  M.call (|
   M.get_name (| globals, locals_stack, "has_close_elements" |),
   make_list [
    make_list [
     Constant.float "1.0";
     Constant.float "2.0";
     Constant.float "3.9";
     Constant.float "4.0";
     Constant.float "5.0";
     Constant.float "2.2"
    ];
    Constant.float "0.3"
   ],
   make_dict []
  |),
  Constant.bool true
 |) |)

However, this compiles regardless of if the Constant.bool that it is being compared against is true or false. I am wondering how one is supposed to verify that the resulting coq program has the correct behavior?

clarus commented 3 months ago

Hello, so first of all, the coq-of-python project is still a work in progress. For your example, this is expected that the assert example compiles as I understand. Maybe you want to instead run this example to make it fail?

For now, the semantics is optimized to make proofs but there are no ways to execute the code on a precise parameter. This is one of our goals at mid-term (a few months) to be able to compare our Python semantics with the behavior of the Python interpreter.

aefree2 commented 3 months ago

When you say that the code is optimized to make proofs, do you mean that the code is not yet ready to have properties proven about it, or do you mean that the tests instead have to be in the Example-Proof unit test format?

Example of what I mean when I say example-proof unit test format:

Example t1 :
    Compare.eq (| (has_close_elements (make_list [make_list [
          Constant.float "1.0";
          Constant.float "2.0";
          Constant.float "3.9";
          Constant.float "4.0";
          Constant.float "5.0";
          Constant.float "2.2"
        ];
        Constant.float "0.3"
      ])),
    (Constant.bool true) |).
  Proof. reflexivity. Qed.

When I try to run this particular test, I get the error:

Error:
The term
 "has_close_elements
    (make_list
       [make_list
          [Constant.float "1.0"; Constant.float "2.0"; 
           Constant.float "3.9"; Constant.float "4.0"; 
           Constant.float "5.0"; Constant.float "2.2"]; 
        Constant.float "0.3"])" has type "Value.t -> M"
while it is expected to have type "Value.t".

Are you saying that there isn’t currently a solution to make this code run?

clarus commented 3 months ago

When you say that the code is optimized to make proofs, do you mean that the code is not yet ready to have properties proven about it, or do you mean that the tests instead have to be in the Example-Proof unit test format?

There are no evaluation functions, but there is a (beginning) of definition for the semantics with the Run.t predicate that is defined in https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/simulations/proofs/CoqOfPython.v So the reflexivity tactic will not evaluate an expression, but it is still possible to evaluate terms by hand by going into interactive mode and using the primitives of the Run.t predicate.

Are you saying that there isn’t currently a solution to make this code run?

Yes, currently, we do not have a solution to run this example with only reflexivity. We are open to further discussion or collaboration.

clarus commented 3 months ago

Note that the evaluate function in https://github.com/formal-land/coq-of-python/blob/main/CoqOfPython/simulations/proofs/CoqOfPython.v evaluates a trace built with Run.t. This is presented in https://formal.land/blog/2024/05/22/translation-of-python-code-simulations-from-trace but there is no more documentation.

formal-land / coq-of-python

Checking the behavior of translated code #32