pjljvandelaar commented 4 years ago

During a meeting with AdaCore, Nexperia and ESI, we were asked by AdaCore to submit this issue to trigger a discussion to determine whether a RFC is justified.

Support for `Properties` in Ada and SPARK

Current Situation

Ada supports test case specification with GNATTest. However, these specifications are tool specific. For example, the following test cases of the Replace_Prefix function are specified for GNATTest and can't be proven by SPARK to hold.

   Test_Case => ("Empty New Prefix and Remainder", Nominal),
   Test_Case => ("All Empty", Nominal),
   Test_Case => ("String not start with Prefix, String length larger than Prefix length", Robustness),
   Test_Case => ("String not start with Prefix, String length equal to Prefix length", Robustness),

Other languages support the specification of examples in documentation which are automatically tested or proven as part of building and releasing the code. See for example doctest.
- For example, Add(3,4) = 7
Other languages support the specification of test cases (not just their names as in the GNATTest example above)
- For example, Add(3,-4) = -1 and Replace_Prefix("bla", "", "new") = "newbla"
Other languages support the specification of properties.
- For example, for all I : Integer => Add(I,0) = I and for all p,n,r : String => Replace_Prefix(p&r, p,n) = n&r
SPARK support GHOST code.
SPARK supports multiple problem solvers. A user can choose to either check the code using one or all problem solvers.

Desired situation

We would like to specify properties of the code in a tool independent way.
We would like to treat simple properties, such as examples and test cases, the same as complicate properties, e.g. with for all statements.
We might want to use GHOST code to keep the specifications of properties concise and readable.
Properties should be automatically verified either as part of compilation or as a seperate phase before releasing.
We would like to specify a hierarchy of tools: provide alternatives in case a tool is unable to verify a property. For example, if SPARK can't proof a particular property, a test case generator is used to verify that property by generating a large number of test cases with random inputs.
We would like to specify the set of hierarchy of tools to verify the properties. Confidence of the verification increases when multiple test case generators with different startegies are used. A similar requirement for SPARK resulted in the possibility to use multiple problem solvers.

yannickmoy commented 3 years ago

@pjljvandelaar Pragma Test_Case allows you to specify properties that should hold on entry to the test (the optional Requires part) and properties that should hold on exit to the test (the optional Ensures part).

Currently, the proof tool called GNATprove which analyzes SPARK code ignores Test_Case pragmas/aspects. But this is really a tool issue more than a language issue. You should contact AdaCore if you want GNATprove to support Test_Case.

Regarding more complex properties, I think we routinely specify and prove such properties with ghost code in SPARK. The way to do that is to state a lemma, which is a procedure with a precondition stating the conditions for the lemma to apply, and a postcondition stating the property you want to prove. Then, you can use GNATprove to prove that lemma, using ghost code for more complex lemmas. See this section of the SPARK User's Guide about that process: http://docs.adacore.com/live/wave/spark2014/html/spark2014_ug/gnatprove_by_example/manual_proof.html#manual-proof-examples

Some of the other requirements you had looked like tool enhancements, which you should report to AdaCore. Can you clarify in which case you would need a language feature instead, given my answers above?

yakobowski commented 3 years ago

@pjljvandelaar do you have an opinion on Yannick's answers?

pjljvandelaar commented 3 years ago

I consider Yannick's answer quite tool related due to statement such as

"the proof tool called GNATprove which analyzes SPARK code ignores"
"I think we routinely specify and prove such properties with ghost code in SPARK."

I would like properties to be really tool independent: to be really a language feature!

I don't want to add Pre and Post conditions to use just SPARK. I want to document my code with Properties, thus at least examples, standard Test cases, Pre and Post conditions, and Invariants.

And as a consequence be able to

proof (parts) with Spark [Note Spark might even be able to prove standard test cases]
generate test cases (like QuickCheck and TorXakis) using the pre and post-conditions and properties.
test the code: both the standard test cases and the generate test cases
test the documentation: check that the examples are correct
generate complete documentation.

If I specify the property x == reverse(reverse(x)), I want that any tool can use this information to help me ensure the quality of the software.

Spark could try to proof it.
A test case generator could generate a large variety of values for x and check that the property indeed holds for all those values.
A documentation generator can generate this a part of the documentation, such that my users understand the functionality better.

Similar if I specify "Pre => f(x)"

Spark could proof that all calls satisfy this pre condition
A tool (such as the compiler) can generate code that asserts this condition before executing the function body
A test case generator can generate inputs that adhere to this condition
A documentation generator can generate this a part of the documentation, such that my users will more likely satisfy the pre condition.

pjljvandelaar commented 3 years ago

To make my request even clearer, I describe my envisioned specification of the function Add:

function Add(X,Y : Integer) return Integer
 with
     Example 'Simple' => Add(3,4) = 7
     Example 'Negative Value' => Add(3,-4) = -1
     Post => Add'Result = X + Y
     Invariant 'Identity Element' => for all I : Integer => Add(I,0) = I
     Invariant 'Inverse' => for all I : Integer => Add(I,-I) = 0
     Invariant 'Commutative'=> for all A,B : Integer => Add(A,B) = Add(B,A)
     Invariant 'Associative'=> for all A,B,C : Integer => Add(Add(A,B),C) = Add(A, Add(B,C))

I leave it up to the tooling to decide what to do with this additional information! Maybe a symbolic code rewriter can change Add(Add(X,Y),-X) to just Y based on this information!

yannickmoy commented 3 years ago

Hi @pjljvandelaar, what you're describing can readily be described by lemmas today:

procedure Add_Identity_Element (I : Integer) 
  with Ghost,
         Post => Add (I, 0) = I and then Add (0, I) = I;

procedure Add_Commutative (A, B : Integer)
  with Ghost,
         Post => Add (A, B) = Add (B, A);

procedure Add_Associative (A, B, C : Integer)
  with Ghost,
         Post => Add (Add (A, B), C) = Add (A, Add (B, C));

This is the extrinsic expression of a property instead of the intrinsic expression used in a postcondition. Here you're suggesting to use an aspect for this extrinsic expression of the property, that does not seem to bring such advantages to me compared to the use of lemmas as above (which reduce the use of quantifiers).

pjljvandelaar commented 3 years ago

Hi @yannickmoy,

First of all, I was not aware of the extrinsic expression of a property for GNATprove. I like it, and I will definitely going use it. However, this solution distributes information about the Add function. For some tools this is irrelevant. E.g. a test case generator will work perfectly. Yet for others, this has undesired consequences: a documentation generator, like GNATdoc, will be unable to provide me and my users with all the properties of the Add function.

clairedross commented 3 years ago

This is the extrinsic expression of a property instead of the intrinsic expression used in a postcondition. Here you're suggesting to use an aspect for this extrinsic expression of the property, that does not seem to bring such advantages to me compared to the use of lemmas as above (which reduce the use of quantifiers).

And works even when the function takes as parameters complex structures on which quantification is not possible.

-- Claire Dross Senior Software Engineer, AdaCore

yannickmoy commented 3 years ago

@pjljvandelaar For documentation, you need in any case to do something special to recover this information in a readable format. Just showing the aspect is not the best result. So here I would recommend using some conventions to help group the information. Exactly like you would do in a development in the Coq (or others) proof assistant, where you state lemmas after some definitions.

pjljvandelaar commented 3 years ago

We all seem to agree that users of Ada would like to specify pre and post-conditions and invariants of a function. We seem to disagree what is the best solution. One alternative is to add these features with minimal changes to the existing language. Another alternative is to add these features that leads to the best language in the long term.

The current situation (i.e. the use of lemmas) does imho not lead to the best language in the long term since

Concepts get mixed: a post condition is not an invariant!
Concepts get scattered: invariants of a function are described at multiple another functions.
The definition of procedures & function is extended with ghost procedure without any body. Every analysis tool, including GNATcheck and home-brew analysis on top of libadalang, has to remove this category from their analysis. If not, these added function will generate warnings and errors like, defined procedure without implementation.

@clairedross about the quantifiers: Comparable alternative is of course

function Add(X,Y : Integer) return Integer
 with
     Example 'Simple' => Add(3,4) = 7
     Example 'Negative Value' => Add(3,-4) = -1
     Post => Add'Result = X + Y
     Invariant 'Identity Element' => VALUE I : Integer BEGIN Add(I,0) = I END
     Invariant 'Inverse' => VALUE I : Integer BEGIN Add(I,-I) = 0 END
     Invariant 'Commutative'=> VALUE A,B : Integer BEGIN Add(A,B) = Add(B,A) END
     Invariant 'Associative'=> VALUE A,B,C : Integer BEGIN Add(Add(A,B),C) = Add(A, Add(B,C)) END

And to make Ada even less verbose: one could even define rules about the introduction of values in Invariants, since the types of these values are of course already specified in the declaration of the procedure/function.

pjljvandelaar commented 3 years ago

Or even:

function Add(X,Y : Integer) return Integer
 with
     Example Simple => Add(3,4) = 7
     Example Negative_Value => Add(3,-4) = -1
     Post => Add'Result = X + Y
     Invariant Identity_Element(I : Integer) => Add(I,0) = I
     Invariant Inverse(I : Integer) => Add(I,-I) = 0
     Invariant Commutative(A,B : Integer) => Add(A,B) = Add(B,A)
     Invariant Associative(A,B,C : Integer) => Add(Add(A,B),C) = Add(A, Add(B,C))

All these alternatives can of course be mechanically transformed in the other representations.

yannickmoy commented 3 years ago

Hi @pjljvandelaar, I'm personally not convinced that this is more readable, rather the contrary. Also, this is not how it is done in languages like Coq or Isabelle, where people are used to describing such groups of theorems and lemmas separately from the definition of the function.

yannickmoy commented 3 years ago

Regarding the absence of body, in fact we do provide a body for such lemmas, which may be an empty body (begin null; end;) if you don't need to prove or don't want to prove the property using the auto-active proof method (where you guide automatic provers with ghost code, see https://blog.adacore.com/research-corner-auto-active-verification-in-spark)

clairedross commented 3 years ago

On Fri, Jan 29, 2021 at 8:55 AM pjljvandelaar notifications@github.com wrote:

Or even:

function Add(X,Y : Integer) return Integer with Example Simple => Add(3,4) = 7 Example Negative_Value => Add(3,-4) = -1 Post => Add'Result = X + Y Invariant Identity_Element(I : Integer) => Add(I,0) = I Invariant Inverse(I : Integer) => Add(I,-I) Invariant Commutative(A,B : Integer) => Add(A,B) = Add(B,A) Invariant Associative(A,B,C : Integer) => Add(Add(A,B),C) = Add(A, Add(B,C))

All these alternatives can of course be mechanically transformed in the other representations.

I am not sure why you call these lemmas 'invariants'. I think the usual meaning of an invariant is a property which is preserved by the subprogram, you might have a look at the stable properties of Ada202X. With respect to readability, it can seem like a good idea on such a small and simple example, but I am not sure it would work for bigger lemmas. Also, a lemma can very well speak about several subprograms, like distributivity for example: Mul (Add (A, B), C) = Add (Mul (A, C), Mul (B, C)). Would you put it as an invariant of Mul, Add, both? -- Claire Dross Senior Software Engineer, AdaCore

pjljvandelaar commented 3 years ago

@clairedross Looking back in this thread, I originally called them properties. So indeed Property or Lemma might be a better name than invariant.

@clairedross Some properties are indeed not uniquely coupled to a single function/procedure. Still I would like that a property is linked to the relevant functions, so decode(encode(x)) == x should link to decode and encode. Yet, abs(x) == max(x,-x) should not link to max!

AdaCore / ada-spark-rfcs

Support for `Properties` in Ada & Spark #45

Support for `Properties` in Ada and SPARK

Current Situation

Desired situation

AdaCore / ada-spark-rfcs

Support for `Properties` in Ada & Spark #45

Support for Properties in Ada and SPARK

Current Situation

Desired situation

Support for `Properties` in Ada and SPARK