EasyCrypt / easycrypt

EasyCrypt: Computer-Aided Cryptographic Proofs
MIT License
319 stars 49 forks source link

glob order oddity involving multiple files #614

Open alleystoughton opened 1 month ago

alleystoughton commented 1 month ago

I'm having trouble understanding the new glob order in a case involving multiple files.

The first file, GlobOddity1, is

require import AllCore.

module type T = {
  proc f() : int
}.

module Make(X : T, Y : T) : T = {
  var y : bool
  var x : int
  proc f() : int = {
    var i, j : int;
    i <@ X.f();
    j <@ Y.f();
    return i + j + x;
  }
}.

And the second is

require import AllCore.
require GlobOddity1.

module My = {
  var x : int
  var y : bool
  proc f() : int = {
    if (y) { x <- 1; } else { x <- 2; }
    return x;
  }
}.

module Yours = {
  var x : int
  var y : bool
  proc f() : int = {
    if (y) { x <- 1; } else { x <- 2; }
    return x;
  }
}.

print glob GlobOddity1.Make(My, Yours).
(*
Globals [# = 0]:

Prog. variables [# = 6]:
  My.x : int
  My.y : bool
  Yours.x : int
  Yours.y : bool
  GlobOddity1.Make.x : int
  GlobOddity1.Make.y : bool

note out of order, although not even sure if the `GlobOddity1`
is supposed to be part of the name for sorting purposes
*)

import GlobOddity1.

print glob Make(My, Yours).
(*
Globals [# = 0]:

Prog. variables [# = 6]:
  My.x : int
  My.y : bool
  Yours.x : int
  Yours.y : bool
  Make.x : int
  Make.y : bool

note out of order
*)

If instead I make the entire example be in a single file, I get

require import AllCore.

module type T = {
  proc f() : int
}.

module Make(X : T, Y : T) : T = {
  var y : bool
  var x : int
  proc f() : int = {
    var i, j : int;
    i <@ X.f();
    j <@ Y.f();
    return i + j + x;
  }
}.

module My = {
  var x : int
  var y : bool
  proc f() : int = {
    if (y) { x <- 1; } else { x <- 2; }
    return x;
  }
}.

module Yours = {
  var x : int
  var y : bool
  proc f() : int = {
    if (y) { x <- 1; } else { x <- 2; }
    return x;
  }
}.

print glob Make(My, Yours).
(*
Globals [# = 0]:

Prog. variables [# = 6]:
  Make.x : int
  Make.y : bool
  My.x : int
  My.y : bool
  Yours.x : int
  Yours.y : bool

note that in lexicographic order, as expected
*)

Is this buggy behavior, or if not, what is the model for why the two file version produces apparently out of order results?

fdupress commented 1 month ago

It looks like the order on path either first separates by length (so a longer long name will always be ordered after a shorter one regardless of the relative ordering of their first elements), or always orders Self (or Top, or "empty") first.

Independently of this explanation is the question of whether this is desired or not. I don't think there is any way for us to do anything that both:

  1. groups variables by (sub)module;
  2. is stable w.r.t. import (and other changes in displayed long names).

Here, 1 is clearly desirable, and 2 slightly less clearly so. But the fact that the order doesn't change just because you import GlobOddity1 is certainly behaviour: I don't want to have to revisit a bunch of indexing just because I decided to bring some symbols into my local namespace.)

So, likely, the best we can do is 1) pick something that makes sense, and 2) document it.

Now the question 1 of whether the current ordering on paths makes sense is good. I think it does if you think "local first". We could debate this, but we could also explain it.

And beyond simply documenting that explanation, one way to perhaps improve this in the tool itself is to always print the name we used in deciding the order, or perhaps print out comments clarifying long names that were used in ordering, but shortened by imports. (This might require allowing comments in the actual AST...)

Globals [# = 0]:

Prog. variables [# = 6]:
  My.x : int
  My.y : bool
  Yours.x : int
  Yours.y : bool
  (* GlobOddity1.Make *)
  Make.x : int
  Make.y : bool
alleystoughton commented 1 month ago

Thanks, François. We're in the position of trying to machine-generate EC code that uses the glob order. All I really care about is being able to predict it. (So for us, the output of print glob isn't much of a help, except in understanding the underlying model.)

strub commented 1 month ago

For the order used, see https://github.com/EasyCrypt/easycrypt/blob/186032b36827273bf02c69b9442612ce7b915712/src/ecPath.ml#L341

strub commented 1 month ago

IMO, the solution is #615

alleystoughton commented 1 month ago

I'm going to reopen this, because the current solution results in a typing anomaly. The reason is that the ordering used for paths treats longer paths as greater, but this isn't stable when multiple files are involved. More deeply, because the type path is left-recursive, it encourages this ordering.

The following example has three files, GlobOddity1.ec, GlobOddity2.ec and GlobOddity3.ec.

Here is GlobOddity1:

module type T = {
  proc f() : unit
}.

module M(X : T) : T = {
  var x : int
  proc f() : unit = {
    X.f();
  }
}.

This checks fine.

Here is GlobOddity2.ec:

require import GlobOddity1.

module N = {
  var x : bool
  proc f() : unit = {
    x <- true;
  }
}.

print glob M(N).
(*
Globals [# = 0]:

Prog. variables [# = 2]:
  N.x : bool  -- really Top.N.x
  M.x : int   -- really Top.GlobOddity1.M.x

the second is longer, so greater, even though M comes before N
*)

(* so this typechecks: *)

op f (g : glob M(N)) : bool = g.`1.

And here is GlobOddity3.ec:

require GlobOddity1 GlobOddity2.

This prints out

Globals [# = 0]:

Prog. variables [# = 2]:
  M.x : int
  N.x : bool

(note that the order has changed!) and then gives the error

In external theory GlobOddity2 [./globOddity2.ec: line 25 (30-34)]:

This expression has type
   int

but is expected to have type
   bool

(line 25 is the definition of f, which no longer typechecks because the order of the glob has changed).

The order changes because now M.x is really Top.GlobOddity1.M.x and N.x is really Top.GlobOddity2.N.x; the path lengths are the same, and so now M comes before N.

You may want to close this saying the solution is records, but I wanted to point out the problem with the current implementation.

strub commented 1 month ago

There is indeed again an instability. The problem is that in Emacs, the name of the current theory is not the same (because we don't have the current filename). Maybe the order of declaration could be used, but this is a more intrusive fix.