Maintaining grounding in large terms during proof development is difficult

samcowger commented 1 year ago

Here's an example workflow to illustrate this issue. Suppose I have some C and wish to verify its equivalence to a Cryptol specification. After compiling it to LLVM and lifting that LLVM into the SAW/Cryptol space, and while exploring the goal at the REPL, I realize my proof requires a precondition. I include the precondition in the proof script, reload SAW, and return to my breakpoint. Now that the goal term is different - in particular, whatever memoization variables were previously assigned to terms have almost certainly changed - I have to spend time regaining the context I had built up in the prior version of this term, consulting the memoization table as needed. As the term grows in size, so does the memo table, which means more scrolling and more context to gain and maintain.

One possible improvement to a user's experience here would be to assign memoization bindings based on term structure, so that e.g. Vec 8 Bool, when memoized, would always be bound to the same identifier. An identifier based on the shape of a term sounds a lot like hashing to me, so I think one way to effect this would to compute Term hashes deterministically based on their shape/structure. This differs from the current Term hashing behavior, which depends on the order in which all Terms were constructed in a particular invocation of SAW (stAppIndex represents that ordering, as I understand it). Term-printing could then leverage this hash when generating memoization bindings.

I've opened #1830 to shift Term hashing to be structural in this way, by changing the Hashable instance of Term.

Any thoughts/comments/concerns are appreciated!

RyanGlScott commented 1 year ago

Yes, I definitely share your pain about the memoized output being difficult to reinterpret after changes. There is an option to configure how much memoization is used (see #1330), but that doesn't ultimately address the underlying issue.

One possible thing to watch out for: I am not sure if the Hashable instance for Terms would include anything to the effect of generated unique numbers (e.g., for local variables). If so, this could cause the hash of a term to change if the order in which the uniques are generated changes. A quick search through the SAW codebase doesn't turn up anything like this, but similar issues have caused issues in GHC in the past.

samcowger commented 1 year ago

I'm unfamiliar with how/where SAW constructs local variables, except for its use of (non-unique) DeBruijn indices for bound local variables. Do you have a particular stanza of uniqueness-generation code in mind, or are you saying a codebase search turned up no such stanza? Also, iff you have a reference to one easily at hand, could you point me to an instance of this causing problems in GHC?

RyanGlScott commented 1 year ago

Do you have a particular stanza of uniqueness-generation code in mind, or are you saying a codebase search turned up no such stanza?

Mostly, I am worried about code that looks like this:

uniqCounter :: IO (IORef Int)
uniqCounter = newIORef 0

newUnique :: IO Int
newUnique = do
  modifyIORef (+1) uniqCounter
  readIORef uniqCounter

There is one occurrence of this pattern here, but as far as I can tell, this is only used to power stAppIndex. Your PR in #1830 explicitly avoids using stAppIndex in the Hashable instance for Term, which avoids that possible issue.

Also, iff you have a reference to one easily at hand, could you point me to an instance of this causing problems in GHC?

There is a GHC mega-issue that tracks sources of nondeterminism at the interface file level here. Almost all of these were caused by GHC's unique numbers leaking into the interface file format in some way, and lots of engineering had to be put into place to make the output of interface files insensitive to the order in which unique numbers were generated.

msaaltink commented 1 year ago

A related problem was discussed briefly some time ago: https://github.com/GaloisInc/saw-script/issues/1200#issuecomment-831434798 but I don't know if @robdockins made any headway on it. The idea was to have whatever memoization variables are introduced persist for the duration of a proof, so that as you take steps you retain the context as described above.

samcowger commented 1 year ago

Mostly, I am worried about code that looks like this

I see. Is the pattern used to initialize scFreshGlobalVar in Verifier.SAW.SharedTerm.mkSharedContext similar enough to potentially cause this issue, do you think? I have a small proof-of-concept that implements the workflow I described above, in which Term hashes do persist appropriately, but perhaps I ought to make sure I'm triggering local variable creation somehow.

samcowger commented 1 year ago

A related problem was discussed briefly some time ago: #1200 (comment) but I don't know if @robdockins made any headway on it. The idea was to have whatever memoization variables are introduced persist for the duration of a proof, so that as you take steps you retain the context as described above.

Ah, thanks for the reference, I hadn't seen that issue! To implement that sort of "intra-proof" memoization variable determinism, I believe it would suffice to make memoization variables be/involve stAppIndex. Maybe that's a change we should also consider, though ideally implementing "inter-proof" memoization variable determinism should give us the former for free.

RyanGlScott commented 1 year ago

Is the pattern used to initialize scFreshGlobalVar in Verifier.SAW.SharedTerm.mkSharedContext similar enough to potentially cause this issue, do you think?

I think it does have the potential to cause problems, at the very least. The real question is if the value of these scFreshGlobalVars can be observed by the user in some way. If we design things well, then this shouldn't be the case, but I haven't looked deeply enough at scFreshGlobalVars use sites to say for sure.

To help diagnose these sorts of issues better, GHC introduced a -dunique-increment=<INTEGER> flag that lets you control how much is added to the global unique counter each time it is changed. For instance, you can run a program twice, once normally and once with the unique numbers generated in reverse order (using -dunique-increment=-1), and if the behavior is the same in both cases, then you have increased assurance that your unique numbers aren't leaking out.

eddywestbrook commented 1 year ago

Sam's PR #1830 addresses some of this, by making the hash function for the Hashable instance only depend on the structure of the term, not on any of its stAppIndexs. This should generate a stable code for a Term that does not change between different runs of SAW. Just to be clear, "different runs of SAW" for me = you run a SAW proof script, it gets stuck at some point, and you go back and change something and run it again.

Unfortunately, the one place this will not work is for fresh ExtCnss, i.e., those returned by scFreshEC, since the whole point of scFreshEC is to build fresh un-named atoms whose only differentiating factor is that they have different stAppIndexs. These are generated by automated FM procedures. For instance, they are often used to instantiate local (i.e., deBruijn) variables in a universal formula before passing that formula to an SMT solver. When you think about it, though, it doesn't really make sense to have a stable way to refer to something like "the nth fresh auto-generated constant" if you are changing the automation that runs and thus what constants it generates.

GaloisInc / saw-script

Maintaining grounding in large terms during proof development is difficult #1831