GaloisInc / saw-script

The SAW scripting language.
BSD 3-Clause "New" or "Revised" License
437 stars 63 forks source link

Python bindings: Make Cryptol quasiquotation easier #1188

Open RyanGlScott opened 3 years ago

RyanGlScott commented 3 years ago

In SAWScript, you can embed Cryptol expressions inside of other Cryptol expressions automatically, e.g.,

let a = {{ g x }};
let b = {{{ foo a }};

However, the equivalent of this in Python is not as ergonomic:

a = "g x";
b = f"foo ({a})";

Ideally, both a and b would be Cryptol expressions, and moreover, you wouldn't need to explicitly add parentheses around {a}.

@pnwamk has this to say about the idea:

At present, I think we likely need a slightly more sophisticated mechanism for quasiquotation-style construction of cryptol terms.

E.g., after the pattern of this SO post, perhaps we have a function cry which takes a string literal with f-string style holes for splicing in content (i.e., 'blah blah {CONTENT}blah blah'), and that cry function automatically formats the nested code chunks as necessary, adds parens around nested terms, etc.

I.e., perhaps we write:

a = cry('g x')
b = cry('foo {a}')

And behind the scenes the a in the curly braces gets some helper function to_nested_cryptol_term or something applied to it and we end up with "foo (g x)" automatically.

This issue aims to track this idea.

m-yac commented 3 years ago

I wonder if we could get away with something simpler than the linked solution – I'm somewhat put off by the fact that it uses eval. The original SO post that the linked one references seems to agree that it's a bit sketchy.

If we're okay with only allowing expressions like ident, ident[0], ident.attr inside brackets (the same rules as str.format()) then we could do:

from string import Formatter
import sys

class CryptolFormatter(Formatter):
    def format_field(self, value, spec):
        return f"({super().format_field(value, spec)})"

def cry(str):
    previous_frame = sys._getframe(1)
    locals_and_globals = {**previous_frame.f_locals, **globals()}
    return CryptolFormatter().format(str, **locals_and_globals)

a = cry('g x')
b = cry('foo {a}')
print(repr(a)) # prints 'g x'
print(repr(b)) # prints 'foo (g x)'

And of course, we could have CryptolFormatter be a bit smarter.

pnwamk commented 3 years ago

We should probably come up with a few potential solutions, document their pros and cons, and then decide.

W.r.t. the two quasiquoting solutions so far, I personally prefer the one that allows arbitrary expressions in the escape holes. Unless I'm missing something, the use of eval in the more complex approach seems rather benign since its sole intent for us would essentially be to provide a custom meaning for python string literals that appear as a part of the source of the program.

Another approach might be to just embrace Python strings themselves as our canonical quasiquotation technique for Cryptol syntax. At that point, we could use plain old f-strings to combine expressions. As long as every Cryptol expression and value has a proper __str__ method that can reliably turn them into the appropriate syntax, you could then just place those kinds of values directly in the f-string holes and it would Just Work(TM).

m-yac commented 2 years ago

I was able to figure out a solution using reflection in GaloisInc/cryptol#1307 that replicates f-string syntax exactly.

I never actually posted about this here, but I found that the solution linked in the initial post suffers from an issue with nested brackets, e.g. when quoting a dict: '... { {"x":1} } ...'. This is because the str.format parser (which is what the linked solution uses under the hood) is really only meant to be given an ident, ident[i], or ident.attr. The reason it works in most cases is because it considers, for example x + y, as a valid ident – I'm guessing because it doesn't actually check any characters besides {}[].. Since it does check brackets though, the string I gave above doesn't work.

RyanGlScott commented 2 years ago

Now that #1550 has landed, the SAW Python client offers a cry_f offers a much more natural quasiquotation mechanism that doesn't have nearly as many quirks of cry. In light of this, I wonder if there is anything left to do on this issue or if we should just close this.