github / semantic

Parsing, analyzing, and comparing source code across many languages

Explore how to do FileCheck-style queries over Core terms #245

Open patrickt opened 5 years ago

patrickt commented 5 years ago

For the semantic-python test suite, we’ve instituted a FileCheck-esque syntax, allowing us to put jq(1) statements that let us make assertions about the shape of some translated Python code and its heap/scope graphs:

# CHECK-JQ: .tree.contents[0][1].contents[1] | .tag == "Lam" and .contents.value.tag == "If"
# CHECK-JQ: .tree.contents[0][1].contents[1].contents.value | .contents == [[], [], { "tag": "Unit" }]
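As a rough illustration of how such directives could be collected from a test file, here is a minimal Python sketch. Only the `# CHECK-JQ:` prefix comes from the examples above; the helper name `extract_jq_checks` and the regular expression are assumptions made for illustration, not the test suite's actual implementation.

```python
import re
from typing import List

# Assumption: a directive is a comment line of the form "# CHECK-JQ: <jq filter>".
CHECK_JQ = re.compile(r"^\s*#\s*CHECK-JQ:\s*(.+?)\s*$")


def extract_jq_checks(source: str) -> List[str]:
    """Return the jq filters found in CHECK-JQ comment lines, in file order."""
    checks = []
    for line in source.splitlines():
        match = CHECK_JQ.match(line)
        if match:
            checks.append(match.group(1))
    return checks


# A tiny stand-in for a Python fixture carrying one directive.
example = '''\
# CHECK-JQ: .tree.contents[0][1].contents[1] | .tag == "Lam"
def f():
    pass
'''

checks = extract_jq_checks(example)
```

Each extracted filter could then be handed to jq(1) along with the serialized tree, for instance using `jq -e` so that the exit status reflects whether the last output was truthy.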

Unfortunately, as should be apparent, the serialization format for Core trees is not very user-friendly: it is derived via the Generic instance, and what is natural in Haskell is not necessarily natural in JSON. For example, if we encounter a number of :>>-sequenced statements, I would expect those to be serialized into some sort of array rather than the right-leaning binary tree they currently produce. @robrix mentioned that there’s stuff we can do to make this a little more copacetic; I took a run at it, but fell into a swamp full of type errors.
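To make the array-versus-tree contrast concrete, here is a small Python sketch that flattens nested sequencing nodes into a list. The `tag`/`contents` layout mirrors the CHECK-JQ examples above, but the exact encoding of `:>>` (a `":>>"` tag whose contents are a two-element array) and the `Var` leaf nodes are assumptions for illustration.

```python
def var(name):
    # Hypothetical leaf node standing in for an arbitrary Core term.
    return {"tag": "Var", "contents": name}


def seq(a, b):
    # Assumed JSON shape for `a :>> b`: a tagged object with two children.
    return {"tag": ":>>", "contents": [a, b]}


def flatten_seq(node, seq_tag=":>>"):
    """Flatten nested sequencing nodes into a list of statements, left to right."""
    if isinstance(node, dict) and node.get("tag") == seq_tag:
        left, right = node["contents"]
        return flatten_seq(left, seq_tag) + flatten_seq(right, seq_tag)
    return [node]


# a :>> (b :>> c), the right-leaning shape described above.
nested = seq(var("a"), seq(var("b"), var("c")))
flat = flatten_seq(nested)
```

With the flat form, a check could index statements directly (e.g. the second statement) instead of walking `.contents[1].contents[0]` chains.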

dcreager commented 5 years ago

I love this idea! I think a good place to start would be a document describing (and committing to) what the JSON rendering would look like. I'd even go further and say that document should be the canonical definition of the Core language, and would (eventually) include the user-facing descriptions of what each Core term means and how it can be used.

patrickt commented 4 years ago

The problem with tying a JSON representation to Core itself is that the convenient JSON representations of Core are lossy. For example, the easy thing to do for a function body is represent it as a list of statements, but doing so loses the scope information associated with the :>>= operator. I can see situations where we’d need both kinds of information. I’m not sure yet how to solve this; perhaps once the builds are extinguished @robrix and I can take some runs at improving it.
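A small Python sketch of that lossiness, under assumed JSON shapes: the `":>>="` node layout (a named value plus a body) and the helper names are hypothetical, chosen only to show two trees with different binding scopes collapsing to the same statement list.

```python
def var(name):
    # Hypothetical leaf node standing in for an arbitrary Core term.
    return {"tag": "Var", "contents": name}


def seq(a, b):
    # Assumed shape for `a :>> b`.
    return {"tag": ":>>", "contents": [a, b]}


def bind(name, value, body):
    # Assumed shape for a `:>>=` node: `name` is bound to `value` within `body`.
    return {"tag": ":>>=", "contents": [{"name": name, "value": value}, body]}


def flatten_statements(node):
    """Flatten sequencing and binding nodes into a plain statement list."""
    if isinstance(node, dict) and node.get("tag") == ":>>":
        left, right = node["contents"]
        return flatten_statements(left) + flatten_statements(right)
    if isinstance(node, dict) and node.get("tag") == ":>>=":
        named, body = node["contents"]
        # The bound name and the extent of its scope are dropped here.
        return flatten_statements(named["value"]) + flatten_statements(body)
    return [node]


# x is in scope for both b and c.
wide = bind("x", var("a"), seq(var("b"), var("c")))
# x is in scope for b only.
narrow = seq(bind("x", var("a"), var("b")), var("c"))

same_flat_form = flatten_statements(wide) == flatten_statements(narrow)
```

The two trees differ, yet both flatten to the statements a, b, c, so a list-only rendering cannot tell where the binding of x ends.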

robrix commented 4 years ago

Agreed, the ideal representations for testing & interchange may be in tension. I’m honestly not sure.