Semantics.g4 grammar update

katrinafyi commented 7 months ago

We have plans to update the format of the semantics output used by aslp (and hence gtirb-semantics). The goal is to avoid redundancies and eliminate parsing ambiguities.

The PR to aslp is https://github.com/UQ-PAC/aslp/pull/53. This is not merged yet since I did not want to disturb Basil unexpectedly. Some notable features of the updated format are,

Integer and bit literals are now displayed (respectively) as the number itself and as a single-quoted string.
Identifiers are now double-quoted everywhere.
Redundant parentheses around single expressions have been eliminated as much as possible.
With the new format much more uniform, I have hopes that the format will be easier to pretty-print.
Now, there is also an ANTLR grammar within the aslp repo which we unit-test against. However, for the changes here I found it easier to edit Basil's existing Semantics.g4. I plan to copy some of your changes into Aslp's copy.

Rather unfortunately, cntlm is not processing after these changes (but I'm not sure if it did before). The semantics parse correctly but there is an error within translating.GTIRBToIR.handleIfStatement(GTIRBToIR.scala:421) due to an intrusive list assertion. I believe I have combed over the code and eliminated discrepancies, so it might be some other interaction. If desired, you can test the new format with this cntlm gts file new.gts.tar.gz.

Edi: To inspect the semantics within the .gts, you can use a pipeline of

proto-json.py | jq .modules[0].aux_data.ast.data -r | base64 -d | jq

or something like that.

Edit 2: For more examples of the new format, see https://github.com/UQ-PAC/aslp/blob/27ff52894304fb0f777a60d049c61fc0b328b014/tests/aslt/test_dis.t

katrinafyi commented 7 months ago

@l-kent fyi

l-kent commented 7 months ago

Why have you created a separate, less complete version of the grammar for your own unit-testing? Would it not just make sense to copy the one here and work from that as a starting point?

I'll investigate the cnltm issue. It may be that changes to IntrusiveList in the recent merge broke the ASLp parser unexpectedly for more complex cases.

katrinafyi commented 7 months ago

Is the ASLp copy less complete? If you have an example, I will have to fix this. They should definitely be equivalent in the syntax they parse.

As for why I did not promote the Basil grammar, that is just my own bias. Some of the Basil grammar's constructs are somewhat more difficult to handle in C++. But the grammars should be converged soon.

Thanks for looking into cntlm! That issue is entirely beyond me.

Edit: I could test the Basil grammar as well, if it is more convenient to have different grammars for different codegen languages.

l-kent commented 7 months ago

Which constructs were a problem for C++? It may be possible to solve those issues so the grammars can be unified.

The biggest thing in terms of completeness is the handling of slices - you have completely separated the two different types of slices (iirc there is also a third but it hasn't been encountered in the ASLp output at any point) in the grammar, and made it so that Expr_Slices only contains a single slice - ASL seems to allow for there to be multiple slices in an Expr_Slices though I haven't encountered that and am unsure of the semantics.

l-kent commented 7 months ago

cntlm breaking was not caused by this, it was indeed previous changes to the IntrusiveList assertions.

katrinafyi commented 7 months ago

I see. Unfortunately, these are not examples of incompleteness in the Aslp grammar. Rather, the Basil grammar is too general.

However, I can understand your confusion. The output of Aslp is significantly reduced from the original ASL syntax in ways that might not be obvious. My intention with endorsing a grammar in Aslp is to document these guarantees as precisely as possible, so downstream users (i.e. Basil) may have certainty in the expected format.

There might be things we have overlooked. However, in formalising the expected format, anything that deviates from the grammar in Aslp's repo is a bug in Aslp and should be reported (as you did earlier with floating points). Over time, I hope to make the grammar more precise (in particular with Type_Constructor).

As for constructs in Basil's version, the treatment of lists (exprs, stmts) is inconvenient. The optionality should be pushed down into the list rules (instead of at all usages of the rules) to avoid law of Demeter violations.

Message ID: @.*** com>

l-kent commented 7 months ago

In that case, shouldn't you remove Stmt_Throw, and Type_Register from the grammar completely? When does Type_Constructor(boolean) ever appear? Is it possible to define the grammar further by doing things like providing a specified list of function names that can appear, narrowing down the possibilities for Type_Constructor's field, etc.?

How extensively have you tested the coverage of this?

I am not interested in modifying the BASIL grammar to try to unnecessarily maintain the law of Demeter while parsing these common ASLp list constructs, it is definitely much easier in Scala to parse things this way.

l-kent commented 7 months ago

You may even be able to go as far as describing the expected parameter types for each function, and things like Slice_LoWd's parameters always being integers, Expr_Array always having an Expr_Var as its first parameter and an integer as its second parameter, Expr_Field always having an Expr_Var as its first parameter and a string as its second parameter, and many other cases.

The more precision in the output format, the better.

katrinafyi commented 7 months ago

Many things are possible now that we have this infrastructure in place. "The more precision in the output format, the better." - I'm glad you agree.

We will work on refining the grammar as time allows. As you can imagine, this will take some time and we would like to avoid overcomplicating the codegen with many more rules than necessary.

I, for one, will not be working on this further until (at least) next week. If you want, you are welcome to commit to this branch.

How extensively have you tested the coverage of this?

We test with every instruction in cntlm. You can see the output here (note, 11MB file). This is a good starting point for whether particular constructs actually appear in the output.

I am not interesting in [...]

I think I did not explain the problem clearly. Consider, exprs represents one-or-more expressions and this is turned into zero-or-more by ?. In Scala, this leads to the empty case appearing as a null (!) instead of an empty list. It is much more ergonomic to move the optionality and the brackets into exprs, like

exprs: OPEN_BRACKET (expr (SCOLON expr)*)? CLOSE_BRACKET;

which more accurately represents an expression list (which is always possibly empty). The law of Demeter was the wrong way to describe the problem. As an example, the previous code of

val elseStmts = Option(ctx.elseStmt) match { // note the null check in Option() constructor
      case Some(_) => ctx.elseStmt.stmt().asScala.flatMap(visitStmt(_, label))
      case None => mutable.Buffer()
}

could become

var elseStmts = ctx.elseStmt.stmt.asScala.map(visitStmt(_, label))

I hope you can see what I mean now.

l-kent commented 7 months ago

I understand the issue you had with the grammar now, and that is definitely fixable. It was not at all clear from your previous description or your own grammar that that was the issue.

Testing with every instruction in cntlm is too limited and not particularly useful, since we have already worked through all the instructions in it. If you are trying to make guarantees about the ASLp output format, you will need to test much more extensively to identify unexpected cases and handle them. It doesn't even look like you have the floating point instructions from cntlm in there?

I don't know if it's safe to say that an Expr_Slices containing multiple slices will never occur in the ASLp output since we haven't actually ruled out it appearing, nor have we ruled out that it would make sense to leave it as-is if it does appear - that's very hard to do unless the case is actually is encountered.

katrinafyi commented 7 months ago

It doesn't even look like you have the floating point instructions from cntlm in there?

Thanks for pointing this out. I had pulled the instructions from cntlm-new.adt which was missing these. I believe this is well out of date.

we haven't actually ruled out it appearing

Not yet, but it could be done from within Aslp. I'll add this to the list.

Testing with every instruction in cntlm is too limited and not particularly useful.

I contest that it's not useful, and I do not appreciate your general dismissiveness towards my work. Afaik, cntlm is currently the largest example which Basil targets. For the purposes of making Aslp as perfect as possible it is certainly insufficient, but it is meaningful for the Basil project and useful as a first step. Surely any testing is better than no testing. Floating-point will be added.

l-kent commented 7 months ago

I mean that it's not particularly useful for trying to broadly guarantee anything about the ASLp output (because that is what I was interested in here), since you're just checking the subset of instructions that we already know do not do anything unusual or have handled unusual cases. It would obviously still be useful for detecting regressions and for testing the changes made here.

l-kent commented 7 months ago

I'm not trying to be dismissive, so my apologies, I'm just trying to get a handle on the state of the guarantees that are being provided. I think trying to provide guarantees about the output is excellent and extremely helpful and I'm looking forward to this being further refined.

katrinafyi commented 7 months ago

Thanks for the apology. I can understand the aspiration to make Aslp as robust and reliable as possible - we definitely share this goal. Your suggestions are reasonable and they are all things which we would like to have. We just cannot deliver everything at once, progress is achieved over time :)

l-kent commented 7 months ago

Can you provide a .gtirb file for the cntlm binary you used? I got the cntlm .gts file you provided to go through, after applying an number of temporary fixes to issues that previously existed, including allowing Type_Constructor(integer) to be parsed (though I would like to avoid it as mentioned here), but the resulting .bpl output is 3GB, which is obviously a problem. This would help me to narrow down the issue so I can see if it is related to this patch or something about the binary.

katrinafyi commented 7 months ago

This includes the cntlm binary and original gtirb file: cntlm-2024-03-15.tar.gz. It was built with aarch64-suse-linux-gcc from https://github.com/UQ-PAC/cntlm. I can't tell if anything is out of order, but there are a large number of symbols starting with duk - a Javascript engine of some kind?

l-kent commented 7 months ago

Thanks!

The duk symbols are all the functions in the duktape Javascript engine, which is used within cntlm. They're expected to be there. I don't think we will be able to get very far actually verifying any of those though.

l-kent commented 7 months ago

The issue is definitely a bug related to this binary in particular and not this patch.

katrinafyi commented 7 months ago

Interesting, do you have any guesses to why? Is there a recommended way to compile cntlm for analysis? Though, that is good news for this PR (but bad news for my compiler).

l-kent commented 7 months ago

There appears to be some sort of bug with the memory consolidation code for that binary: #201 but it needs further investigation

l-kent commented 7 months ago

I think it's fine if you merge the ASLp changes, they seem fine, but that should be merged first so I can update the tests here.

katrinafyi commented 7 months ago

Thanks! I will do this first thing next week (Friday afternoons are bad luck).

After the Aslp PR is merged, the Nix packages will have to be rebuilt as well to propagate changes properly. I'll let you know once it's all ready to use.

l-kent commented 7 months ago

Sounds good!

katrinafyi commented 7 months ago

@l-kent This should be ready now. Update with

nix profile upgrade '.*'

and check the installed gtirb-semantics's dependency versions with

nix path-info $(which gtirb-semantics) --json | jq

This second command should include a reference to ocaml4.14.1-asli-unstable-2024-03-18 or later. Good luck!

ailrst commented 4 months ago

updated the gts files: should drop this commit first if there are any other source changes

With the changes here https://github.com/UQ-PAC/gtirb-semantics/pull/12 and the above commits it is possible to re-lift all the gts files in place without recompiling the examples. The gtirb loader seems not to produce syntax/parseing errors now, but there are a lot of other crashes with bool2bv conversion, and control flow (possibly the blr issue). Will have to wait for nix weekly(?) rebuild or build from source and dune install (and redefine GTIRB_SEMANTICS ?= gtirb_semantics in lift-directories.mk) for re-lifting to work.

Test with

cd src/test && make reliftgts -j10 && cd ../..
./mill test.testOnly '*SystemTests*' -- -z GTIRB

l-kent commented 4 months ago

I don't think we want to just work from the existing .gts files in this case, since there are changes to DDisasm too. I'll re-lift all the tests myself, I have the binaries still.

l-kent commented 4 months ago

By the 'control flow' crashes do you mean the IntrusiveList assertion errors? Those should be fixed by #208.

l-kent commented 4 months ago

Can we put moving the indirect call resolution tests into their own folder into a separate pull request? That requires more consideration and really should be checking that the indirect calls are resolved correctly, rather than just that verification succeeds.

l-kent commented 4 months ago

I looked at one of the cvt_bool_bv.0 cases that aren't handled properly now (it looks trivial to fix), and it seems to demonstrate a semantic change in the lifter - is this intentional @katrinafyi?

old:

"subs w8, w8, #0": [
  "Stmt_ConstDecl(Type_Bits(Expr_LitInt(\"32\")),Cse0__5,Expr_TApply(add_bits.0,[(Expr_LitInt(\"32\"))],[(Expr_Slices(Expr_Array(Expr_Var(_R),Expr_LitInt(\"8\")),[(Slice_LoWd(Expr_LitInt(\"0\"),Expr_LitInt(\"32\")))]));(Expr_LitBits(\"11111111111111111111111111111111\"))]))",
  "Stmt_Assign(LExpr_Field(LExpr_Var(PSTATE),V),Expr_TApply(not_bits.0,[(Expr_LitInt(\"1\"))],[(Expr_TApply(cvt_bool_bv.0,[],[(Expr_TApply(eq_bits.0,[(Expr_LitInt(\"33\"))],[(Expr_TApply(SignExtend.0,[(Expr_LitInt(\"32\"));(Expr_LitInt(\"33\"))],[(Expr_TApply(add_bits.0,[(Expr_LitInt(\"32\"))],[(Expr_Var(Cse0__5));(Expr_LitBits(\"00000000000000000000000000000001\"))]));(Expr_LitInt(\"33\"))]));(Expr_TApply(add_bits.0,[(Expr_LitInt(\"33\"))],[(Expr_TApply(add_bits.0,[(Expr_LitInt(\"33\"))],[(Expr_TApply(SignExtend.0,[(Expr_LitInt(\"32\"));(Expr_LitInt(\"33\"))],[(Expr_Slices(Expr_Array(Expr_Var(_R),Expr_LitInt(\"8\")),[(Slice_LoWd(Expr_LitInt(\"0\"),Expr_LitInt(\"32\")))]));(Expr_LitInt(\"33\"))]));(Expr_LitBits(\"111111111111111111111111111111111\"))]));(Expr_LitBits(\"000000000000000000000000000000001\"))]))]))]))]))",
  "Stmt_Assign(LExpr_Field(LExpr_Var(PSTATE),C),Expr_TApply(not_bits.0,[(Expr_LitInt(\"1\"))],[(Expr_TApply(cvt_bool_bv.0,[],[(Expr_TApply(eq_bits.0,[(Expr_LitInt(\"33\"))],[(Expr_TApply(ZeroExtend.0,[(Expr_LitInt(\"32\"));(Expr_LitInt(\"33\"))],[(Expr_TApply(add_bits.0,[(Expr_LitInt(\"32\"))],[(Expr_Var(Cse0__5));(Expr_LitBits(\"00000000000000000000000000000001\"))]));(Expr_LitInt(\"33\"))]));(Expr_TApply(add_bits.0,[(Expr_LitInt(\"33\"))],[(Expr_TApply(add_bits.0,[(Expr_LitInt(\"33\"))],[(Expr_TApply(ZeroExtend.0,[(Expr_LitInt(\"32\"));(Expr_LitInt(\"33\"))],[(Expr_Slices(Expr_Array(Expr_Var(_R),Expr_LitInt(\"8\")),[(Slice_LoWd(Expr_LitInt(\"0\"),Expr_LitInt(\"32\")))]));(Expr_LitInt(\"33\"))]));(Expr_LitBits(\"011111111111111111111111111111111\"))]));(Expr_LitBits(\"000000000000000000000000000000001\"))]))]))]))]))",
  "Stmt_Assign(LExpr_Field(LExpr_Var(PSTATE),Z),Expr_TApply(cvt_bool_bv.0,[],[(Expr_TApply(eq_bits.0,[(Expr_LitInt(\"32\"))],[(Expr_TApply(add_bits.0,[(Expr_LitInt(\"32\"))],[(Expr_Var(Cse0__5));(Expr_LitBits(\"00000000000000000000000000000001\"))]));(Expr_LitBits(\"00000000000000000000000000000000\"))]))]))",
  "Stmt_Assign(LExpr_Field(LExpr_Var(PSTATE),N),Expr_Slices(Expr_TApply(add_bits.0,[(Expr_LitInt(\"32\"))],[(Expr_Var(Cse0__5));(Expr_LitBits(\"00000000000000000000000000000001\"))]),[(Slice_LoWd(Expr_LitInt(\"31\"),Expr_LitInt(\"1\")))]))",
  "Stmt_Assign(LExpr_Array(LExpr_Var(_R),Expr_LitInt(\"8\")),Expr_TApply(ZeroExtend.0,[(Expr_LitInt(\"32\"));(Expr_LitInt(\"64\"))],[(Expr_TApply(add_bits.0,[(Expr_LitInt(\"32\"))],[(Expr_Var(Cse0__5));(Expr_LitBits(\"00000000000000000000000000000001\"))]));(Expr_LitInt(\"64\"))]))"
]

new:

"subs w8, w8, #0": [
  "Stmt_ConstDecl(Type_Bits(32),\"Cse0__5\",Expr_TApply(\"add_bits.0\",[32],[Expr_Slices(Expr_Array(Expr_Var(\"_R\"),8),[Slice_LoWd(0,32)]);'00000000000000000000000000000000']))",
  "Stmt_Assign(LExpr_Field(LExpr_Var(\"PSTATE\"),\"V\"),Expr_TApply(\"not_bits.0\",[1],[Expr_TApply(\"cvt_bool_bv.0\",[],[Expr_TApply(\"eq_bits.0\",[32],[Expr_Var(\"Cse0__5\");Expr_Var(\"Cse0__5\")])])]))",
  "Stmt_Assign(LExpr_Field(LExpr_Var(\"PSTATE\"),\"C\"),Expr_TApply(\"not_bits.0\",[1],[Expr_TApply(\"cvt_bool_bv.0\",[],[Expr_Var(\"FALSE\")])]))",
  "Stmt_Assign(LExpr_Field(LExpr_Var(\"PSTATE\"),\"Z\"),Expr_TApply(\"cvt_bool_bv.0\",[],[Expr_TApply(\"eq_bits.0\",[32],[Expr_Var(\"Cse0__5\");'00000000000000000000000000000000'])]))",
  "Stmt_Assign(LExpr_Field(LExpr_Var(\"PSTATE\"),\"N\"),Expr_Slices(Expr_Var(\"Cse0__5\"),[Slice_LoWd(31,1)]))",
  "Stmt_Assign(LExpr_Array(LExpr_Var(\"_R\"),8),Expr_TApply(\"ZeroExtend.0\",[32;64],[Expr_Var(\"Cse0__5\");64]))"
]

ailrst commented 4 months ago

there are changes to DDisasm too

Yes, they aren't supported in basil yet either so I didn't bother, but its a good idea to do the rebuild and lift once.

Can we put moving the indirect call resolution tests into their own folder into a separate pull request? That requires more consideration and really should be checking that the indirect calls are resolved correctly, rather than just that verification succeeds.

yes we at least need another pr that fully separates out the examples, I'll fix that commit after you update the gts files since it will require a rebase.

ailrst commented 4 months ago

I looked at one of the cvt_bool_bv.0 cases that aren't handled properly now (it looks trivial to fix), and it seems to demonstrate a semantic change in the lifter - is this intentional

Kait is on leave currently, there have been semantic changes to the lifter in the past three months, but I don't immediately know what caused these specific changes. This lifter is still correct wrt its differential testing against the interpreter, so they should be semantically equivalent.

l-kent commented 4 months ago

What I was most concerned about the information flow loss for the C flag, but this doesn't occur in the files I've just lifted myself. This suggests that there was a problem somewhere in the process of re-lifting the .gts file from an older .gts file.

l-kent commented 4 months ago

The updated gts files are in the grammar-update2 branch. I'm going to fix some issues relating to the fix for the blr instruction in DDisasm in that branch.

katrinafyi commented 4 months ago

is this intentional @ katrinafyi? subs w8, w8, #0 [...]

I haven't been involved in recent internal changes which might've led to this, but I don't believe it's incorrect (as Alistair says, this is also validated by differential testing). Recently, we have been improving the simplification rules from a new direction, and I think this might be a consequence of that.

The instruction is w8 = w8 - 0 which we know will never ~carry or~ overflow (note the V computation is simplifiable to False as well).

~I am concerned about you obtaining different results at different times, however...~ [my misunderstanding]

Edit: It seems that C currently simplifies to constant True. With my copy of asli-unstable-2024-05-08, I see something in between your new and old semantics. Looking at the pseudocode, ~I'm not convinced there's not a bug in aslp here. One would think both C and V are constant False when subtracting zero?~ [see edit 2]

$ echo ':ast A64 0x71000108' |  aslp
Stmt_ConstDecl(Type_Bits(32),"Cse0__5",Expr_TApply("add_bits.0",[32],[Expr_Slices(Expr_Array(Expr_Var("_R"),8),[Slice_LoWd(0,32)]);'00000000000000000000000000000000']))
Stmt_Assign(LExpr_Field(LExpr_Var("PSTATE"),"V"),Expr_TApply("not_bits.0",[1],[Expr_TApply("cvt_bool_bv.0",[],[Expr_TApply("eq_bits.0",[32],[Expr_Var("Cse0__5");Expr_Var("Cse0__5")])])]))
Stmt_Assign(LExpr_Field(LExpr_Var("PSTATE"),"C"),Expr_TApply("not_bits.0",[1],[Expr_TApply("cvt_bool_bv.0",[],[Expr_TApply("eq_bits.0",[33],[Expr_TApply("ZeroExtend.0",[32;33],[Expr_Var("Cse0__5");33]);Expr_TApply("add_bits.0",[33],[Expr_TApply("ZeroExtend.0",[32;33],[Expr_Slices(Expr_Array(Expr_Var("_R"),8),[Slice_LoWd(0,32)]);33]);'100000000000000000000000000000000'])])])]))
Stmt_Assign(LExpr_Field(LExpr_Var("PSTATE"),"Z"),Expr_TApply("cvt_bool_bv.0",[],[Expr_TApply("eq_bits.0",[32],[Expr_Var("Cse0__5");'00000000000000000000000000000000'])]))
Stmt_Assign(LExpr_Field(LExpr_Var("PSTATE"),"N"),Expr_Slices(Expr_Var("Cse0__5"),[Slice_LoWd(31,1)]))
Stmt_Assign(LExpr_Array(LExpr_Var("_R"),8),Expr_TApply("ZeroExtend.0",[32;64],[Expr_Var("Cse0__5");64]))

Edit 2: With the understanding that subs calls AddWithCarry with y = Ones(64) and carry_in = 1, the result of V = False and C = True now makes sense. This is only from following the architecture spec, I have no intuition for this.

l-kent commented 4 months ago

I don't see an obvious cause for the different lifting results other than @ailrst using there newly added .gts to .gts re-lifting process, it doesn't seem like there are other recent changes to the gtirb-semantics repo or aslp repo that wouldn't be in nix yet?

What you've gotten is the same result I did when lifting (I have the same build of aslp, which is the latest available on nix) - the only difference from what @ailrst's lifting produced is just the C flag simplification.

ailrst commented 4 months ago

I was using a local build which is on another branch of aslp, which should only have changes to the offline lifter but there may be other differences, I will check again (and with prev. version of gtirb_semantics) when I get time. But it seems obvious your and kaits version is the more correct so go a head and upload the relifted files.

l-kent commented 4 months ago

I'm done with my changes in the grammar-update2 branch so there aren't any errors in the parsing & translation now (except those that will be resolved by #208 - I've tested this). @ailrst You can force push that branch to this, but that would remove your changes so I haven't done that yet.

ailrst commented 4 months ago

Will add that change and force push

l-kent commented 4 months ago

This should be good to merge now? @ailrst

ailrst commented 4 months ago

Yes tests are now only failing due to boogie verification failures

UQ-PAC / BASIL

Semantics.g4 grammar update #197