UQ-PAC / BASIL

Apache License 2.0
8 stars 0 forks source link

Uninterpreted functions #209

Closed l-kent closed 3 months ago

l-kent commented 4 months ago

Adds some basic support for ASLp's internal floating point functions by modelling them as uninterpreted functions. This probably has limited utility for now.

ailrst commented 3 months ago

We should add a minimal test for this to make sure nothing breaks https://github.com/UQ-PAC/bil-to-boogie-translator/issues/211

l-kent commented 3 months ago

Added a test but #197 should be merged first

ailrst commented 3 months ago

What is the reason translating expr TApply requires explicitly matching each function, is there a change we can make to the ASLT to be able to translate an arbitrary function call? Is it just the type signature?

l-kent commented 3 months ago

We need to know the types of the parameters and return value. I do not think trying to translate arbitrary calls is a useful or worthwhile approach, as we are never actually dealing with arbitrary calls.

l-kent commented 3 months ago

If you wanted to have some sort of named no-op that BASIL then handles as an uninterpreted function, that should probably be a completely different thing in the ASLT.

ailrst commented 3 months ago

For behaviour we don't yet, or wont, support, havoc would be the semantics that makes sense rather than nop. Since this is the standard over-approximation of an unkonwn procedure call we typically model that in ASL as a procedure call as well. I'm referring to exception stuff and barriers, which have some side-effect on the machine state that we don't model. Having basil blindly translate it would at least leave open the possibility of shoving in a semantics in at the boogie level later.

l-kent commented 3 months ago

Boogie functions can't have side effects by definition, so blindly translating those ASLp calls as uninterpreted functions wouldn't make any sense anyway and would still just be shifting the point of translation to later on in BASIL.

It's also not really possible to translate an arbitrary ASLp method call - some have input types (such as floating point values) that can't be directly translated.

I had a look at the list of functions that aren't inlined by ASLp again and found a few more that seem possible to handle. Are there any others that aren't in the following list that we would ever expect to encounter?

https://github.com/UQ-PAC/aslp/blob/abd63cb88c81f34822dd3d0779517430d0aa9ba8/libASL/dis.ml#L71-L108

ailrst commented 3 months ago

It's also not really possible to translate an arbitrary ASLp method call - some have input types (such as floating point values) that can't be directly translated.

We can still attempt to handle the general case, and regardless those types we can't represent shouldn't appear in the residual programs to my knowledge.

As far as I know only bitvectors are dependently typed in ASL and the syntax should indicate which arguments are type arguments vs value arguments? For the FP type signatures, it should be syntactically obvious if a literal is an integer or a bitvector. Also how do we know those are literals and not other expressions? https://github.com/UQ-PAC/bil-to-boogie-translator/pull/209/files#diff-df00a0d1b76fb8fde2e5d15596229573912722bd9408698877052da849268e75R372

Regardless, that list is all we really need to handle so it shouldn't matter. I suspect some of those we can't handle will appear as procedure call statements rather than apply expressions.

I would just rather over-approximate than crash when we get input that doesn't work.

l-kent commented 3 months ago

those types we can't represent shouldn't appear in the residual programs to my knowledge.

FPRoundBase and BFRound, which are in that list in dis.ml, are both currently not handled due to taking 'real'-type input parameters. I don't know if they're possible for any instruction to produce because that's something that's quite difficult to check.

For the FP type signatures, it should be syntactically obvious if a literal is an integer or a bitvector.

This is not the main issue - we need to know the type of the return value. Needing to single out integer parameters is just due to a quirk of the parsing and can be cleaned up.

Also how do we know those are literals and not other expressions?

It seems like it should always be a static value. It is of integer type so if there were somehow an integer expression as the parameter that would cause further problems.

I suspect some of those we can't handle will appear as procedure call statements rather than apply expressions.

Do you have an example of any of them? None of the ones I've added support for should appear as call statements - the only call statement we currently encounter is Mem.set. Theoretically AArch64.MemTag.set would be one but my understanding is the memory tag instructions are unsupported by ASLp for other reasons? AtomicStart and AtomicEnd probably show up as call statements but I haven't checked recently.

I would just rather over-approximate than crash when we get input that doesn't work.

BASIL doesn't even crash at present when an unidentified call or apply is encountered, it just prints an error message and completely ignores the statement. It is not going to be possible to overapproximate arbitrary input - the best we can reasonably do is ignore input that appears potentially valid but we know we don't handle yet.