UQ-PAC / BASIL

Apache License 2.0
8 stars 0 forks source link

Simplify Expressions / Store / Load #185

Closed ailrst closed 2 months ago

ailrst commented 6 months ago

In the abstract, the language contains three types of statement

This proposal is to more cleanly separate this into statements that update the shared state and those which don't.

  1. Make MemoryLoad and MemoryStore no longer expressions, so Expr only contains variables, literals, and Ops. I.e. cannot reference shared state.
  2. Have 3 types of assign statement
    • LocalAssign ::= LocalVariable := Expr
    • MemoryLoad ::= LocalVariable := MemoryLoad(expr, size)
    • MemoryStore ::= memory := MemoryStore(expr, expr, size)

This maintains the "A MemoryAssign statement never contains a load" invariant in the type system.

Note that a (shared / non-thread-local ) global variable is always accessed through memory stores and loads. As far as I understand the IR's notion of Scope is about boogie's procedure-local/global scope not concurrent sharing. While currently registers are marked global I don't think they should be marked global in the sense the information flow logic describes. Additionally we should have a correct liveness analysis soon so should be able to go back to declaring them locally and passing them through procedure arguments.

l-kent commented 6 months ago

I don't really follow this or the motivation behind it. By 'shared state' are you just referring to memory? Statements that can update memory are already cleanly separated - they are MemoryAssigns (memory stores). Nothing else can update memory.

Yes, Registers are global in terms of scope within the IR and Boogie, but are local in relation to threads which is what matters for the information flow logic. Those are separate concepts and the IR does not conflate them. There isn't really a specific concept of 'global variable' in the IR in terms of threads. There is just non-stack memory (which is shared between threads) and everything else (which is not shared between threads). Global/local for threads does not currently matter for any of the analyses.

Additionally we should have a correct liveness analysis soon so should be able to go back to declaring them locally and passing them through procedure arguments.

What would be the reason to do this? Changing the representation doesn't solve any problems for us.

ailrst commented 6 months ago

By 'shared state' are you just referring to memory?

Yes but I'm also thinking about how we will move to the region representation, hopefully eventually "memory" means a set of global boogie variables, some of which are shared and some of which are not shared.

Statements that can update memory are already cleanly separated - they are MemoryAssigns (memory stores).

They are not cleanly separated because the value the MemoryAssign stores is an Expr, which is allowed to be a "MemoryLoad" expression. The fact this is malformed is not enforced by the type system, which is what this issue is about.

Furthermore, statements that read shared memory are not cleanly separated; a LocalAssign can either be a load or a purely local computation and store.

What would be the reason to do this? Changing the representation doesn't solve any problems for us.

I'm not proposing changing their representation in the IR per se, but it seems likely it would be easier for boogie to reason about locals than globals, I think I remember something about the cntlm example not terminating with R21 included? The cleanest way to achieve this might involve changing the IR representation.

l-kent commented 6 months ago

hopefully eventually "memory" means a set of global boogie variables, some of which are shared and some of which are not shared.

Yes, that's what the Memory case class is for. Currently we just have Memory variables named "stack" and "mem", with "stack" being specifically treated as non-shared, but to better support multiple non-shared memories additional parameters or subclasses can be created.

They are not cleanly separated because the value the MemoryAssign stores is an Expr, which is allowed to be a "MemoryLoad" expression. The fact this is malformed is not enforced by the type system, which is what this issue is about.

We can safely assume that a store and a load do not happen in the same statement, but this is a separate issue to your original description of separating statements that modify shared state from those that do not. What is the advantage of enforcing this more strictly through the type system?

Your proposal to make MemoryLoad a separate type of statement would require creating many more temporary variables in order to separate out every time a statement loads a value from memory and then does something else with it (often just extending the size to fit the register it is assigned to).

Furthermore, statements that read shared memory are not cleanly separated; a LocalAssign can either be a load or a purely local computation and store.

Why is this important?

I'm not proposing changing their representation in the IR per se, but it seems likely it would be easier for boogie to reason about locals than globals, I think I remember something about the cntlm example not terminating with R21 included?

This isn't something that matters to Boogie and the case you're talking about would not be solved by changing that. Boogie just treats globals that are modified by a procedure call in the same way as locals that are used as both parameter and return value for a procedure call.

l-kent commented 6 months ago

A change here I support is making it so MemoryStore is not an Expr, with its functionality being entirely merged with MemoryAssign.