thesuhas / orca

WebAssembly Transformation Library for the Component Model
Apache License 2.0

Unique Location in WASM #2

Open thesuhas opened 1 week ago

thesuhas commented 1 week ago

Associate a unique ID with each location in the code. The mapping should be bidirectional: we should be able to generate the unique ID from the nested location, and recover the nested location from the unique ID.

There are two possible ways to do this. Take this code block, for example:

fn $add {
....
....
block {
    ....
    ....
    }
}

Option 1 numbers every line sequentially across the whole function:

fn $add {
.... - 0
.... - 1
block {
    .... - 2
    .... - 3
    }
}

Here, - i marks the unique location, where i is the line number counted across the whole function.

Option 2:

fn $add {
.... - 0
.... - 1
block {
    .... - 0
    .... - 1
    }
}

Here the unique location is represented as func 1 block 1 line 0, taking advantage of the nested structure without needing to maintain a unique global location. This nested structure can be converted to a unique location, for example by hashing it to get a unique identifier. We must also be able to recover the nested structure from this unique identifier.
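A hash on its own is one-way, so recovering the nested structure needs some extra bookkeeping. One possible approach, sketched below, is a small table that hands out an ID per nested path and remembers the reverse mapping. The names Step, LocationTable, id_of and path_of are hypothetical and are not part of orca.

```rust
use std::collections::HashMap;

/// One step in a nested location, e.g. "func 1" or "block 1" (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Step {
    Func(u32),
    Block(u32),
    Line(u32),
}

/// Hands out one integer ID per nested path and remembers the reverse mapping.
#[derive(Default)]
pub struct LocationTable {
    ids: HashMap<Vec<Step>, u32>,
    paths: Vec<Vec<Step>>,
}

impl LocationTable {
    /// Nested location -> unique ID (the same path always gets the same ID).
    pub fn id_of(&mut self, path: &[Step]) -> u32 {
        if let Some(&id) = self.ids.get(path) {
            return id;
        }
        // IDs are simply positions in `paths`, so the reverse lookup is an index.
        let id = self.paths.len() as u32;
        self.paths.push(path.to_vec());
        self.ids.insert(path.to_vec(), id);
        id
    }

    /// Unique ID -> nested location, or None if the ID was never issued.
    pub fn path_of(&self, id: u32) -> Option<&[Step]> {
        self.paths.get(id as usize).map(|p| p.as_slice())
    }
}
```

With this sketch, calling id_of on [Func(1), Block(1), Line(0)] twice returns the same ID, and path_of on that ID returns the original path. The trade-off is that the table has to be kept alongside the module, whereas a pure hash would not need any stored state but could not be reversed.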

thesuhas commented 4 days ago

I believe we will get a unique location as a by-product of the struct itself. For example:

(module 
  fn $add {
  .... - 0
  .... - 1
  block {
      .... - 0
      .... - 1
      }
  }
)

Here the location of the first line inside the block (marked - 0) will be Module 0 -> Function 0 -> Instruction 2 (Block) -> Instruction 0. This needs to be verified with .wat files that have nested structures.
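If the struct really were nested this way, following such a location would be a matter of indexing one level at a time. The sketch below assumes a hypothetical nested representation (Module, Function, Instr and resolve are illustrative names, not orca's actual types).

```rust
/// Hypothetical nested IR, used only for illustration.
#[derive(Debug, PartialEq)]
pub enum Instr {
    Plain(&'static str),
    Block(Vec<Instr>),
}

pub struct Function {
    pub body: Vec<Instr>,
}

pub struct Module {
    pub functions: Vec<Function>,
}

/// Follows `path` (one instruction index per nesting level) inside function `func`.
/// Returns None if any index is out of range or the path tries to descend
/// into an instruction that is not a block.
pub fn resolve<'a>(module: &'a Module, func: usize, path: &[usize]) -> Option<&'a Instr> {
    let (&last, parents) = path.split_last()?;
    let mut body = &module.functions.get(func)?.body;
    for &i in parents {
        match body.get(i)? {
            Instr::Block(inner) => body = inner,
            Instr::Plain(_) => return None,
        }
    }
    body.get(last)
}
```

For the example above, Function 0 -> Instruction 2 (Block) -> Instruction 0 corresponds to resolve(&module, 0, &[2, 0]).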

thesuhas commented 4 days ago

After some investigation, it looks like this is how instructions are currently stored.

(func (;3;) (type 6) (param i32) (result i32 i32)
      local.get 0
      if ;; label = @1
        call 0
        call 0
        call 0
      end

The instructions inside the if statement are simply stored as consecutive instructions, followed by an end instruction.

Something like: [if, call, call, call, end]

thesuhas commented 4 days ago

And for blocks, looking at this .wat code:

(func (;12;) (type 5) (param i32) (result i32)
    block (result i32)  ;; label = @1
      local.get 0
      if (result i32)  ;; label = @2
        call 0
        i32.const 1
      else
        call 0
        i32.const 0
      end
      i32.const 2
      br_if 0 (;@1;)
      i32.const 3
      return
    end)

The instructions in the block are likewise stored as consecutive instructions. We can identify the level of nesting by tracking the control-flow instructions (block, if, else) and their matching end instructions.
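As a rough sketch of that idea, the nesting depth of each instruction can be recovered with a single counter while walking the flat list. The function name depths is hypothetical, and instructions are represented as plain mnemonic strings for simplicity rather than the parser's own operator type.

```rust
/// Returns the nesting depth of each instruction in a flat function body.
/// Depth 0 is the function's top level. An opener (block, loop, if), its
/// `else`, and its `end` are all reported at the depth of the enclosing scope.
pub fn depths(body: &[&str]) -> Vec<usize> {
    let mut depth = 0usize;
    let mut out = Vec::with_capacity(body.len());
    for &op in body {
        match op {
            "block" | "loop" | "if" => {
                out.push(depth);
                depth += 1; // everything after the opener is one level deeper
            }
            "else" => out.push(depth.saturating_sub(1)),
            "end" => {
                // saturating_sub keeps the function's own final `end` at depth 0
                depth = depth.saturating_sub(1);
                out.push(depth);
            }
            _ => out.push(depth),
        }
    }
    out
}
```

Applied to the function above, the instructions inside the if come out at depth 2, the rest of the block body at depth 1, and the block opener and its end at depth 0.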

The end of a function can also be identified by another end instruction. Even though it does not appear in the .wat code, it is added by the parser.
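Putting these observations together, a nested location can be derived from the flat list on the fly. The sketch below is one possible convention, not necessarily the one orca will adopt: each end is counted as the last instruction inside its block, and the function's own trailing end is the last top-level instruction. The name nested_paths is hypothetical, and instructions are again plain mnemonic strings.

```rust
/// Assigns each instruction of a flat function body a nested path,
/// with one index per nesting level.
pub fn nested_paths(body: &[&str]) -> Vec<Vec<usize>> {
    // `path` always holds the location of the next instruction to visit.
    let mut path = vec![0usize];
    let mut out = Vec::with_capacity(body.len());
    for &op in body {
        out.push(path.clone());
        match op {
            // An opener's children start at index 0, one level deeper.
            "block" | "loop" | "if" => path.push(0),
            // A nested `end` closes its block; resume right after the opener.
            "end" if path.len() > 1 => {
                path.pop();
                *path.last_mut().unwrap() += 1;
            }
            // Plain instructions, `else`, and the function's final `end`.
            _ => *path.last_mut().unwrap() += 1,
        }
    }
    out
}
```

For the flat list [local.get, if, call, call, call, end, end], where the last end is the one added for the function, the second call gets the path [1, 1]. Because every instruction receives a distinct path, the position in the returned vector gives the reverse mapping from nested path back to flat index.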