I think that the right way of doing this would be to lay out the stack like this:
[ globally scoped local memory ] low
[ frame 0 ]
[ frame 0 locally scoped local ]
[ frame 1 ]
[ frame 1 locally scoped local ]
[ frame 2 ]
[ frame 2 locally scoped local ]
[ frame 3 ]
[ frame 3 locally scoped local ] high
Then a mov from a globally-scoped local variable would move the offset directly, and
a mov from a locally-scoped local variable would move the offset plus the
stack frame pointer. All ld.local operations would then be relative to the base of the stack,
rather than to the current frame, as is done now.
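
To make the address arithmetic concrete, here is a rough C++ sketch of the layout above. It is only an illustration under assumptions, not Ocelot code: the names Frame, LocalStack, frameBase, localBase, movGlobalLocal and movFrameLocal are all hypothetical, and I am assuming that the "stack frame pointer" points at the start of the current frame's locally scoped local region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the proposed layout; none of these names exist in Ocelot.
struct Frame {
    std::size_t frameBytes;  // size of the frame itself
    std::size_t localBytes;  // size of this frame's locally scoped local memory
};

struct LocalStack {
    std::size_t globalLocalBytes;  // globally scoped local memory, at the low end
    std::vector<Frame> frames;     // frames[0] is the outermost call

    // Offset from the base of the stack to the start of frame i.
    std::size_t frameBase(std::size_t i) const {
        assert(i < frames.size());
        std::size_t base = globalLocalBytes;
        for (std::size_t k = 0; k < i; ++k) {
            base += frames[k].frameBytes + frames[k].localBytes;
        }
        return base;
    }

    // Assumed meaning of the "stack frame pointer": the start of frame i's
    // locally scoped local region, measured from the base of the stack.
    std::size_t localBase(std::size_t i) const {
        return frameBase(i) + frames[i].frameBytes;
    }

    // mov from a globally scoped local variable: the offset is used directly.
    std::size_t movGlobalLocal(std::size_t offset) const {
        assert(offset < globalLocalBytes);
        return offset;
    }

    // mov from a locally scoped local variable in the current (innermost)
    // frame: the offset plus the stack frame pointer.
    std::size_t movFrameLocal(std::size_t offset) const {
        assert(!frames.empty());
        const std::size_t current = frames.size() - 1;
        assert(offset < frames[current].localBytes);
        return localBase(current) + offset;
    }
};
```

Both kinds of mov produce an address relative to the base of the stack, so a later ld.local would not need to know which frame the address came from.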
From SolusStu...@gmail.com on July 26, 2011 17:47:34
This is a pretty major change, though.
Original issue: http://code.google.com/p/gpuocelot/issues/detail?id=55