gtcasl / gpuocelot

GPUOcelot: A dynamic compilation framework for PTX
http://gpuocelot.gatech.edu/
BSD 3-Clause "New" or "Revised" License

Merge the locally-scoped and globally-scoped local memory stacks #54

Open jwang323 opened 9 years ago

jwang323 commented 9 years ago

From SolusStu...@gmail.com on July 26, 2011 17:47:34

I think that the right way of doing this would be to lay out the stack like this:

    low
    [ globally scoped local memory ]
    [ frame 0 ]
    [ frame 0 locally scoped local ]
    [ frame 1 ]
    [ frame 1 locally scoped local ]
    [ frame 2 ]
    [ frame 2 locally scoped local ]
    [ frame 3 ]
    [ frame 3 locally scoped local ]
    high

With this layout, a mov from a globally-scoped local variable would move its offset directly, and a mov from a locally-scoped local variable would move its offset plus the stack frame pointer. All ld.local operations would then be relative to the base of the stack, rather than to the current frame as they are now.
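To make the addressing rule concrete, here is a minimal Python sketch of the proposed layout. It is only an illustration, not Ocelot code: the `LocalStack` class and all of its method names are hypothetical, and it assumes that a locally-scoped variable's offset is measured from its frame pointer (so it already accounts for the frame's own size).

```python
class LocalStack:
    """Hypothetical model of one thread's merged local memory stack."""

    def __init__(self, global_local_size):
        # Globally-scoped local memory sits at the low end, starting at 0.
        self.memory = bytearray(global_local_size)
        self.frame_pointers = []  # base address of each live frame

    def push_frame(self, frame_size, scoped_local_size):
        # A new frame and its locally-scoped locals go on the high end.
        frame_pointer = len(self.memory)
        self.frame_pointers.append(frame_pointer)
        self.memory.extend(bytes(frame_size + scoped_local_size))
        return frame_pointer

    def pop_frame(self):
        # Drop the top frame together with its locally-scoped locals.
        frame_pointer = self.frame_pointers.pop()
        del self.memory[frame_pointer:]

    def mov_global_local(self, offset):
        # Globally-scoped local: the offset is already stack-relative.
        return offset

    def mov_scoped_local(self, offset):
        # Locally-scoped local: offset plus the current frame pointer.
        return self.frame_pointers[-1] + offset

    def st_local(self, address, data):
        # Stores are relative to the base of the stack.
        self.memory[address:address + len(data)] = data

    def ld_local(self, address, size):
        # Loads are relative to the base of the stack, not the frame.
        return bytes(self.memory[address:address + size])


stack = LocalStack(global_local_size=16)
stack.push_frame(frame_size=8, scoped_local_size=8)   # frame 0
stack.push_frame(frame_size=8, scoped_local_size=4)   # frame 1
address = stack.mov_scoped_local(8)
stack.st_local(address, b"\x2a")
print(address, stack.ld_local(address, 1))
```

Because both kinds of mov produce an address relative to the stack base, a single ld.local path can serve globally-scoped and locally-scoped variables alike.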

This is a pretty major change though.

Original issue: http://code.google.com/p/gpuocelot/issues/detail?id=55