terralang / terra

Terra is a low-level system programming language that is embedded in and meta-programmed by the Lua programming language.
terralang.org
Other
2.72k stars 201 forks source link

Multiple return values to lua from terra broken on aarch64 #444

Closed Mx7f closed 2 years ago

Mx7f commented 4 years ago

I've been working on getting all terra tests passing on my NVIDIA Jetson Nano, which is an aarch64 machine. Using LLVM 6, most tests pass. But there are a handful of tests which fail, and many of them stem from the same underlying error: terra values are often incorrectly passed back to Lua.

I've boiled this down to a very simple test case:

terra f()
      return true,5.0,false
end
terra g()
     -- the uint64 has a binary representation identical to double-precision float 3.0
      return [uint64](4613937818241073152),5.0 
end
terra h()
      return 5.0,[uint64](4613937818241073152)
end
print(f()._1)
print(g()._0)
print(g()._1)
print(h()._0)
print(h()._1)

This gives correct results on my laptop. But on my Jetson Nano, I get:

5
4613937818241073152ULL
0
3
0ULL

The final three are all incorrect. h()._0 is incorrect in a revealing way, it is the double interpretation of the uint64 value that should be at h()._1. Note that (except for the explicit prints) these programs produce identical outputs in verbose mode (the functions are optimized to the same thing before code generation). And I can call these functions from terra and get correct printf() results. So I'm 99% sure the error is in the lua/terra bridge.

I'll keep pounding away at this, but I wanted to see if anyone had theories as to what is going on here, to perhaps expedite the process. Could this be a pure LuaJIT bug?

elliottslaughter commented 4 years ago

"Multiple return values" is basically syntax sugar over returning a struct. So the real question is why returning a struct is broken.

Have you tried passing/return structs to/from Terra or C functions?

The reason I ask is, returning structs is definitely broken in PPC. We had to work around it by passing around pointers (or else allocating in the callee).

If I recall, the diagnosis for this issue was that LLVM doesn't handle all of the details of the C calling convention, so you end up with this sort of code to reimplement things that are effectively baked into Clang at the moment:

https://github.com/terralang/terra/blob/master/src/tcompiler.cpp#L800-L801

And I think no one has bothered to do this for anything other than x86.

On the other hand, it's possible we're looking at something different in this case, and maybe it is a LuaJIT bug.

Mx7f commented 4 years ago

Returning structs from Terra functions (to Lua) breaks in the same way (unsurprisingly, since as you said multiple returns is just syntax sugar for returning structs).

From what you've said, I suppose the path forward for me is to go ahead and learn how the ARM64 ABI differs from x86_64 ABI and implement a new version of that CCallingConv struct to reflect the changes, maybe using clang as a reference. Not looking forward to it, but seems doable. Thanks for pointing me in the right direction!

elliottslaughter commented 4 years ago

Might be good to test a Terra -> Terra call first (or Terra -> C) to see if that breaks, just to validate that's what's going on and we're not chasing a red herring.

If I'm right, then the good news is that at least there's a workaround. But yes, fixing it requires learning enough about the ARM calling conventions differ from what a naive LLVM implementation would provide, and then handling the differences in that part of the code.

Mx7f commented 4 years ago

Terra -> Terra works, or at least I can call those functions from my first post from a different terra function and printf() the right result.

Reading up on ARMv8 calling conventions now...

Mx7f commented 4 years ago

Do you have the code for the workaround lying around anywhere? I couldn't find it on the old PPC branch: https://github.com/terralang/terra/pull/264

elliottslaughter commented 4 years ago

Unfortunately we never had a "proper" workaround for Terra, we just did the minimal hack in Regent to avoid needing a full workaround. There were only a handful of places we passed structs around. See: https://github.com/StanfordLegion/legion/commit/db9c0253af47e3063bc35c9ea5fcc02027489b28

I guess the other thing you could do is try Terra -> C or C -> Terra, since if there's a difference with the platform's calling convention that ought to show up there. On PPC, even Terra -> Terra was broken for some reason, but perhaps Arm isn't that bad.

elliottslaughter commented 2 years ago

@Mx7f do you have the ability to test this again? This should be fixed by #595.

Mx7f commented 2 years ago

I recently moved and have no idea where my Jetson Nano is (haven't finished unpacking). If I find it I will give it a shot.

elliottslaughter commented 2 years ago

If you have a code you'd be comfortable sharing, I'd be happy to test it for you.

Alternatively, you could request access to the GCC Build Farm: https://gcc.gnu.org/wiki/CompileFarm

Mx7f commented 2 years ago

Well, there is the boiled-down test case in the original issue: https://github.com/terralang/terra/issues/444#issue-615284935

I was just trying to get all the tests to pass at the time I filed this issue. I had a Thallo branch that I modified to use the pointer-passing workaround that I ran; I can't remember off-hand if there were more changes. I'll take a look tomorrow.

elliottslaughter commented 2 years ago

I had to manually add the ULL suffix to the large integers, but after that I got:

5
4613937818241073152ULL
5
5
4613937818241073152ULL

This is on 0cf6be68edccc8501991d768cc9b939982d78194 on an ARMv8 through the GCC build farm.

Mx7f commented 2 years ago

Looks good to me.

elliottslaughter commented 2 years ago

Nice. I'll close this for now, but if you ever get back to running the full code and have trouble with it, feel free to open a new issue.