Closed jmacc93 closed 2 years ago
Hi there and sorry for the slow reply.
Would you be able to try a build from source? The binaries were built on Ubuntu 16.04, so it's not inconceivable that there's some sort of incompatibility there. We've seen it with other distros, at least.
Slow replies are alright : )
I should have built it from source from the get-go, but I was intimidated by the size of clang's package. Anyway, I installed clang and built Terra, and I get the same result with the new binary.
I should mention that I'm an absolute amateur and don't really know what I'm doing lol. But I ran the new terra binary under gdb while it compiled and ran the original code I posted, and found that it failed (I think) on the second instruction of ret1. I compiled a similar .so in pure C and passed them both into objdump -d, and got the following for the range in question:
clib c ret1
0000000000001125 <lret1>:
1125: 55 push %rbp
1126: 48 89 e5 mov %rsp,%rbp
1129: 48 83 ec 10 sub $0x10,%rsp
112d: 48 89 7d f8 mov %rdi,-0x8(%rbp)
1131: f2 0f 10 05 cf 0e 00 movsd 0xecf(%rip),%xmm0 # 2008 <_fini+0xe74>
1138: 00
1139: 48 8b 45 f8 mov -0x8(%rbp),%rax
113d: 48 89 c7 mov %rax,%rdi
1140: e8 0b ff ff ff callq 1050 <lua_pushnumber@plt>
1145: b8 01 00 00 00 mov $0x1,%eax
114a: c9 leaveq
114b: c3 retq
clib terra ret1
0000000000001130 <lret1>:
1130: 50 push %rax
1131: c5 fb 10 05 c7 0e 00 vmovsd 0xec7(%rip),%xmm0 # 2000 <_fini+0xe80>
1138: 00
1139: e8 12 ff ff ff callq 1050 <lua_pushnumber@plt>
113e: b8 01 00 00 00 mov $0x1,%eax
1143: 59 pop %rcx
1144: c3 retq
1145: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
114c: 00 00 00 00
Again, I don't actually know what I'm doing, so I don't really know how to interpret this, or if it's even important. It is very aesthetically pleasing, and interesting, though, that the same instructions in both files end up at the same offset. And if I'm interpreting gdb correctly, then ret1 is failing on the instruction at 1131 in the second block of instructions above (the vmovsd).
I will try building LLVM as well when I find time and see if that does anything
What kind of machine are you building on? vmovsd seems to be an AVX instruction, so I suppose if you're on an old enough machine it might not be supported.
https://docs.oracle.com/cd/E36784_01/html/E36859/gntbd.html
I'm not quite sure what else to suggest at this point. Hypothetically LLVM should detect what kind of machine you're on regardless of how it's built, but who knows.
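One quick way to check the hypothesis independently of LLVM is to ask the CPU directly. The following is a minimal sketch in C, assuming GCC or Clang on an x86 machine; it relies on the compiler builtins __builtin_cpu_init and __builtin_cpu_supports, and the function name cpu_has_avx is made up for illustration. It is not part of Terra.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the CPU this program runs on reports
 * AVX support, 0 otherwise. Uses GCC/Clang builtins that exist only for
 * x86 targets; on any other architecture it simply returns 0. */
int cpu_has_avx(void) {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();                      /* populate the CPU feature data */
    return __builtin_cpu_supports("avx") ? 1 : 0;
#else
    return 0;
#endif
}

/* Prints a one-line report so the result can be pasted into the issue. */
void report_avx(void) {
    printf("AVX supported: %s\n", cpu_has_avx() ? "yes" : "no");
}
```

Calling report_avx() from a small main on the affected machine should print "no" if vmovsd is indeed unsupported there, which would be consistent with the Illegal Instruction crash.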
FYI, we had an example of this before with Skylake and LLVM 3.8 where LLVM did not correctly detect the CPU's feature set. As a test you could try building Terra with the flag DISABLE_AVX, which should shut off AVX codegen in LLVM:
https://github.com/terralang/terra/commit/c907b75c50536df6ae3458455580aa46f515ec3e
And if that makes a difference then we can try to figure out what combination of processor/LLVM/etc. is causing things to go wrong.
Hey hey! Using DISABLE_AVX worked! No crashes, and I am getting the expected result with C.lua_pushnumber(l, 1.0).
Interestingly, using the target {Triple="x86_64-unknown-unknown-unknown"} as well as the expected {Triple="x86_64-intel-linux-elf"} in terralib.saveobj also works when using the original binaries.
I'm not sure if it helps, but here is my lscpu output:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 142
Model name: Intel(R) Pentium(R) CPU 4417U @ 2.30GHz
Stepping: 10
CPU MHz: 900.061
CPU max MHz: 2300.0000
CPU min MHz: 400.0000
BogoMIPS: 4608.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 2048K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust smep erms invpcid mpx rdseed smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d
I guess this CPU doesn't have AVX? That would certainly explain why those instructions crash.
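Scanning a long flags line by eye is error-prone (xsave versus xsaveopt, for instance), so here is a small sketch in C of a whole-token check on a space-separated flags string such as the one lscpu prints. The function name has_flag is hypothetical and the code is only an illustration, not something Terra or LLVM uses.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` appears as a whole
 * space-separated token in `flags`, 0 otherwise. A plain substring
 * search is not enough, because "xsave" would also match "xsaveopt". */
int has_flag(const char *flags, const char *name) {
    assert(flags != NULL && name != NULL);
    size_t n = strlen(name);
    if (n == 0) {
        return 0;                       /* an empty name never matches */
    }
    const char *p = flags;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space... */
        int starts_token = (p == flags) || (p[-1] == ' ');
        /* ...and end at the end of the string or right before a space. */
        int ends_token = (p[n] == '\0') || (p[n] == ' ');
        if (starts_token && ends_token) {
            return 1;
        }
        p += 1;                         /* keep searching past this hit */
    }
    return 0;
}
```

Applied to the Flags line above, has_flag(flags, "avx") returns 0 while has_flag(flags, "sse4_2") returns 1, which matches the reading that this CPU has SSE4.2 but no AVX.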
By any chance have you tried newer LLVM versions? We support as recent as LLVM 9. It may be that the CPU auto-detection works better in those versions.
Otherwise we can add a workaround but I'd prefer to not do this if we can avoid it.
I'm going to close this due to lack of recent replies, but before I do I just want to note that we now support up to LLVM 14 (default to 13 in the binaries), have upgraded LuaJIT (which has received substantial fixes), and have also put quite a bit of work into portability. If the issue you hit wasn't already fixed, odds are good it's gone at this point.
At any rate, feel free to reopen if I'm wrong about that.
Feel free to remove this if it's a bad issue or I'm being an idiot, but here is what I found:
The following code crashes and burns:
Here is what's printed to stdout:
And plain Lua 5.1 simply returns:
Illegal Instruction
Also, this same ret1 function works when returning constant strings and integers. Only doubles crash it for some reason.
I'm on 64-bit Debian Linux.
terra -v gives Release 1.0.0-beta2, which I got from the prebuilt binaries.
My intention was to replace some of my Lua libraries built in C with ones built in Terra. To be specific: I was trying to remake a library in Terra that I could call from Lua for performance-critical sections of my programs.
Great job with Terra, btw. It looks like such a great system and I am really excited about making stuff in it! And again, sorry if I just didn't do something correctly and the solution is obvious.