Open PHHargrove opened 10 years ago
Confirmed the issue on my VM too. I was not able to reproduce it on a 32-bit Linux machine, which I had hoped would be useful because I could compare the generated code. It turns out the code is completely different: Linux uses xmm registers in the generated code while FreeBSD does not. The error can be reproduced with -O0 and only one thread, which is good for debugging.
The error can be reproduced with this code:
shared [5] int a_blk5[10*THREADS];
shared [5] int *ptr_to_blk5;
void
test18()
{
  int got;
  int expected;
  /* bug 52: upc_resetphase unimplemented */
  ptr_to_blk5 = upc_resetphase (&a_blk5[1]);
  got = upc_phaseof (ptr_to_blk5);
  expected = 0;
  upc_barrier;
}
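For reference, the values this test expects can be modeled in plain C. The sketch below is not clang-upc's runtime code: the `model_pts` type, its field names, and `NTHREADS` are assumptions made for illustration. It relies only on the UPC rules that element `i` of a `shared [B]` array has phase `i % B` and affinity to thread `(i / B) % THREADS`, and that `upc_resetphase` returns its argument with the phase set to zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a pointer-to-shared; NOT clang-upc's real layout. */
typedef struct {
    size_t   addr;    /* opaque stand-in for the address field */
    unsigned thread;  /* thread with affinity to the element */
    unsigned phase;   /* position of the element within its block */
} model_pts;

/* BLOCK matches "shared [5]"; NTHREADS is an arbitrary stand-in for THREADS. */
enum { BLOCK = 5, NTHREADS = 4 };

/* Model of &a[i] for "shared [BLOCK] int a[10*THREADS]". */
model_pts model_addr_of(size_t i)
{
    model_pts p;
    p.addr   = i;
    p.thread = (unsigned)((i / BLOCK) % NTHREADS);
    p.phase  = (unsigned)(i % BLOCK);
    return p;
}

/* Model of upc_resetphase: identical to the input except for a zero phase. */
model_pts model_resetphase(model_pts p)
{
    p.phase = 0;
    return p;
}

/* Mirrors test18: &a_blk5[1] has phase 1, and phase 0 after the reset. */
void model_test18(void)
{
    model_pts before = model_addr_of(1);
    model_pts after  = model_resetphase(before);

    assert(before.phase == 1);
    assert(after.phase == 0);
    /* The thread field must be unchanged and in range after the call. */
    assert(after.thread == before.thread);
    assert(after.thread < NTHREADS);
}
```

Calling `model_test18()` from a `main` exercises the checks. A thread field that ends up outside `0..THREADS-1` after the call would be consistent with a corrupted pointer-to-shared rather than with a wrong phase computation.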
I think the issue is related to an optimization where FreeBSD does not save/use the frame pointer. Instead, the stack pointer is used for the register spill, so the reload is only correct if %esp has the same value at the spill and at the reload:
movl %eax, 20(%esp) # 4-byte Spill
calll upc_resetphase
[...]
subl $4, %esp
[...]
movl 20(%esp), %eax # 4-byte Reload
This looks like a code generation bug, and we might be able to create a C test case for it.
I did try to create a test case for this, without any luck.
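As an illustration of the shape such a C test case might take, here is a minimal sketch. It is not known to reproduce the failure. The `fake_pts` layout, the function names, and the use of the GCC/Clang `noinline` attribute are all assumptions for illustration. The idea is to keep a value live across a call that returns a struct by value, so that the compiler may need to spill and reload it around the call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a struct-style pointer-to-shared. */
typedef struct {
    uint64_t vaddr;
    uint32_t thread;
    uint32_t phase;
} fake_pts;

/* Returns a struct by value; noinline (a GCC/Clang extension) keeps the call. */
__attribute__((noinline))
fake_pts fake_resetphase(fake_pts p)
{
    p.phase = 0;
    return p;
}

/* Returns 0 if both the live value and the returned struct are intact. */
int spill_check(volatile int *seed)
{
    int live = *seed * 3 + 1;            /* must survive the call below */
    fake_pts in  = { 0x1000u, 0u, 1u };
    fake_pts out = fake_resetphase(in);  /* struct returned by value */

    if (live != *seed * 3 + 1)
        return 1;                        /* spilled value was clobbered */
    if (out.vaddr != 0x1000u || out.thread != 0u || out.phase != 0u)
        return 2;                        /* returned struct is wrong */
    return 0;
}

/* Convenience wrapper: aborts if the check fails. */
void run_spill_check(void)
{
    volatile int seed = 7;
    assert(spill_check(&seed) == 0);
}
```

To compare against the listing above, one would build this for a 32-bit target at -O0 and inspect the assembly around the call to `fake_resetphase` for a spill and reload addressed relative to %esp.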
Today I retested clang-upc on openbsd-i386 configured using --with-upc-pts=struct.
The failures below were observed at runtime and are not present with --with-upc-pts=packed.
run.rpt:[bugzilla/bug276_st04] 0sec 20150224_150819 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[bugzilla/bug276] 0sec 20150224_150820 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[guts_main/resetphase1_st04] 0sec 20150224_151129 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[guts_main/resetphase1] 0sec 20150224_151130 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[guts_main/resetphase2_st04] 0sec 20150224_151130 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[guts_main/resetphase2] 0sec 20150224_151130 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[intrepid/test18_st04] 0sec 20150224_151306 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[intrepid/test18] 1sec 20150224_151307 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[bugzilla/bug88_st02] 0sec 20150224_152645 FAILED (CRASH=SIGTERM/NEW)
run.rpt:[bugzilla/bug88] 1sec 20150224_152646 FAILED (CRASH=SIGTERM/NEW)
All failures produced the same message:
[testname]: UPC error: Thread number in shared address is out of range
There was no difference between -g and -O in terms of which tests failed (though the -g run did have one test time out).
I have again tried the struct PTS representation on OpenBSD and this error is still present.
I now have OpenBSD testers for clang-upc on both amd64 and i386, and have chosen to configure with --with-upc-pts=struct for more coverage.
In conducting the initial "smoke test" run of the Intrepid suite I encountered failures of test18 only on the i386 system. In a debug build I get the following failure:
While a non-debug build gets a SEGV instead:
Outputs above show the static-threads builds of the test, but the dynamic threads cases fail in the same manner.
I went on to investigate other 32-bit platforms and found the majority to fail test18 with the struct PTS.
On an x86 build on FreeBSD I get a SEGV:
On an "-m32" build on Mac OS X I see a different failure mode:
On an x86 build on NetBSD I don't see any error.
I don't presently have any 32-bit builds for Linux.
In all of the cases reported above as failing, I have verified that there is no error with the packed PTS representation.