Closed pkubaj closed 5 years ago
FreeBSD could build OpenBLAS 1 week ago.

> FreeBSD could build OpenBLAS 1 week ago.

Did you build on powerpc64?
Nope, but the text in the issue claims a general problem, not a problem porting to their Tier 2 platform.

> Nope, but the text in the issue claims a general problem, not a problem porting to their Tier 2 platform.

Please, read the title.
Is #1894 a complete fix for this porting issue?
There was (or still is?) a problem with 12-prerelease where ARCH=amd64 gets set by the port builder, while the LAPACK included with OpenBLAS expects ARCH to be the `ar` command.
It's not complete. Once #1894 is applied, that's the problem I face.
What problem? Do you have a build log or something to show?
@pkubaj I have updated your PR with ararslan's suggestion, that should do the trick.
My problem is mentioned in the 1st message. Here's the full build log: https://talos.anongoth.pl/data/powerpc64-default/2018-12-03_09h26m07s/logs/openblas-0.3.3,1.log
It is a problem introduced by the poudriere build system and should be addressed in #1899 (targeted for 0.3.5), also for architectures not yet mentioned (it was fixed for amd64/x86_64 some time ago).
OpenBLAS builds on a clean FreeBSD 12 outside poudriere, e.g. by typing `make` at the top of our source tree.
I already applied that fix. It makes the build system use the proper Makefile. It doesn't fix the build, either in Poudriere or straight from the source tree (the build fails later).
The reason is the missing `START_ADDRESS` macro.
START_ADDRESS (or alternatively SEEK_ADDRESS) gets defined in common_power.h, but currently only for LINUX and AIX. Could you try changing the "ifdef OS_AIX" in line 789 of the file to `#if defined(OS_AIX) || defined(OS_FREEBSD)` to see if this is the only remaining problem?
EDIT: After adding FreeBSD to the ifdef in kernel/power/axpy.S it works, but it fails at another assembly file. I'll try to add FreeBSD to the ifdefs and see whether it compiles.
It fixes this specific error. The reason I created this issue was because I wasn't sure whether this value is correct for FreeBSD (I'm not sure what it's responsible for).
After adding it, I have a lot of errors like:
Assembler messages:
../kernel/power/min.S:52: Error: unrecognized opcode: `prologue'
../kernel/power/min.S:53: Error: unrecognized opcode: `profcode'
../kernel/power/min.S:445: Error: unrecognized opcode: `epilogue'
../kernel/power/max.S: Assembler messages:
../kernel/power/max.S:52: Error: unrecognized opcode: `prologue'
../kernel/power/max.S:53: Error: unrecognized opcode: `profcode'
../kernel/power/max.S:445: Error: unrecognized opcode: `epilogue'
../kernel/power/axpy.S: Assembler messages:
../kernel/power/axpy.S:88: Error: unrecognized opcode: `prologue'
../kernel/power/axpy.S:89: Error: unrecognized opcode: `profcode'
../kernel/power/axpy.S:549: Error: unrecognized opcode: `epilogue'
../kernel/power/axpy.S:113: Error: unsupported relocation against INCX
../kernel/power/axpy.S:113: Error: unsupported relocation against INCX
../kernel/power/axpy.S:114: Error: unsupported relocation against INCY
../kernel/power/axpy.S:114: Error: unsupported relocation against INCY
../kernel/power/axpy.S:117: Error: unsupported relocation against PREA
../kernel/power/axpy.S:122: Error: unsupported relocation against N
../kernel/power/axpy.S:125: Error: unsupported relocation against INCX
../kernel/power/axpy.S:127: Error: unsupported relocation against INCY
../kernel/power/axpy.S:130: Error: unsupported relocation against N
../kernel/power/axpy.S:135: Error: unsupported relocation against X
../kernel/power/axpy.S:136: Error: unsupported relocation against X
../kernel/power/axpy.S:137: Error: unsupported relocation against X
../kernel/power/axpy.S:138: Error: unsupported relocation against X
../kernel/power/axpy.S:140: Error: unsupported relocation against Y
../kernel/power/axpy.S:141: Error: unsupported relocation against Y
../kernel/power/axpy.S:142: Error: unsupported relocation against Y
../kernel/power/axpy.S:143: Error: unsupported relocation against Y
../kernel/power/axpy.S:145: Error: unsupported relocation against X
../kernel/power/axpy.S:146: Error: unsupported relocation against X
../kernel/power/axpy.S:147: Error: unsupported relocation against X
../kernel/power/axpy.S:148: Error: unsupported relocation against X
../kernel/power/axpy.S:150: Error: unsupported relocation against Y
../kernel/power/axpy.S:151: Error: unsupported relocation against Y
../kernel/power/axpy.S:152: Error: unsupported relocation against Y
../kernel/power/axpy.S:153: Error: unsupported relocation against Y
../kernel/power/axpy.S:163: Error: unsupported relocation against X
../kernel/power/axpy.S:164: Error: unsupported relocation against X
../kernel/power/axpy.S:165: Error: unsupported relocation against X
../kernel/power/axpy.S:166: Error: unsupported relocation against X
../kernel/power/axpy.S:168: Error: unsupported relocation against Y
../kernel/power/axpy.S:169: Error: unsupported relocation against Y
../kernel/power/axpy.S:170: Error: unsupported relocation against Y
../kernel/power/axpy.S:171: Error: unsupported relocation against Y
../kernel/power/axpy.S:173: Error: unsupported relocation against Y
../kernel/power/axpy.S:174: Error: unsupported relocation against Y
../kernel/power/axpy.S:175: Error: unsupported relocation against Y
../kernel/power/axpy.S:176: Error: unsupported relocation against Y
../kernel/power/axpy.S:183: Error: unsupported relocation against X
../kernel/power/axpy.S:184: Error: unsupported relocation against X
../kernel/power/axpy.S:185: Error: unsupported relocation against X
../kernel/power/axpy.S:186: Error: unsupported relocation against X
../kernel/power/axpy.S:188: Error: unsupported relocation against Y
../kernel/power/axpy.S:189: Error: unsupported relocation against Y
../kernel/power/axpy.S:190: Error: unsupported relocation against Y
../kernel/power/axpy.S:191: Error: unsupported relocation against Y
../kernel/power/axpy.S:193: Error: unsupported relocation against Y
../kernel/power/axpy.S:194: Error: unsupported relocation against Y
../kernel/power/axpy.S:195: Error: unsupported relocation against Y
../kernel/power/axpy.S:196: Error: unsupported relocation against Y
../kernel/power/axpy.S:203: Error: unsupported relocation against X
../kernel/power/axpy.S:204: Error: unsupported relocation against X
../kernel/power/axpy.S:205: Error: unsupported relocation against X
../kernel/power/axpy.S:206: Error: unsupported relocation against X
../kernel/power/axpy.S:208: Error: unsupported relocation against Y
../kernel/power/axpy.S:209: Error: unsupported relocation against Y
../kernel/power/axpy.S:210: Error: unsupported relocation against Y
../kernel/power/axpy.S:211: Error: unsupported relocation against Y
../kernel/power/axpy.S:213: Error: unsupported relocation against Y
../kernel/power/axpy.S:214: Error: unsupported relocation against Y
../kernel/power/axpy.S:215: Error: unsupported relocation against Y
../kernel/power/axpy.S:216: Error: unsupported relocation against Y
../kernel/power/axpy.S:223: Error: unsupported relocation against X
../kernel/power/axpy.S:224: Error: unsupported relocation against X
../kernel/power/axpy.S:225: Error: unsupported relocation against X
../kernel/power/axpy.S:226: Error: unsupported relocation against X
../kernel/power/axpy.S:228: Error: unsupported relocation against Y
../kernel/power/axpy.S:229: Error: unsupported relocation against Y
../kernel/power/axpy.S:230: Error: unsupported relocation against Y
../kernel/power/axpy.S:231: Error: unsupported relocation against Y
../kernel/power/axpy.S:233: Error: unsupported relocation against Y
../kernel/power/axpy.S:234: Error: unsupported relocation against Y
../kernel/power/axpy.S:235: Error: unsupported relocation against Y
../kernel/power/axpy.S:236: Error: unsupported relocation against Y
../kernel/power/axpy.S:239: Error: unsupported relocation against Y
../kernel/power/axpy.S:239: Error: unsupported relocation against PREA
../kernel/power/axpy.S:241: Error: unsupported relocation against X
../kernel/power/axpy.S:241: Error: unsupported relocation against PREA
../kernel/power/axpy.S:244: Error: unsupported relocation against X
../kernel/power/axpy.S:244: Error: unsupported relocation against X
../kernel/power/axpy.S:245: Error: unsupported relocation against Y
../kernel/power/axpy.S:245: Error: unsupported relocation against Y
../kernel/power/axpy.S:261: Error: unsupported relocation against X
../kernel/power/axpy.S:262: Error: unsupported relocation against X
../kernel/power/axpy.S:263: Error: unsupported relocation against X
../kernel/power/axpy.S:264: Error: unsupported relocation against X
../kernel/power/axpy.S:266: Error: unsupported relocation against Y
../kernel/power/axpy.S:267: Error: unsupported relocation against Y
../kernel/power/axpy.S:268: Error: unsupported relocation against Y
../kernel/power/axpy.S:269: Error: unsupported relocation against Y
../kernel/power/axpy.S:276: Error: unsupported relocation against X
../kernel/power/axpy.S:277: Error: unsupported relocation against X
../kernel/power/axpy.S:278: Error: unsupported relocation against X
../kernel/power/axpy.S:279: Error: unsupported relocation against X
../kernel/power/axpy.S:281: Error: unsupported relocation against Y
../kernel/power/axpy.S:282: Error: unsupported relocation against Y
../kernel/power/axpy.S:283: Error: unsupported relocation against Y
../kernel/power/axpy.S:284: Error: unsupported relocation against Y
../kernel/power/axpy.S:286: Error: unsupported relocation against Y
../kernel/power/axpy.S:287: Error: unsupported relocation against Y
../kernel/power/axpy.S:288: Error: unsupported relocation against Y
../kernel/power/axpy.S:289: Error: unsupported relocation against Y
../kernel/power/axpy.S:296: Error: unsupported relocation against Y
../kernel/power/axpy.S:297: Error: unsupported relocation against Y
../kernel/power/axpy.S:298: Error: unsupported relocation against Y
../kernel/power/axpy.S:299: Error: unsupported relocation against Y
../kernel/power/axpy.S:306: Error: unsupported relocation against Y
../kernel/power/axpy.S:307: Error: unsupported relocation against Y
../kernel/power/axpy.S:308: Error: unsupported relocation against Y
../kernel/power/axpy.S:309: Error: unsupported relocation against Y
../kernel/power/axpy.S:311: Error: unsupported relocation against Y
../kernel/power/axpy.S:312: Error: unsupported relocation against Y
../kernel/power/axpy.S:313: Error: unsupported relocation against Y
../kernel/power/axpy.S:314: Error: unsupported relocation against Y
../kernel/power/axpy.S:316: Error: unsupported relocation against X
../kernel/power/axpy.S:316: Error: unsupported relocation against X
../kernel/power/axpy.S:317: Error: unsupported relocation against Y
../kernel/power/axpy.S:317: Error: unsupported relocation against Y
../kernel/power/axpy.S:321: Error: unsupported relocation against N
../kernel/power/axpy.S:327: Error: unsupported relocation against X
../kernel/power/axpy.S:328: Error: unsupported relocation against Y
../kernel/power/axpy.S:332: Error: unsupported relocation against Y
../kernel/power/axpy.S:333: Error: unsupported relocation against X
../kernel/power/axpy.S:333: Error: unsupported relocation against X
../kernel/power/axpy.S:334: Error: unsupported relocation against Y
../kernel/power/axpy.S:334: Error: unsupported relocation against Y
../kernel/power/axpy.S:340: Error: unsupported relocation against X
../kernel/power/axpy.S:340: Error: unsupported relocation against X
../kernel/power/axpy.S:340: Error: unsupported relocation against INCX
../kernel/power/axpy.S:341: Error: unsupported relocation against Y
../kernel/power/axpy.S:341: Error: unsupported relocation against Y
../kernel/power/axpy.S:341: Error: unsupported relocation against INCY
../kernel/power/axpy.S:342: Error: unsupported relocation against YY
../kernel/power/axpy.S:342: Error: unsupported relocation against Y
../kernel/power/axpy.S:344: Error: unsupported relocation against N
../kernel/power/axpy.S:349: Error: invalid register operand when updating
../kernel/power/axpy.S:349: Error: unsupported relocation against X
../kernel/power/axpy.S:349: Error: unsupported relocation against INCX
../kernel/power/axpy.S:350: Error: invalid register operand when updating
../kernel/power/axpy.S:350: Error: unsupported relocation against X
../kernel/power/axpy.S:350: Error: unsupported relocation against INCX
../kernel/power/axpy.S:351: Error: invalid register operand when updating
../kernel/power/axpy.S:351: Error: unsupported relocation against X
../kernel/power/axpy.S:351: Error: unsupported relocation against INCX
../kernel/power/axpy.S:352: Error: invalid register operand when updating
../kernel/power/axpy.S:352: Error: unsupported relocation against X
../kernel/power/axpy.S:352: Error: unsupported relocation against INCX
../kernel/power/axpy.S:354: Error: invalid register operand when updating
../kernel/power/axpy.S:354: Error: unsupported relocation against Y
../kernel/power/axpy.S:354: Error: unsupported relocation against INCY
../kernel/power/axpy.S:355: Error: invalid register operand when updating
../kernel/power/axpy.S:355: Error: unsupported relocation against Y
../kernel/power/axpy.S:355: Error: unsupported relocation against INCY
../kernel/power/axpy.S:356: Error: invalid register operand when updating
../kernel/power/axpy.S:356: Error: unsupported relocation against Y
../kernel/power/axpy.S:356: Error: unsupported relocation against INCY
../kernel/power/axpy.S:357: Error: invalid register operand when updating
../kernel/power/axpy.S:357: Error: unsupported relocation against Y
../kernel/power/axpy.S:357: Error: unsupported relocation against INCY
../kernel/power/axpy.S:359: Error: invalid register operand when updating
../kernel/power/axpy.S:359: Error: unsupported relocation against X
../kernel/power/axpy.S:359: Error: unsupported relocation against INCX
../kernel/power/axpy.S:360: Error: invalid register operand when updating
../kernel/power/axpy.S:360: Error: unsupported relocation against X
../kernel/power/axpy.S:360: Error: unsupported relocation against INCX
../kernel/power/axpy.S:361: Error: invalid register operand when updating
../kernel/power/axpy.S:361: Error: unsupported relocation against X
../kernel/power/axpy.S:361: Error: unsupported relocation against INCX
../kernel/power/axpy.S:362: Error: invalid register operand when updating
../kernel/power/axpy.S:362: Error: unsupported relocation against X
../kernel/power/axpy.S:362: Error: unsupported relocation against INCX
../kernel/power/axpy.S:364: Error: invalid register operand when updating
../kernel/power/axpy.S:364: Error: unsupported relocation against Y
../kernel/power/axpy.S:364: Error: unsupported relocation against INCY
../kernel/power/axpy.S:365: Error: invalid register operand when updating
../kernel/power/axpy.S:365: Error: unsupported relocation against Y
../kernel/power/axpy.S:365: Error: unsupported relocation against INCY
../kernel/power/axpy.S:366: Error: invalid register operand when updating
../kernel/power/axpy.S:366: Error: unsupported relocation against Y
../kernel/power/axpy.S:366: Error: unsupported relocation against INCY
../kernel/power/axpy.S:367: Error: invalid register operand when updating
../kernel/power/axpy.S:367: Error: unsupported relocation against Y
../kernel/power/axpy.S:367: Error: unsupported relocation against INCY
../kernel/power/axpy.S:377: Error: invalid register operand when updating
../kernel/power/axpy.S:377: Error: unsupported relocation against X
../kernel/power/axpy.S:377: Error: unsupported relocation against INCX
../kernel/power/axpy.S:378: Error: invalid register operand when updating
../kernel/power/axpy.S:378: Error: unsupported relocation against X
../kernel/power/axpy.S:378: Error: unsupported relocation against INCX
../kernel/power/axpy.S:379: Error: invalid register operand when updating
../kernel/power/axpy.S:379: Error: unsupported relocation against X
../kernel/power/axpy.S:379: Error: unsupported relocation against INCX
../kernel/power/axpy.S:380: Error: invalid register operand when updating
../kernel/power/axpy.S:380: Error: unsupported relocation against X
../kernel/power/axpy.S:380: Error: unsupported relocation against INCX
../kernel/power/axpy.S:382: Error: invalid register operand when updating
../kernel/power/axpy.S:382: Error: unsupported relocation against Y
../kernel/power/axpy.S:382: Error: unsupported relocation against INCY
../kernel/power/axpy.S:383: Error: invalid register operand when updating
../kernel/power/axpy.S:383: Error: unsupported relocation against Y
../kernel/power/axpy.S:383: Error: unsupported relocation against INCY
../kernel/power/axpy.S:384: Error: invalid register operand when updating
../kernel/power/axpy.S:384: Error: unsupported relocation against Y
../kernel/power/axpy.S:384: Error: unsupported relocation against INCY
../kernel/power/axpy.S:385: Error: invalid register operand when updating
../kernel/power/axpy.S:385: Error: unsupported relocation against Y
../kernel/power/axpy.S:385: Error: unsupported relocation against INCY
../kernel/power/axpy.S:392: Error: invalid register operand when updating
../kernel/power/axpy.S:392: Error: unsupported relocation against X
../kernel/power/axpy.S:392: Error: unsupported relocation against INCX
../kernel/power/axpy.S:393: Error: invalid register operand when updating
../kernel/power/axpy.S:393: Error: unsupported relocation against X
../kernel/power/axpy.S:393: Error: unsupported relocation against INCX
../kernel/power/axpy.S:394: Error: invalid register operand when updating
../kernel/power/axpy.S:394: Error: unsupported relocation against X
../kernel/power/axpy.S:394: Error: unsupported relocation against INCX
../kernel/power/axpy.S:395: Error: invalid register operand when updating
../kernel/power/axpy.S:395: Error: unsupported relocation against X
../kernel/power/axpy.S:395: Error: unsupported relocation against INCX
../kernel/power/axpy.S:397: Error: invalid register operand when updating
../kernel/power/axpy.S:397: Error: unsupported relocation against Y
../kernel/power/axpy.S:397: Error: unsupported relocation against INCY
../kernel/power/axpy.S:398: Error: invalid register operand when updating
But I don't know whether it's because START_ADDRESS is bad, or it's another issue.
Since this is assembly code, my hardware may matter. I use Talos II board with POWER9 CPU.
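A paste this long is easier to reason about once it is grouped by source file and message kind. The following is a minimal sketch, assuming GNU as diagnostics of the form `path:line: Error: message` as in the log above. `summarize_as_errors` is a hypothetical helper name, not part of OpenBLAS or binutils.

```python
import re
from collections import Counter

# Matches GNU as diagnostics such as
#   ../kernel/power/axpy.S:113: Error: unsupported relocation against INCX
AS_ERROR = re.compile(r"^(?P<file>\S+):(?P<line>\d+): Error: (?P<msg>.+)$")


def summarize_as_errors(log_text):
    """Count assembler errors per (file, message kind).

    The trailing symbol or opcode is dropped so that, for example, every
    "unsupported relocation against <symbol>" lands in one bucket.
    """
    counts = Counter()
    for raw in log_text.splitlines():
        m = AS_ERROR.match(raw.strip())
        if not m:
            continue  # headers such as "Assembler messages:" are skipped
        # Reduce "... against X" and "...: `prologue'" to the message kind.
        kind = re.sub(r"( against |: ).*$", "", m.group("msg"))
        counts[(m.group("file"), kind)] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "../kernel/power/min.S:52: Error: unrecognized opcode: `prologue'",
        "../kernel/power/axpy.S:113: Error: unsupported relocation against INCX",
        "../kernel/power/axpy.S:114: Error: unsupported relocation against INCY",
        "../kernel/power/axpy.S:349: Error: invalid register operand when updating",
    ])
    for (path, kind), n in summarize_as_errors(sample).most_common():
        print(f"{n:4d}  {path}  {kind}")
```

If nearly all entries fall into two or three message kinds, a single missing set of macro definitions is a more likely cause than many unrelated bugs.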
I almost expected that - if you look in common_power.h, there are also Linux- and AIX-specific sections that define `PROLOGUE` and `EPILOGUE`. Luckily I see now that there are already entries for OS_DARWIN as well (both for the SEEK_ADDRESS business and the PROLOGUE/EPILOGUE thing). As OSX by all accounts derives from BSD, perhaps it will be sufficient to tack on an `|| defined(OS_FREEBSD)` to each and every ifdef mentioning `OS_DARWIN`?
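One detail if this is done mechanically: `#ifdef` takes a single macro name, so a guard written as `#ifdef OS_DARWIN` has to become `#if defined(OS_DARWIN) || defined(OS_FREEBSD)` rather than simply gaining a suffix. Below is a minimal sketch of such a rewrite, assuming the guards are spelled either `#ifdef OS_DARWIN` or `defined(OS_DARWIN)` inside an `#if`/`#elif` line. `add_os_to_guards` is a hypothetical helper, and whether the Darwin definitions are actually right for FreeBSD is exactly what the experiment would have to show.

```python
import re


def add_os_to_guards(source, existing="OS_DARWIN", extra="OS_FREEBSD"):
    """Extend preprocessor guards testing `existing` so they also accept `extra`.

    Only two spellings are handled (an assumption about the header's style):
      #ifdef OS_DARWIN            -> #if defined(OS_DARWIN) || defined(OS_FREEBSD)
      #if ... defined(OS_DARWIN)  -> defined(...) wrapped in a parenthesised "||"
    `#ifndef` lines and lines that already mention `extra` are left untouched.
    """
    ifdef = re.compile(r"^(\s*#\s*)ifdef\s+" + re.escape(existing) + r"\b(.*)$")
    defined = re.compile(r"defined\s*\(\s*" + re.escape(existing) + r"\s*\)")
    conditional = re.compile(r"^\s*#\s*(if|elif)\b")
    both = f"(defined({existing}) || defined({extra}))"

    out = []
    for line in source.split("\n"):
        if extra in line:
            out.append(line)  # already patched, keeps the rewrite idempotent
            continue
        m = ifdef.match(line)
        if m:
            out.append(
                f"{m.group(1)}if defined({existing}) || defined({extra}){m.group(2)}"
            )
        elif conditional.match(line):
            # Parentheses keep "!defined(...)" and "&&" expressions correct.
            out.append(defined.sub(both, line))
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    print(add_os_to_guards("#ifdef OS_DARWIN\n#define EXAMPLE 1\n#endif\n"))
```

Reviewing the resulting diff by hand is still advisable, since a guard that selects Darwin-only assembler syntax would be wrong for an ELF target.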
Ok, I'll do that and see whether it compiles.
I get errors with PROFCODE, PROLOGUE and EPILOGUE:
../kernel/power/iamax.S: Assembler messages:
../kernel/power/iamax.S:55: Error: unrecognized opcode: `prologue'
../kernel/power/iamax.S:56: Error: unrecognized opcode: `profcode'
<command-line>:0:0: note: this is the location of the previous definition
../kernel/power/max.S: Assembler messages:
../kernel/power/max.S:52: Error: unrecognized opcode: `prologue'
../kernel/power/max.S:53: Error: unrecognized opcode: `profcode'
<command-line>:0:0: warning: "CHAR_CNAME" redefined
<command-line>:0:0: note: this is the location of the previous definition
../kernel/power/max.S:445: Error: unrecognized opcode: `epilogue'
../kernel/power/iamax.S:802: Error: unrecognized opcode: `epilogue'
Unfortunately, after adding OS_FREEBSD to ifdef OS_AIX for the code block that defines those in common_power.h, I get:
../kernel/power/axpy.S: Assembler messages:
../kernel/power/axpy.S:88: Error: unknown pseudo-op: `.csect'
../kernel/power/axpy.S:549: Error: unknown pseudo-op: `.csect'
gmake[3]: *** [Makefile.L1:581: isamin_k.o] Error 1
../kernel/power/gemv_t.S: Assembler messages:
../kernel/power/amax.S: ../kernel/power/gemv_t.S:197: Error: unknown pseudo-op: `.csect'
Assembler messages:
../kernel/power/amax.S:52: Error: unknown pseudo-op: `.csect'
../kernel/power/gemv_t.S:287: Error: junk at end of line: `+288(1)'
../kernel/power/gemv_t.S:288: Error: junk at end of line: `+288(1)'
../kernel/power/gemv_t.S:289: Error: junk at end of line: `+288(1)'
../kernel/power/amax.S:523: Error: unknown pseudo-op: `.csect'
../kernel/power/gemv_t.S:2967: Error: unknown pseudo-op: `.csect'
gmake[3]: *** [Makefile.L1:561: isamax_k.o] Error 1
../kernel/power/iamin.S: Assembler messages:
../kernel/power/iamin.S:55: Error: unknown pseudo-op: `.csect'
../kernel/power/iamin.S:803: Error: unknown pseudo-op: `.csect'
gmake[3]: *** [Makefile.L1:498: samax_k.o] Error 1
gmake[3]: *** [Makefile.L2:226: sgemv_t.o] Error 1
gmake[3]: *** [Makefile.L1:640: saxpy_k.o] Error 1
gmake[3]: *** [Makefile.L1:601: ismax_k.o] Error 1
gmake[3]: *** [Makefile.L1:612: ismin_k.o] Error 1
../kernel/power/gemv_n.S: Assembler messages:
../kernel/power/gemv_n.S:195: Error: unknown pseudo-op: `.csect'
../kernel/power/gemv_n.S:279: Error: junk at end of line: `+280(1)'
../kernel/power/gemv_n.S:280: Error: junk at end of line: `+280(1)'
../kernel/power/gemv_n.S:281: Error: junk at end of line: `+280(1)'
../kernel/power/gemv_n.S:3095: Error: unknown pseudo-op: `.csect'
Err, what happens when you add OS_FREEBSD to the sections that mention OS_DARWIN instead (and remove it from all the OS_AIX sections that we tried at first)?
../common_power.h: Assembler messages:
../common_power.h:640: Error: unexpected end of file in macro `prologue' definition
Could be a mis-edit? Line 640 would put it after the .endmacro of the PROLOGUE though... Unfortunately I have no idea what this looks like for FreeBSD on Power. Do you happen to know some other package that uses PPC assembly and already builds on FreeBSD?
Which `as` are you using? It should be the one from GNU binutils to make gcc happy.
@pkubaj can you try building with other compilers (with NO_LAPACK=1 added, since the problem is in the BLAS part):
* system compiler (clang)
* same with `CC="clang -fintegrated-as"` and `CC="clang -fno-integrated-as"`
* making sure `as` or `gas` from the GNU binutils port precedes any other `as` commands in $PATH, maybe symlinking gas to as

It looks like the wrong `as` (not compatible with the one used on AIX and Linux gcc) is getting called.
The best-case outcome would be one combination that works....
> Could be a mis-edit? Line 640 would put it after the .endmacro of the PROLOGUE though... Unfortunately I have no idea what this looks like for FreeBSD on Power. Do you happen to know some other package that uses PPC assembly and already builds on FreeBSD?

Line 640 is the beginning of .macro PROLOGUE. Unfortunately, I'm not sure whether there's anything that uses POWER assembly.

> Which `as` are you using? It should be the one from GNU binutils to make gcc happy.

root@talos:$/usr/ports/math/openblas$ make -V AS
/usr/local/bin/as
root@talos:$/usr/ports/math/openblas$ /usr/local/bin/as --version
GNU assembler (GNU Binutils) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `powerpc64-portbld-freebsd12.0'.
> @pkubaj can you try building with other compilers (with NO_LAPACK=1 added, since the problem is in the BLAS part):
> * system compiler (clang)
> * same with `CC="clang -fintegrated-as"` and `CC="clang -fno-integrated-as"`
> * making sure `as` or `gas` from the GNU binutils port precedes any other `as` commands in $PATH, maybe symlinking gas to as
>
> It looks like the wrong `as` (not compatible with the one used on AIX and Linux gcc) is getting called. The best-case outcome would be one combination that works....

The system compiler for powerpc64 is GCC 4.2; it's probably not supported by OpenBLAS.
GNU as is definitely used (unless you override the AS variable).
I'm obviously out of my depth here. Guess that would leave trying the OS_LINUX version of the prologue/epilogue definitions.
gcc 4.2 should be fine (think gcc 4.1 in CentOS 5), though in the absence of a matching old gfortran it will not be able to build the complete library. I suspect it will have the identical problem with `as`, so it is best to try Martin's suggestion with a compiler aligned with the existing x86(_64) builds.
@pkubaj any luck with the OS_LINUX version of the PROLOGUE ?
Trying the same compiler as the one used for x86 is out of the question, because of LLVM's inability to generate correct code on the FreeBSD/powerpc64 platform.
When using OS_LINUX version, I get:
../kernel/power/strmm_kernel_16x8_power8.S: Assembler messages:
../kernel/power/strmm_kernel_16x8_power8.S:271: Error: unsupported relocation against LDC
../kernel/power/strmm_kernel_16x8_power8.S:271: Error: unsupported relocation against LDC
../kernel/power/strmm_kernel_16x8_power8.S:291: Error: unsupported relocation against OFFSET
../kernel/power/strmm_logic_16x8_power8.S:41: Error: unsupported relocation against C
../kernel/power/strmm_logic_16x8_power8.S:42: Error: unsupported relocation against A
../kernel/power/strmm_logic_16x8_power8.S:43: Error: unsupported relocation against LDC
../kernel/power/strmm_logic_16x8_power8.S:44: Error: unsupported relocation against C
../kernel/power/strmm_logic_16x8_power8.S:44: Error: unsupported relocation against C
../kernel/power/strmm_logic_16x8_power8.S:47: Error: unsupported relocation against OFFSET
../kernel/power/strmm_logic_16x8_power8.S:59: Error: unsupported relocation against B
../kernel/power/strmm_logic_16x8_power8.S:197: Error: unsupported relocation against LDC
../kernel/power/strmm_logic_16x8_power8.S:197: Error: unsupported relocation against LDC
../kernel/power/strmm_logic_16x8_power8.S:197: Error: unsupported relocation against LDC
../kernel/power/strmm_logic_16x8_power8.S:197: Error: unsupported relocation against LDC
../kernel/power/strmm_logic_16x8_power8.S:197: Error: unsupported relocation against LDC
../kernel/power/strmm_logic_16x8_power8.S:197: Error: unsupported relocation against LDC
../kernel/power/strmm_logic_16x8_power8.S:197: Error: unsupported relocation against LDC
So the good news seems to be that the Linux versions of PROLOGUE/EPILOGUE in common_power.h are acceptable for FreeBSD? Are the "unsupported relocation" errors only a sample of the compile failures you are now facing, or is it just those two strmm_kernel_... files that generate errors? (From what I can find out about this assembler message, "unsupported relocation against" is basically "I do not know a register named..." - and again the affected files have various "if defined(linux) #define LDC r7" etc. at the top.)
This is only a sample, but all errors are about unsupported relocations in ../kernel/power/sgemm_logic_16x8_power8.S, ../kernel/power/sgemm_kernel_16x8_power8.S, ../kernel/power/strmm_logic_16x8_power8.S and ../kernel/power/strmm_kernel_16x8_power8.S.
I added defined(FreeBSD) to ifdef linux in kernel/power/sgemm_kernel_16x8_power8.S for:
#ifndef __64BIT__
#define A r6
#define B r7
#define C r8
#define LDC r9
#define OFFSET r10
#else
#define A r7
#define B r8
#define C r9
#define LDC r10
#define OFFSET r6
#endif
#endif /* closes the enclosing "ifdef linux" guard, whose opening line is not shown */
And same to kernel/power/strmm_kernel_16x8_power8.S (I know it may not sound clear, but once everything builds, I'll send a PR).
It seems to help with building, but then the build fails on other assembly files. Adding FreeBSD to the ifdef linux there seems to help as well.
I'll look more into it tomorrow.
Great, thanks for the feedback.
After patching all assembly files, I'm getting a segfault in test/sblat2.
#0 .sgemv_n () at ../kernel/power/gemv_n.S:317
317 STFD f0, 0 * SIZE(Y1)
@pkubaj The PROLOGUE/EPILOGUE pair is meant to save/restore the registers that the ABI requires to be preserved across a normal C library call. The AIX and Linux wrappers did not work out. @martin-frbg what do you think about making generic/KERNEL.CC out of arm/KERNEL.ARMv5?
@brada4 I am not sure that "Linux wrappers did not work out", at least it seems it did not crash in the very first test. Surely ABI differences must be documented somewhere, so if e.g. f0 needs to be saved in the prologue it should be possible to adjust it accordingly.
It was still being experimented with fairly recently: https://lists.freebsd.org/pipermail/freebsd-ppc/2017-May/008858.html
I ran sblat2 manually, I got:
root@talos:$/usr/ports/math/openblas/work/OpenBLAS-0.3.4/test$ ./sblat2 < sblat2.dat
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Could not print backtrace: libbacktrace could not find executable to open
> It was still being experimented with fairly recently: https://lists.freebsd.org/pipermail/freebsd-ppc/2017-May/008858.html

It was done using LLVM 4.0. LLVM has supported FreeBSD/powerpc64 properly only since 8.0 (the current development branch).
Is sblat2 the first (and only) test you ran, or the first one that failed while running `make test`?
No, some other tests run:
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./sblat1
zblat2.f:2057:0:
$ NARGS, NC, NS
Warning: 'nargs' may be used uninitialized in this function [-Wmaybe-uninitialized]
Real BLAS Test Program Results
Test of subprogram number 1 SDOT
----- PASS -----
Test of subprogram number 2 SAXPY
----- PASS -----
Test of subprogram number 3 SROTG
----- PASS -----
Test of subprogram number 4 SROT
----- PASS -----
Test of subprogram number 5 SCOPY
----- PASS -----
Test of subprogram number 6 SSWAP
----- PASS -----
Test of subprogram number 7 SNRM2
----- PASS -----
Test of subprogram number 8 SASUM
----- PASS -----
Test of subprogram number 9 SSCAL
----- PASS -----
Test of subprogram number 10 ISAMAX
----- PASS -----
Test of subprogram number 11 SROTMG
----- PASS -----
Test of subprogram number 12 SROTM
----- PASS -----
Test of subprogram number 13 SDSDOT
----- PASS -----
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./dblat1
Real BLAS Test Program Results
Test of subprogram number 1 DDOT
FAIL
CASE N INCX INCY I COMP(I) TRUE(I) DIFFERENCE SIZE(I)
1 1 1 1 1 0.00000000D+00 0.30000000D+00 -0.3000D+00 0.3000D+00
1 2 1 1 1 0.00000000D+00 0.21000000D+00 -0.2100D+00 0.1600D+01
1 4 1 1 1 0.56000000D+00 0.62000000D+00 -0.6000D-01 0.3200D+01
1 1 2 -2 1 0.00000000D+00 0.30000000D+00 -0.3000D+00 0.3000D+00
1 2 2 -2 1 0.00000000D+00 -0.70000000D-01 0.7000D-01 0.1600D+01
1 4 2 -2 1 0.57000000D+00 0.85000000D+00 -0.2800D+00 0.3200D+01
1 1 -2 1 1 0.00000000D+00 0.30000000D+00 -0.3000D+00 0.3000D+00
1 2 -2 1 1 0.00000000D+00 -0.79000000D+00 0.7900D+00 0.1600D+01
1 4 -2 1 1 -0.96000000D+00 -0.74000000D+00 -0.2200D+00 0.3200D+01
1 1 -1 -2 1 0.00000000D+00 0.30000000D+00 -0.3000D+00 0.3000D+00
1 2 -1 -2 1 0.30000000D-01 0.33000000D+00 -0.3000D+00 0.1600D+01
1 4 -1 -2 1 0.97000000D+00 0.12700000D+01 -0.3000D+00 0.3200D+01
Test of subprogram number 2 DAXPY
----- PASS -----
Test of subprogram number 3 DROTG
----- PASS -----
Test of subprogram number 4 DROT
----- PASS -----
Test of subprogram number 5 DCOPY
----- PASS -----
Test of subprogram number 6 DSWAP
----- PASS -----
Test of subprogram number 7 DNRM2
----- PASS -----
Test of subprogram number 8 DASUM
----- PASS -----
Test of subprogram number 9 DSCAL
----- PASS -----
Test of subprogram number 10 IDAMAX
----- PASS -----
Test of subprogram number 11 DROTMG
----- PASS -----
Test of subprogram number 12 DROTM
----- PASS -----
Test of subprogram number 13 DSDOT
----- PASS -----
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./cblat1
Complex BLAS Test Program Results
Test of subprogram number 1 CDOTC
FAIL
CASE N INCX INCY MODE I COMP(I) TRUE(I) DIFFERENCE SIZE(I)
1 1 1 1 9999 1 0.00000000E+00 0.89999998E+00 -0.9000E+00 0.9000E+00
1 1 1 1 9999 2 0.00000000E+00 0.59999999E-01 -0.6000E-01 0.9000E+00
1 2 1 1 9999 1 0.00000000E+00 0.91000003E+00 -0.9100E+00 0.1630E+01
1 2 1 1 9999 2 0.00000000E+00 -0.76999998E+00 0.7700E+00 0.1730E+01
1 4 1 1 9999 1 0.42000002E+00 0.18000000E+01 -0.1380E+01 0.2900E+01
1 4 1 1 9999 2 -0.19999996E-01 -0.10000000E+00 0.8000E-01 0.2780E+01
1 1 2 -2 9999 1 0.00000000E+00 0.89999998E+00 -0.9000E+00 0.9000E+00
1 1 2 -2 9999 2 0.00000000E+00 0.59999999E-01 -0.6000E-01 0.9000E+00
1 2 2 -2 9999 1 0.00000000E+00 0.14500000E+01 -0.1450E+01 0.1630E+01
1 2 2 -2 9999 2 0.00000000E+00 0.74000001E+00 -0.7400E+00 0.1730E+01
1 4 2 -2 9999 1 -0.38999999E+00 0.20000000E+00 -0.5900E+00 0.2900E+01
1 4 2 -2 9999 2 0.82000005E+00 0.89999998E+00 -0.8000E-01 0.2780E+01
1 1 -2 1 9999 1 0.00000000E+00 0.89999998E+00 -0.9000E+00 0.9000E+00
1 1 -2 1 9999 2 0.00000000E+00 0.59999999E-01 -0.6000E-01 0.9000E+00
1 2 -2 1 9999 1 0.00000000E+00 -0.55000001E+00 0.5500E+00 0.1630E+01
1 2 -2 1 9999 2 0.00000000E+00 0.23000000E+00 -0.2300E+00 0.1730E+01
1 4 -2 1 9999 1 0.00000000E+00 0.82999998E+00 -0.8300E+00 0.2900E+01
1 4 -2 1 9999 2 0.00000000E+00 -0.38999999E+00 0.3900E+00 0.2780E+01
1 1 -1 -2 9999 1 0.00000000E+00 0.89999998E+00 -0.9000E+00 0.9000E+00
1 1 -1 -2 9999 2 0.00000000E+00 0.59999999E-01 -0.6000E-01 0.9000E+00
1 2 -1 -2 9999 1 0.00000000E+00 0.10400000E+01 -0.1040E+01 0.1630E+01
1 2 -1 -2 9999 2 0.00000000E+00 0.79000002E+00 -0.7900E+00 0.1730E+01
1 4 -1 -2 9999 1 0.72000003E+00 0.19500000E+01 -0.1230E+01 0.2900E+01
1 4 -1 -2 9999 2 0.50000006E+00 0.12200000E+01 -0.7200E+00 0.2780E+01
Test of subprogram number 2 CDOTU
----- PASS -----
Test of subprogram number 3 CAXPY
FAIL
CASE N INCX INCY MODE I COMP(I) TRUE(I) DIFFERENCE SIZE(I)
3 2 1 1 9999 1 -0.33000001E+00 0.31999999E+00 -0.6500E+00 0.1540E+01
3 2 1 1 9999 3 -0.89999998E+00 -0.15500000E+01 0.6500E+00 0.1540E+01
3 4 1 1 9999 1 -0.14800000E+01 0.31999999E+00 -0.1800E+01 0.1540E+01
3 4 1 1 9999 2 -0.21600001E+01 -0.14100000E+01 -0.7500E+00 0.1540E+01
3 4 1 1 9999 3 -0.89999998E+00 -0.15500000E+01 0.6500E+00 0.1540E+01
3 4 1 1 9999 5 0.69999999E+00 0.29999999E-01 0.6700E+00 0.1540E+01
3 4 1 1 9999 6 -0.60000002E+00 -0.88999999E+00 0.2900E+00 0.1540E+01
3 4 1 1 9999 7 0.10000000E+00 -0.38000000E+00 0.4800E+00 0.1540E+01
3 4 1 1 9999 8 -0.50000000E+00 -0.95999998E+00 0.4600E+00 0.1540E+01
3 2 2 -2 9999 1 0.60000002E+00 -0.70000000E-01 0.6700E+00 0.1540E+01
3 2 2 -2 9999 2 -0.60000002E+00 -0.88999999E+00 0.2900E+00 0.1540E+01
3 2 2 -2 9999 5 -0.25000000E+00 0.41999999E+00 -0.6700E+00 0.1540E+01
3 2 2 -2 9999 6 -0.17000000E+01 -0.14100000E+01 -0.2900E+00 0.1540E+01
3 4 2 -2 9999 1 0.60000002E+00 0.77999997E+00 -0.1800E+00 0.1540E+01
3 4 2 -2 9999 2 -0.60000002E+00 0.59999999E-01 -0.6600E+00 0.1540E+01
3 4 2 -2 9999 5 0.69999999E+00 0.59999999E-01 0.6400E+00 0.1540E+01
3 4 2 -2 9999 6 -0.60000002E+00 -0.13000000E+00 -0.4700E+00 0.1540E+01
3 4 2 -2 9999 9 -0.10000000E+00 -0.76999998E+00 0.6700E+00 0.1540E+01
3 4 2 -2 9999 10 -0.20000000E+00 -0.49000001E+00 0.2900E+00 0.1540E+01
3 4 2 -2 9999 13 -0.60999995E+00 0.51999998E+00 -0.1130E+01 0.1540E+01
3 4 2 -2 9999 14 -0.66999990E+00 -0.15100000E+01 0.8400E+00 0.1540E+01
3 2 -2 1 9999 1 -0.34999996E+00 -0.70000000E-01 -0.2800E+00 0.1540E+01
3 2 -2 1 9999 2 -0.17000000E+01 -0.88999999E+00 -0.8100E+00 0.1540E+01
3 2 -2 1 9999 3 -0.89999998E+00 -0.11799999E+01 0.2800E+00 0.1540E+01
3 2 -2 1 9999 4 0.50000000E+00 -0.31000000E+00 0.8100E+00 0.1540E+01
3 4 -2 1 9999 1 -0.80999988E+00 0.77999997E+00 -0.1590E+01 0.1540E+01
3 4 -2 1 9999 2 -0.57000005E+00 0.59999999E-01 -0.6300E+00 0.1540E+01
3 4 -2 1 9999 3 -0.89999998E+00 -0.15400000E+01 0.6400E+00 0.1540E+01
3 4 -2 1 9999 4 0.50000000E+00 0.97000003E+00 -0.4700E+00 0.1540E+01
3 4 -2 1 9999 5 0.69999999E+00 0.29999999E-01 0.6700E+00 0.1540E+01
3 4 -2 1 9999 6 -0.60000002E+00 -0.88999999E+00 0.2900E+00 0.1540E+01
3 4 -2 1 9999 7 0.10000000E+00 -0.18000001E+00 0.2800E+00 0.1540E+01
3 4 -2 1 9999 8 -0.50000000E+00 -0.13099999E+01 0.8100E+00 0.1540E+01
3 2 -1 -2 9999 1 0.60000002E+00 0.31999999E+00 0.2800E+00 0.1540E+01
3 2 -1 -2 9999 2 -0.60000002E+00 -0.14100000E+01 0.8100E+00 0.1540E+01
3 2 -1 -2 9999 5 -0.23000002E+00 0.50000001E-01 -0.2800E+00 0.1540E+01
3 2 -1 -2 9999 6 -0.14100001E+01 -0.60000002E+00 -0.8100E+00 0.1540E+01
3 4 -1 -2 9999 1 0.60000002E+00 0.31999999E+00 0.2800E+00 0.1540E+01
3 4 -1 -2 9999 2 -0.60000002E+00 -0.14100000E+01 0.8100E+00 0.1540E+01
3 4 -1 -2 9999 5 0.69999999E+00 0.50000001E-01 0.6500E+00 0.1540E+01
3 4 -1 -2 9999 9 -0.10000000E+00 -0.76999998E+00 0.6700E+00 0.1540E+01
3 4 -1 -2 9999 10 -0.20000000E+00 -0.49000001E+00 0.2900E+00 0.1540E+01
3 4 -1 -2 9999 13 -0.12800000E+01 0.31999999E+00 -0.1600E+01 0.1540E+01
3 4 -1 -2 9999 14 -0.22600000E+01 -0.11600000E+01 -0.1100E+01 0.1540E+01
Test of subprogram number 4 CCOPY
----- PASS -----
Test of subprogram number 5 CSWAP
----- PASS -----
Test of subprogram number 6 SCNRM2
----- PASS -----
Test of subprogram number 7 SCASUM
----- PASS -----
Test of subprogram number 8 CSCAL
----- PASS -----
Test of subprogram number 9 CSSCAL
----- PASS -----
Test of subprogram number 10 ICAMAX
----- PASS -----
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./zblat1
cblat2.f:2050:0:
$ NARGS, NC, NS
Warning: 'nargs' may be used uninitialized in this function [-Wmaybe-uninitialized]
Complex BLAS Test Program Results
Test of subprogram number 1 ZDOTC
FAIL
CASE N INCX INCY MODE I COMP(I) TRUE(I) DIFFERENCE SIZE(I)
1 1 1 1 9999 1 0.00000000D+00 0.90000000D+00 -0.9000D+00 0.9000D+00
1 1 1 1 9999 2 0.00000000D+00 0.60000000D-01 -0.6000D-01 0.9000D+00
1 2 1 1 9999 1 0.10000000D-01 0.91000000D+00 -0.9000D+00 0.1630D+01
1 2 1 1 9999 2 -0.83000000D+00 -0.77000000D+00 -0.6000D-01 0.1730D+01
1 4 1 1 9999 1 0.90000000D+00 0.18000000D+01 -0.9000D+00 0.2900D+01
1 4 1 1 9999 2 -0.16000000D+00 -0.10000000D+00 -0.6000D-01 0.2780D+01
1 1 2 -2 9999 1 0.00000000D+00 0.90000000D+00 -0.9000D+00 0.9000D+00
1 1 2 -2 9999 2 0.00000000D+00 0.60000000D-01 -0.6000D-01 0.9000D+00
1 2 2 -2 9999 1 0.00000000D+00 0.14500000D+01 -0.1450D+01 0.1630D+01
1 2 2 -2 9999 2 0.00000000D+00 0.74000000D+00 -0.7400D+00 0.1730D+01
1 4 2 -2 9999 1 -0.20000000D+00 0.20000000D+00 -0.4000D+00 0.2900D+01
1 4 2 -2 9999 2 0.75000000D+00 0.90000000D+00 -0.1500D+00 0.2780D+01
1 1 -2 1 9999 1 0.00000000D+00 0.90000000D+00 -0.9000D+00 0.9000D+00
1 1 -2 1 9999 2 0.00000000D+00 0.60000000D-01 -0.6000D-01 0.9000D+00
1 2 -2 1 9999 1 0.00000000D+00 -0.55000000D+00 0.5500D+00 0.1630D+01
1 2 -2 1 9999 2 0.00000000D+00 0.23000000D+00 -0.2300D+00 0.1730D+01
1 4 -2 1 9999 1 0.10800000D+01 0.83000000D+00 0.2500D+00 0.2900D+01
1 4 -2 1 9999 2 -0.12000000D+00 -0.39000000D+00 0.2700D+00 0.2780D+01
1 1 -1 -2 9999 1 0.00000000D+00 0.90000000D+00 -0.9000D+00 0.9000D+00
1 1 -1 -2 9999 2 0.00000000D+00 0.60000000D-01 -0.6000D-01 0.9000D+00
1 2 -1 -2 9999 1 0.14000000D+00 0.10400000D+01 -0.9000D+00 0.1630D+01
1 2 -1 -2 9999 2 0.73000000D+00 0.79000000D+00 -0.6000D-01 0.1730D+01
1 4 -1 -2 9999 1 0.10500000D+01 0.19500000D+01 -0.9000D+00 0.2900D+01
1 4 -1 -2 9999 2 0.11600000D+01 0.12200000D+01 -0.6000D-01 0.2780D+01
Test of subprogram number 2 ZDOTU
----- PASS -----
Test of subprogram number 3 ZAXPY
----- PASS -----
Test of subprogram number 4 ZCOPY
----- PASS -----
Test of subprogram number 5 ZSWAP
----- PASS -----
Test of subprogram number 6 DZNRM2
----- PASS -----
Test of subprogram number 7 DZASUM
----- PASS -----
Test of subprogram number 8 ZSCAL
----- PASS -----
Test of subprogram number 9 ZDSCAL
----- PASS -----
Test of subprogram number 10 IZAMAX
----- PASS -----
OMP_NUM_THREADS=2 ./sblat1
Real BLAS Test Program Results
Test of subprogram number 1 SDOT
----- PASS -----
Test of subprogram number 2 SAXPY
----- PASS -----
Test of subprogram number 3 SROTG
----- PASS -----
Test of subprogram number 4 SROT
----- PASS -----
Test of subprogram number 5 SCOPY
----- PASS -----
Test of subprogram number 6 SSWAP
----- PASS -----
Test of subprogram number 7 SNRM2
----- PASS -----
Test of subprogram number 8 SASUM
----- PASS -----
Test of subprogram number 9 SSCAL
----- PASS -----
Test of subprogram number 10 ISAMAX
----- PASS -----
Test of subprogram number 11 SROTMG
----- PASS -----
Test of subprogram number 12 SROTM
----- PASS -----
Test of subprogram number 13 SDSDOT
----- PASS -----
OMP_NUM_THREADS=2 ./dblat1
Real BLAS Test Program Results
Test of subprogram number 1 DDOT
FAIL
CASE N INCX INCY I COMP(I) TRUE(I) DIFFERENCE SIZE(I)
1 1 1 1 1 0.00000000D+00 0.30000000D+00 -0.3000D+00 0.3000D+00
1 2 1 1 1 0.00000000D+00 0.21000000D+00 -0.2100D+00 0.1600D+01
1 4 1 1 1 0.56000000D+00 0.62000000D+00 -0.6000D-01 0.3200D+01
1 1 2 -2 1 0.00000000D+00 0.30000000D+00 -0.3000D+00 0.3000D+00
1 2 2 -2 1 0.00000000D+00 -0.70000000D-01 0.7000D-01 0.1600D+01
1 4 2 -2 1 0.57000000D+00 0.85000000D+00 -0.2800D+00 0.3200D+01
1 1 -2 1 1 0.00000000D+00 0.30000000D+00 -0.3000D+00 0.3000D+00
1 2 -2 1 1 0.00000000D+00 -0.79000000D+00 0.7900D+00 0.1600D+01
1 4 -2 1 1 -0.96000000D+00 -0.74000000D+00 -0.2200D+00 0.3200D+01
1 1 -1 -2 1 0.00000000D+00 0.30000000D+00 -0.3000D+00 0.3000D+00
1 2 -1 -2 1 0.30000000D-01 0.33000000D+00 -0.3000D+00 0.1600D+01
1 4 -1 -2 1 0.97000000D+00 0.12700000D+01 -0.3000D+00 0.3200D+01
Test of subprogram number 2 DAXPY
----- PASS -----
Test of subprogram number 3 DROTG
----- PASS -----
Test of subprogram number 4 DROT
----- PASS -----
Test of subprogram number 5 DCOPY
----- PASS -----
Test of subprogram number 6 DSWAP
----- PASS -----
Test of subprogram number 7 DNRM2
----- PASS -----
Test of subprogram number 8 DASUM
----- PASS -----
Test of subprogram number 9 DSCAL
----- PASS -----
Test of subprogram number 10 IDAMAX
----- PASS -----
Test of subprogram number 11 DROTMG
----- PASS -----
Test of subprogram number 12 DROTM
----- PASS -----
Test of subprogram number 13 DSDOT
----- PASS -----
OMP_NUM_THREADS=2 ./cblat1
dblat2.f:1724:0:
$ LDA, LDAS, LJ, LX, N, NARGS, NC, NS
Warning: 'nargs' may be used uninitialized in this function [-Wmaybe-uninitialized]
Complex BLAS Test Program Results
Test of subprogram number 1 CDOTC
FAIL
CASE N INCX INCY MODE I COMP(I) TRUE(I) DIFFERENCE SIZE(I)
1 1 1 1 9999 1 0.00000000E+00 0.89999998E+00 -0.9000E+00 0.9000E+00
1 1 1 1 9999 2 0.00000000E+00 0.59999999E-01 -0.6000E-01 0.9000E+00
1 2 1 1 9999 1 0.00000000E+00 0.91000003E+00 -0.9100E+00 0.1630E+01
1 2 1 1 9999 2 0.00000000E+00 -0.76999998E+00 0.7700E+00 0.1730E+01
1 4 1 1 9999 1 0.42000002E+00 0.18000000E+01 -0.1380E+01 0.2900E+01
1 4 1 1 9999 2 -0.19999996E-01 -0.10000000E+00 0.8000E-01 0.2780E+01
1 1 2 -2 9999 1 0.00000000E+00 0.89999998E+00 -0.9000E+00 0.9000E+00
1 1 2 -2 9999 2 0.00000000E+00 0.59999999E-01 -0.6000E-01 0.9000E+00
1 2 2 -2 9999 1 0.00000000E+00 0.14500000E+01 -0.1450E+01 0.1630E+01
1 2 2 -2 9999 2 0.00000000E+00 0.74000001E+00 -0.7400E+00 0.1730E+01
1 4 2 -2 9999 1 -0.38999999E+00 0.20000000E+00 -0.5900E+00 0.2900E+01
1 4 2 -2 9999 2 0.82000005E+00 0.89999998E+00 -0.8000E-01 0.2780E+01
1 1 -2 1 9999 1 0.00000000E+00 0.89999998E+00 -0.9000E+00 0.9000E+00
1 1 -2 1 9999 2 0.00000000E+00 0.59999999E-01 -0.6000E-01 0.9000E+00
1 2 -2 1 9999 1 0.00000000E+00 -0.55000001E+00 0.5500E+00 0.1630E+01
1 2 -2 1 9999 2 0.00000000E+00 0.23000000E+00 -0.2300E+00 0.1730E+01
1 4 -2 1 9999 1 0.00000000E+00 0.82999998E+00 -0.8300E+00 0.2900E+01
1 4 -2 1 9999 2 0.00000000E+00 -0.38999999E+00 0.3900E+00 0.2780E+01
1 1 -1 -2 9999 1 0.00000000E+00 0.89999998E+00 -0.9000E+00 0.9000E+00
1 1 -1 -2 9999 2 0.00000000E+00 0.59999999E-01 -0.6000E-01 0.9000E+00
1 2 -1 -2 9999 1 0.00000000E+00 0.10400000E+01 -0.1040E+01 0.1630E+01
1 2 -1 -2 9999 2 0.00000000E+00 0.79000002E+00 -0.7900E+00 0.1730E+01
1 4 -1 -2 9999 1 0.72000003E+00 0.19500000E+01 -0.1230E+01 0.2900E+01
1 4 -1 -2 9999 2 0.50000006E+00 0.12200000E+01 -0.7200E+00 0.2780E+01
Test of subprogram number 2 CDOTU
----- PASS -----
Test of subprogram number 3 CAXPY
FAIL
CASE N INCX INCY MODE I COMP(I) TRUE(I) DIFFERENCE SIZE(I)
3 2 1 1 9999 1 -0.33000001E+00 0.31999999E+00 -0.6500E+00 0.1540E+01
3 2 1 1 9999 3 -0.89999998E+00 -0.15500000E+01 0.6500E+00 0.1540E+01
3 4 1 1 9999 1 -0.14800000E+01 0.31999999E+00 -0.1800E+01 0.1540E+01
3 4 1 1 9999 2 -0.21600001E+01 -0.14100000E+01 -0.7500E+00 0.1540E+01
3 4 1 1 9999 3 -0.89999998E+00 -0.15500000E+01 0.6500E+00 0.1540E+01
3 4 1 1 9999 5 0.69999999E+00 0.29999999E-01 0.6700E+00 0.1540E+01
3 4 1 1 9999 6 -0.60000002E+00 -0.88999999E+00 0.2900E+00 0.1540E+01
3 4 1 1 9999 7 0.10000000E+00 -0.38000000E+00 0.4800E+00 0.1540E+01
3 4 1 1 9999 8 -0.50000000E+00 -0.95999998E+00 0.4600E+00 0.1540E+01
3 2 2 -2 9999 1 0.60000002E+00 -0.70000000E-01 0.6700E+00 0.1540E+01
3 2 2 -2 9999 2 -0.60000002E+00 -0.88999999E+00 0.2900E+00 0.1540E+01
3 2 2 -2 9999 5 -0.25000000E+00 0.41999999E+00 -0.6700E+00 0.1540E+01
3 2 2 -2 9999 6 -0.17000000E+01 -0.14100000E+01 -0.2900E+00 0.1540E+01
3 4 2 -2 9999 1 0.60000002E+00 0.77999997E+00 -0.1800E+00 0.1540E+01
3 4 2 -2 9999 2 -0.60000002E+00 0.59999999E-01 -0.6600E+00 0.1540E+01
3 4 2 -2 9999 5 0.69999999E+00 0.59999999E-01 0.6400E+00 0.1540E+01
3 4 2 -2 9999 6 -0.60000002E+00 -0.13000000E+00 -0.4700E+00 0.1540E+01
3 4 2 -2 9999 9 -0.10000000E+00 -0.76999998E+00 0.6700E+00 0.1540E+01
3 4 2 -2 9999 10 -0.20000000E+00 -0.49000001E+00 0.2900E+00 0.1540E+01
3 4 2 -2 9999 13 -0.60999995E+00 0.51999998E+00 -0.1130E+01 0.1540E+01
3 4 2 -2 9999 14 -0.66999990E+00 -0.15100000E+01 0.8400E+00 0.1540E+01
3 2 -2 1 9999 1 -0.34999996E+00 -0.70000000E-01 -0.2800E+00 0.1540E+01
3 2 -2 1 9999 2 -0.17000000E+01 -0.88999999E+00 -0.8100E+00 0.1540E+01
3 2 -2 1 9999 3 -0.89999998E+00 -0.11799999E+01 0.2800E+00 0.1540E+01
3 2 -2 1 9999 4 0.50000000E+00 -0.31000000E+00 0.8100E+00 0.1540E+01
3 4 -2 1 9999 1 -0.80999988E+00 0.77999997E+00 -0.1590E+01 0.1540E+01
3 4 -2 1 9999 2 -0.57000005E+00 0.59999999E-01 -0.6300E+00 0.1540E+01
3 4 -2 1 9999 3 -0.89999998E+00 -0.15400000E+01 0.6400E+00 0.1540E+01
3 4 -2 1 9999 4 0.50000000E+00 0.97000003E+00 -0.4700E+00 0.1540E+01
3 4 -2 1 9999 5 0.69999999E+00 0.29999999E-01 0.6700E+00 0.1540E+01
3 4 -2 1 9999 6 -0.60000002E+00 -0.88999999E+00 0.2900E+00 0.1540E+01
3 4 -2 1 9999 7 0.10000000E+00 -0.18000001E+00 0.2800E+00 0.1540E+01
3 4 -2 1 9999 8 -0.50000000E+00 -0.13099999E+01 0.8100E+00 0.1540E+01
3 2 -1 -2 9999 1 0.60000002E+00 0.31999999E+00 0.2800E+00 0.1540E+01
3 2 -1 -2 9999 2 -0.60000002E+00 -0.14100000E+01 0.8100E+00 0.1540E+01
3 2 -1 -2 9999 5 -0.23000002E+00 0.50000001E-01 -0.2800E+00 0.1540E+01
3 2 -1 -2 9999 6 -0.14100001E+01 -0.60000002E+00 -0.8100E+00 0.1540E+01
3 4 -1 -2 9999 1 0.60000002E+00 0.31999999E+00 0.2800E+00 0.1540E+01
3 4 -1 -2 9999 2 -0.60000002E+00 -0.14100000E+01 0.8100E+00 0.1540E+01
3 4 -1 -2 9999 5 0.69999999E+00 0.50000001E-01 0.6500E+00 0.1540E+01
3 4 -1 -2 9999 9 -0.10000000E+00 -0.76999998E+00 0.6700E+00 0.1540E+01
3 4 -1 -2 9999 10 -0.20000000E+00 -0.49000001E+00 0.2900E+00 0.1540E+01
3 4 -1 -2 9999 13 -0.12800000E+01 0.31999999E+00 -0.1600E+01 0.1540E+01
3 4 -1 -2 9999 14 -0.22600000E+01 -0.11600000E+01 -0.1100E+01 0.1540E+01
Test of subprogram number 4 CCOPY
----- PASS -----
Test of subprogram number 5 CSWAP
----- PASS -----
Test of subprogram number 6 SCNRM2
----- PASS -----
Test of subprogram number 7 SCASUM
----- PASS -----
Test of subprogram number 8 CSCAL
----- PASS -----
Test of subprogram number 9 CSSCAL
----- PASS -----
Test of subprogram number 10 ICAMAX
----- PASS -----
OMP_NUM_THREADS=2 ./zblat1
Complex BLAS Test Program Results
Test of subprogram number 1 ZDOTC
FAIL
CASE N INCX INCY MODE I COMP(I) TRUE(I) DIFFERENCE SIZE(I)
1 1 1 1 9999 1 0.00000000D+00 0.90000000D+00 -0.9000D+00 0.9000D+00
1 1 1 1 9999 2 0.00000000D+00 0.60000000D-01 -0.6000D-01 0.9000D+00
1 2 1 1 9999 1 0.10000000D-01 0.91000000D+00 -0.9000D+00 0.1630D+01
1 2 1 1 9999 2 -0.83000000D+00 -0.77000000D+00 -0.6000D-01 0.1730D+01
1 4 1 1 9999 1 0.90000000D+00 0.18000000D+01 -0.9000D+00 0.2900D+01
1 4 1 1 9999 2 -0.16000000D+00 -0.10000000D+00 -0.6000D-01 0.2780D+01
1 1 2 -2 9999 1 0.00000000D+00 0.90000000D+00 -0.9000D+00 0.9000D+00
1 1 2 -2 9999 2 0.00000000D+00 0.60000000D-01 -0.6000D-01 0.9000D+00
1 2 2 -2 9999 1 0.00000000D+00 0.14500000D+01 -0.1450D+01 0.1630D+01
1 2 2 -2 9999 2 0.00000000D+00 0.74000000D+00 -0.7400D+00 0.1730D+01
1 4 2 -2 9999 1 -0.20000000D+00 0.20000000D+00 -0.4000D+00 0.2900D+01
1 4 2 -2 9999 2 0.75000000D+00 0.90000000D+00 -0.1500D+00 0.2780D+01
1 1 -2 1 9999 1 0.00000000D+00 0.90000000D+00 -0.9000D+00 0.9000D+00
1 1 -2 1 9999 2 0.00000000D+00 0.60000000D-01 -0.6000D-01 0.9000D+00
1 2 -2 1 9999 1 0.00000000D+00 -0.55000000D+00 0.5500D+00 0.1630D+01
1 2 -2 1 9999 2 0.00000000D+00 0.23000000D+00 -0.2300D+00 0.1730D+01
1 4 -2 1 9999 1 0.10800000D+01 0.83000000D+00 0.2500D+00 0.2900D+01
1 4 -2 1 9999 2 -0.12000000D+00 -0.39000000D+00 0.2700D+00 0.2780D+01
1 1 -1 -2 9999 1 0.00000000D+00 0.90000000D+00 -0.9000D+00 0.9000D+00
1 1 -1 -2 9999 2 0.00000000D+00 0.60000000D-01 -0.6000D-01 0.9000D+00
1 2 -1 -2 9999 1 0.14000000D+00 0.10400000D+01 -0.9000D+00 0.1630D+01
1 2 -1 -2 9999 2 0.73000000D+00 0.79000000D+00 -0.6000D-01 0.1730D+01
1 4 -1 -2 9999 1 0.10500000D+01 0.19500000D+01 -0.9000D+00 0.2900D+01
1 4 -1 -2 9999 2 0.11600000D+01 0.12200000D+01 -0.6000D-01 0.2780D+01
Test of subprogram number 2 ZDOTU
----- PASS -----
Test of subprogram number 3 ZAXPY
----- PASS -----
Test of subprogram number 4 ZCOPY
----- PASS -----
Test of subprogram number 5 ZSWAP
----- PASS -----
Test of subprogram number 6 DZNRM2
----- PASS -----
Test of subprogram number 7 DZASUM
----- PASS -----
Test of subprogram number 8 ZSCAL
----- PASS -----
Test of subprogram number 9 ZDSCAL
----- PASS -----
Test of subprogram number 10 IZAMAX
----- PASS -----
Note that I only patched those fragments of assembly sources that reported bad relocations.
Should I just patch them all ('s/ifdef linux/if defined(linux) || defined(FreeBSD)/g' kernel/power/*.S)?
It probably makes sense to try patching them all. gemv_n.S, where you are currently getting the segmentation fault, has a second "ifdef linux" section at line 255; did you already change both of them? This may still not fix every problem, e.g. the wrong result from DDOT (assuming you are building for POWER8, where neither ddot.c nor ddot_microk_power.c has any OS-specific definitions beyond what gets imported from common_power.h).
After doing that, I get:
../kernel/power/gemv_n.S: Assembler messages:
../kernel/power/gemv_n.S:260: Error: junk at end of line: `+280(1)'
../kernel/power/gemv_n.S:261: Error: junk at end of line: `+280(1)'
../kernel/power/gemv_n.S:262: Error: junk at end of line: `+280(1)'
../kernel/power/gemv_t.S: Assembler messages:
../kernel/power/gemv_t.S:268: Error: junk at end of line: `+288(1)'
../kernel/power/gemv_t.S:269: Error: junk at end of line: `+288(1)'
../kernel/power/gemv_t.S:270: Error: junk at end of line: `+288(1)'
Something looks wrong in https://github.com/xianyi/OpenBLAS/blob/develop/kernel/power/strmm_kernel_16x8_power8.S, lines 85 and 86. Both contain macros defining STACKSIZE to different values.
The new error in gemv_n.S looks as if it is evaluating the (SP) as a literal "1", which seems weird. Similar expressions are used on all supported platforms, so I have no idea what to make of this error unless there is some hard-to-spot typo or special character there. The two STACKSIZE definitions in strmm_kernel_16x8_power8.S come from my fumbling to increase the stack to make room for saving the vector registers; I left the old value in to sort of document the change in case it turns out to be incorrect.
OK, then, something else.
Should 64BIT be defined? It's not defined by default, should I define it manually in CFLAGS?
EDIT: I found it defined in config.h, never mind.
@pkubaj no, the 64-bit CPU is detected automatically. You do not want the int64 interface (INTERFACE64=1 in Makefile.rule or on the command line): it makes the library incompatible with Netlib LAPACK. What it buys you is that a single argument dimension may exceed the signed 32-bit limit (about 2 billion elements, or 16GB of doubles), at the cost of recompiling all client code.
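To put rough numbers on the INTERFACE64 remark, here is a small sketch of the arithmetic. It assumes the default build indexes arrays with a signed 32-bit integer; the helper name `bytes_at_limit` is made up for illustration and is not part of OpenBLAS.

```python
# Largest value of a signed 32-bit integer; assumed here to be the index
# type used by the default (non-INTERFACE64) build.
INT32_MAX = 2**31 - 1


def bytes_at_limit(max_index: int, element_size: int) -> int:
    """Bytes occupied by a vector whose length equals max_index."""
    return max_index * element_size


# Element sizes in bytes for single and double precision reals.
for name, size in [("float", 4), ("double", 8)]:
    gib = bytes_at_limit(INT32_MAX, size) / 2**30
    print(f"{name}: {INT32_MAX} elements is about {gib:.1f} GiB")
```

Unless a single vector or matrix dimension needs to go beyond that, the default interface should be enough.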
I'd like to revive this issue.
I got openblas (0.3.6) to build with the following patches:
sed -e 's/defined(linux)/(defined(linux) || defined(__FreeBSD__))/g' -e 's/ifdef linux/if defined(linux) || defined(__FreeBSD__)/g' kernel/power/*.S
--- common_power.h.orig 2019-06-24 17:16:36 UTC
+++ common_power.h
@@ -499,7 +499,7 @@ static inline int blas_quickdivide(blasint x, blasint
#if defined(ASSEMBLER) && !defined(NEEDPARAM)
-#ifdef OS_LINUX
+#if defined(OS_LINUX) || defined(OS_FREEBSD)
#ifndef __64BIT__
#define PROLOGUE \
.section .text;\
@@ -784,7 +784,7 @@ Lmcount$lazy_ptr:
#define HALT mfspr r0, 1023
-#ifdef OS_LINUX
+#if defined(OS_LINUX) || defined(OS_FREEBSD)
#if defined(PPC440) || defined(PPC440FP2)
#undef MAX_CPU_NUMBER
#define MAX_CPU_NUMBER 1
@@ -829,7 +829,7 @@ Lmcount$lazy_ptr:
#define MAP_ANONYMOUS MAP_ANON
#endif
-#ifdef OS_LINUX
+#if defined(OS_LINUX) || defined(OS_FREEBSD)
#ifndef __64BIT__
#define FRAMESLOT(X) (((X) * 4) + 8)
#else
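For anyone who wants to preview what the two sed expressions above do before touching kernel/power/*.S, here is a minimal Python sketch of the same substitutions. The sample lines are hypothetical and not copied from the assembly sources, and `patch_line` is just an illustrative helper.

```python
import re

# The two substitutions from the sed command, in the same order.
# sed treats the parentheses literally, so they are escaped here.
SUBSTITUTIONS = [
    (r"defined\(linux\)", "(defined(linux) || defined(__FreeBSD__))"),
    (r"ifdef linux", "if defined(linux) || defined(__FreeBSD__)"),
]


def patch_line(line: str) -> str:
    """Apply both substitutions to one source line, in order."""
    for pattern, replacement in SUBSTITUTIONS:
        line = re.sub(pattern, replacement, line)
    return line


# Hypothetical sample lines, for illustration only.
samples = [
    "#ifdef linux",
    "#if defined(linux) && defined(__64BIT__)",
    "#ifdef __64BIT__",
]

for original in samples:
    print(f"{original!r} -> {patch_line(original)!r}")
```

The order matters: the `ifdef linux` rule produces a fresh `defined(linux)`, so running the substitutions a second time over already patched files would wrap that text again.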
Now, the problem is that the sblat2 and sblat3 tests freeze. I left them running overnight, just in case; the machine wasn't running anything else heavy besides those tests.
root@talos:~ # ps auxwwf | head -n 1 ; ps auxwwf | grep sblat
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 28586 100.0 0.0 81036 5788 1 R+ 23:29 598:35.74 ./sblat2
root 32971 100.0 0.0 81664 6288 1 R+ 23:29 598:35.23 ./sblat3
Is it possible that those tests are incorrect?
Could there be an endianness issue? FreeBSD on ppc64 is big-endian (even on POWER8 and 9), while a lot of software nowadays expects ppc64le (little-endian).
Ah yes, POWER8 kernels are currently ppc64le-only (as found out in #1997). Building for TARGET=POWER6 will probably work with your patch.
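A quick way to confirm which byte order a given host uses, sketched in Python with only the standard library. The function names are illustrative; on big-endian FreeBSD/powerpc64 the first line printed should read "big".

```python
import struct
import sys


def host_byte_order() -> str:
    """Return 'big' or 'little' for the machine running this script."""
    return sys.byteorder


def first_byte_of_one(order: str) -> int:
    """First byte in memory of the 32-bit integer 1 under the given order."""
    fmt = ">I" if order == "big" else "<I"
    return struct.pack(fmt, 1)[0]


print("host byte order:", host_byte_order())
print("big-endian layout, first byte of 1:   ", first_byte_of_one("big"))
print("little-endian layout, first byte of 1:", first_byte_of_one("little"))
```

The two layouts differ in where the low-order byte sits, which is the kind of assumption a kernel written only for ppc64le can bake in.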
Is POWER6 necessary? I found it also compiles with POWER7.
POWER7 is mapped to POWER6 internally.
Thanks.
Since it builds and all tests pass, I assume this patch is OK. Can you commit it straight away (together with this sed), or do you require a pull request?
A PR would be easier to apply, but I can generate one from your information if it is too much hassle for you.
FreeBSD currently can't build OpenBLAS due to the following error:
I can see that START_ADDRESS is already defined for Linux and AIX. What is this value used for? Do you know what it should be set to for FreeBSD?