Closed malaterre closed 3 weeks ago
For reference, [zandonai](https://buildd.debian.org/status/architecture.php?a=s390x&suite=sid&buildd=buildd-zandonai) reports that Z14 is supported:
[113/184] : && /usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -DHWY_BROKEN_EMU128=0 -Wdate-time -D_FORTIFY_SOURCE=2 -flto=auto -ffat-lto-objects -Wl,-z,relro -Wl,-z,now -fPIE -pie CMakeFiles/hwy_list_targets.dir/hwy/tests/list_targets.cc.o -o hwy_list_targets -Wl,-rpath,/<<PKGBUILDDIR>>/obj-s390x-linux-gnu libhwy.so.1.2.0 && cd /<<PKGBUILDDIR>>/obj-s390x-linux-gnu && /<<PKGBUILDDIR>>/obj-s390x-linux-gnu/hwy_list_targets || ( exit 0 )
Config: emu128:0 scalar:0 static:0 all_attain:0 is_test:0
Compiled HWY_TARGETS: EMU128
HWY_ATTAINABLE_TARGETS: EMU128
HWY_BASELINE_TARGETS: EMU128
HWY_STATIC_TARGET: EMU128
HWY_BROKEN_TARGETS:
HWY_DISABLED_TARGETS:
Current CPU supports: Z15 Z14 EMU128 SCALAR
I started the first build of hwy on s390x/Z14. The build fails with:
FAILED: CMakeFiles/copy_test.dir/hwy/contrib/algo/copy_test.cc.o
/usr/bin/c++ -DHWY_SHARED_DEFINE -DTOOLCHAIN_MISS_ASM_HWCAP_H -I"/<>" -g -O2 -ffile-prefix-map=/<>=. -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -DHWY_BROKEN_EMU128=0 -march=z15 -mzvector -Wdate-time -D_FORTIFY_SOURCE=2 -O3 -DNDEBUG -std=c++17 -fPIE -fvisibility=hidden -fvisibility-inlines-hidden -Wno-builtin-macro-redefined -DDATE="redacted" -DTIMESTAMP="redacted" -DTIME="redacted" -fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla -Wnon-virtual-dtor -Wcast-align -fmath-errno -fno-exceptions -Wno-psabi -Werror -DHWY_IS_TEST=1 -DGTEST_HAS_PTHREAD=1 -MD -MT CMakeFiles/copy_test.dir/hwy/contrib/algo/copy_test.cc.o -MF CMakeFiles/copy_test.dir/hwy/contrib/algo/copy_test.cc.o.d -o CMakeFiles/copy_test.dir/hwy/contrib/algo/copy_test.cc.o -c '/<>/hwy/contrib/algo/copy_test.cc'
In file included from /<>/hwy/aligned_allocator.h:32,
                 from /<>/hwy/contrib/algo/copy_test.cc:18:
/<>/hwy/base.h: In function ‘hwy::N_Z14::Load<hwy::N_Z14::Simd<unsigned short, 1ul, 0>, (void)0, unsigned short>(hwy::N_Z14::Simd<unsigned short, 1ul, 0>, unsigned short const)decltype (Zero((hwy::N_Z14::Simd<unsigned short, 1ul, 0>)()))’:
/<>/hwy/base.h:336:14: error: inlining failed in call to ‘always_inline’ ‘hwy::CopyBytes<2ul, unsigned short, unsigned short>(unsigned short const, unsigned short)void’: target specific option mismatch
  336 | HWY_API void CopyBytes(const From HWY_RESTRICT from, To HWY_RESTRICT to) {
      | ^~~
In file included from /<>/hwy/highway.h:586,
                 from /<>/hwy/contrib/algo/copy_test.cc:24,
                 from /<>/hwy/foreach_target.h:290,
                 from /<>/hwy/contrib/algo/copy_test.cc:23:
/<>/hwy/ops/ppc_vsx-inl.h:697:26: note: called from here
  697 | CopyBytes<d.MaxBytes()>(p, &bits);
      | ~~~~~^~~~~~
The above compiler error is caused by the `-march=z15` option, which assumes that you are targeting a z15 or z16 mainframe.
It is possible to work around the above compiler error by disabling the HWY_Z14 target if you only need to support z15 or later.
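The workaround of disabling the HWY_Z14 target can be pictured as masking one bit out of a set of targets. The sketch below is a toy model, not Highway's code: the bit values and the helper names `EnabledTargets` and `TargetNames` are made up for illustration, and the only assumption carried over from the library is that targets are combined as bit flags, as the `HWY_DISABLED_TARGETS` line in the `hwy_list_targets` output suggests.

```cpp
#include <cstdint>
#include <string>

// Hypothetical bit values, chosen for illustration only. The real library
// assigns its own values to each target.
constexpr uint64_t kZ15 = 1u << 0;
constexpr uint64_t kZ14 = 1u << 1;
constexpr uint64_t kEmu128 = 1u << 2;
constexpr uint64_t kScalar = 1u << 3;

// Removes every disabled target from the set of attainable targets.
constexpr uint64_t EnabledTargets(uint64_t attainable, uint64_t disabled) {
  return attainable & ~disabled;
}

// Formats a target set as space-separated names, best target first.
std::string TargetNames(uint64_t targets) {
  std::string names;
  const auto append = [&](uint64_t bit, const char* name) {
    if ((targets & bit) == 0) return;
    if (!names.empty()) names += ' ';
    names += name;
  };
  append(kZ15, "Z15");
  append(kZ14, "Z14");
  append(kEmu128, "EMU128");
  append(kScalar, "SCALAR");
  return names;
}
```

With these made-up values, `TargetNames(EnabledTargets(kZ15 | kZ14 | kEmu128, kZ14))` yields `Z15 EMU128`: once Z14 is masked out, no Z14 code is generated, so nothing compiled for Z14 has to inline helpers built with `-march=z15`.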
@johnplatts I can build and run PPC9 & PPC8 hwy on ppc64el.
What compiler options should I use to build and run Z15 & Z14 hwy on s390x ?
To compile for Z14 or later, use the `-march=z14 -mzvector` compiler options.
I had to read that sentence twice. Anyway that did the trick, I see the Z15 tests:
Kudos for the work, all tests are passing!
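For anyone repeating this, one way to check that `-march=z14 -mzvector` actually reached the compiler is to inspect its predefined macros. The sketch below rests on assumptions about GCC on s390x that are not stated in this thread: that `__ARCH__` holds the architecture level (12 for z14, 13 for z15) and that `__VEC__` is defined when `-mzvector` is in effect. The helper names are hypothetical, and on any other platform the functions simply report that nothing was detected.

```cpp
#include <string>

// Architecture level the compiler targets, or 0 if unknown.
// Assumption: GCC on s390x defines __ARCH__ (12 = z14, 13 = z15).
constexpr int TargetArchLevel() {
#if defined(__s390x__) && defined(__ARCH__)
  return __ARCH__;
#else
  return 0;
#endif
}

// True if the z/Architecture vector language extension appears enabled.
// Assumption: __VEC__ is defined under -mzvector. The __s390x__ guard
// matters because other platforms may define a macro of the same name.
constexpr bool ZVectorEnabled() {
#if defined(__s390x__) && defined(__VEC__)
  return true;
#else
  return false;
#endif
}

// True if the flags look sufficient for a Z14 build (level 12 or later).
constexpr bool SupportsZ14Target() {
  return TargetArchLevel() >= 12 && ZVectorEnabled();
}

// One-line summary suitable for printing from a build check.
std::string DescribeTarget() {
  return "arch level " + std::to_string(TargetArchLevel()) + ", zvector " +
         (ZVectorEnabled() ? "on" : "off");
}
```

Printing `DescribeTarget()` from a small program built with the same flags as the library gives a quick sanity check before running the full test suite.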
@johnplatts
One note though, could you confirm this:
obj-*/examples/hwy_benchmark
Measurement failed: overhead 10 < 12
MeasureClosure failed.
F(x)->2*x^2, F(3) = 18.0
------------------------ Z15
dot: 3456: 0.383 (+/- 0.001)
delta: 3456: 0.775 (+/- 0.000)
F(x)->2*x^2, F(3) = 18.0
------------------------ Z14
dot: 3456: 0.088 (+/- 0.001)
No worries, MeasureClosure can spuriously 'fail'. It just indicates various sources of noise were too large. For example, it could be that the thread migrated to a different core and thus the timer went off a bit. It's fine to ignore that.
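To illustrate why such a measurement can fail without anything being wrong, here is a minimal sketch of sampling a timer's own overhead and rejecting noisy runs. It is not how `MeasureClosure` is implemented: the function names, the use of `std::chrono::steady_clock`, and the spread threshold are all assumptions made for this example.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Median of the samples (upper median for even counts). It is less
// sensitive to outliers than the mean. Returns 0 for an empty input.
uint64_t Median(std::vector<uint64_t> samples) {
  if (samples.empty()) return 0;
  const auto mid = static_cast<std::ptrdiff_t>(samples.size() / 2);
  std::nth_element(samples.begin(), samples.begin() + mid, samples.end());
  return samples[static_cast<size_t>(mid)];
}

// Times back-to-back clock reads to estimate the timer overhead in ns.
std::vector<uint64_t> SampleTimerOverheadNs(size_t num_samples) {
  using Clock = std::chrono::steady_clock;
  std::vector<uint64_t> samples;
  samples.reserve(num_samples);
  for (size_t i = 0; i < num_samples; ++i) {
    const auto t0 = Clock::now();
    const auto t1 = Clock::now();
    const auto ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0);
    samples.push_back(static_cast<uint64_t>(ns.count()));
  }
  return samples;
}

// Hypothetical sanity check: report noise when the gap between the
// fastest and slowest sample exceeds max_spread.
bool LooksNoisy(const std::vector<uint64_t>& samples, uint64_t max_spread) {
  if (samples.empty()) return true;
  const auto range = std::minmax_element(samples.begin(), samples.end());
  return (*range.second - *range.first) > max_spread;
}
```

A thread migration or an interrupt between the two clock reads shows up as a single large sample, which is enough to trip a check like `LooksNoisy` even though the code under test is fine.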