update June 2021: hoping for access to Deucalion's A64FX partition when ready, to see if the current draft implementation is sufficiently generic or not
update June 2024: access to Deucalion's A64FX partition confirms that the current draft implementation is not sufficiently generic
for now, the external modules metadata file would simply contain a [lang/tcsds-1.2.31] section with name = lang, version = tcsds-1.2.31 and prefix = FJSVXTCLANGA
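A minimal sketch of how such a metadata entry could be read, using only the Python standard library rather than EasyBuild's own parser. The `resolve_prefix` helper and the install path are hypothetical, and resolving `prefix` through an environment variable of that name is an assumption about the intended behaviour.

```python
import configparser

# The metadata section quoted in the note above.
METADATA = """\
[lang/tcsds-1.2.31]
name = lang
version = tcsds-1.2.31
prefix = FJSVXTCLANGA
"""


def resolve_prefix(section, environ):
    """Return the install prefix for one metadata section.

    Assumption: if 'prefix' names a variable present in `environ`, that
    variable's value is used; otherwise the raw value is returned as-is.
    """
    prefix = section["prefix"]
    return environ.get(prefix, prefix)


parser = configparser.ConfigParser()
parser.read_string(METADATA)
entry = parser["lang/tcsds-1.2.31"]

# Hypothetical environment, standing in for what the lang module would set.
fake_env = {"FJSVXTCLANGA": "/opt/FJSVxtclanga/tcsds-1.2.31"}
print(entry["name"], entry["version"], resolve_prefix(entry, fake_env))
```

On systems where the module does not set the variable, the same lookup simply falls through to the literal value, which is one argument for keeping the path in the metadata file.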
[ ] -SSL2* and -SCALAPACK should be used only when linking, but EasyBuild prepends -L to the LDFLAGS variables, so they are currently in the compile flags (not a problem, but this generates warnings that pollute the logs) (https://github.com/easybuilders/easybuild-framework/issues/3700)
[ ] also, right now the -SSL2* flags are being duplicated, probably because they are set by both _set_blas_variables and _set_lapack_variables; should be easy to fix
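One possible shape for the duplicate-flag fix, as a sketch only: `dedupe_flags` is a hypothetical helper, not an existing framework function, and the flag string is illustrative. It keeps the first occurrence of each flag and preserves order.

```python
def dedupe_flags(flags):
    """Drop repeated flags from a space-separated string, keeping first occurrences."""
    seen = set()
    unique = []
    for flag in flags.split():
        if flag not in seen:
            seen.add(flag)
            unique.append(flag)
    return " ".join(unique)


# Illustrative value: the same flag contributed once by the BLAS setup
# and once by the LAPACK setup.
print(dedupe_flags("-SSL2 -SSL2 -SCALAPACK"))
```

This is only safe for flags whose repetition carries no meaning; ordinary -l library lists are sometimes repeated on purpose to resolve link order, so the helper should not be applied to those blindly.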
[x] Rust: fails with thread 'main' panicked at 'couldn't find required command: "far"', src/bootstrap/sanity.rs:60:13; problem seems to be when "finding compilers"
[x] cc_detect tries to infer ar command name from fcc and comes up with far, but only if AR environment variable is not set, so setting it in prebuildopts
[x] fails much later with clang-7: error: unable to execute command: Killed; clang-7: error: clang frontend command failed due to signal...
this happens when Rust is building its own LLVM
(which is not honouring EB's parallel; needs prebuildopts += "export LLVM_PARALLEL_COMPILE_JOBS=%(parallel)s && " and Ninja, which can be added as builddep if it is modified to use python -bare as builddep)
[x] - make Rust use EB LLVM 12, after moving LLVM's python builddep to -bare
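The "far" failure can be reproduced with a simplified Python rendering of the suffix heuristic described above. This is an approximation for illustration, not the actual Rust bootstrap code, and `infer_ar` is a hypothetical name. The second half shows the two prebuildopts pieces together; the `export AR=ar` value is an assumption, since the notes only say that AR is set in prebuildopts.

```python
import os


def infer_ar(cc_path, environ):
    """Guess the archiver name from the compiler name unless AR is set."""
    if environ.get("AR"):
        return environ["AR"]
    directory, name = os.path.split(cc_path)
    # Replace a trailing compiler-like suffix with "ar": gcc -> ar, fcc -> far.
    for suffix in ("gcc", "cc", "clang"):
        idx = name.rfind(suffix)
        if idx != -1:
            return os.path.join(directory, name[:idx] + "ar")
    return os.path.join(directory, "ar")


print(infer_ar("fcc", {}))             # the bogus archiver name
print(infer_ar("fcc", {"AR": "ar"}))   # explicit AR wins

# Easyconfig-style fragment; %(parallel)s is resolved by EasyBuild at build time,
# emulated here with plain string formatting.
prebuildopts = "export AR=ar && "  # assumed value
prebuildopts += "export LLVM_PARALLEL_COMPILE_JOBS=%(parallel)s && "
print(prebuildopts % {"parallel": 8})
```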
Python sometimes (?) fails building cryptography with error: cargo failed with code: -11
in this particular version one can use CRYPTOGRAPHY_DONT_BUILD_RUST=1 if necessary...
[x] SciPy-bundle
[x] numpy
compiler detection fails because of warning message about Fugaku's large page allocation support
this goes back to using zlib shared libraries, which requires PIC. But the warning doesn't show up when using the OS zlib, filtered out?
[x] patch lapack and blas detection to support SSL2
[x] patch f2py tests to use --fcompiler
[x] fatal error in test_cffi, extending using cffi leads to fatal error that crashes the test
[x] scipy
[x] marking some tests as xfail
[x] h5py: PR upcoming
[ ] ELPA
"The 'OPTIONAL' attribute must not be specified for the dummy argument 'success' of a procedure that has the procedure language binding specifier", unless --disable-Fortran2008-features, but the Fujitsu Compiler is supposed to support it...
new configure opt --enable-FUGAKU in 2021.005.001, also --enable-sve-512
Questions about Fujitsu ecosystem
i.e. how the environment will change in the future and if/how it differs across systems
Fugaku
universality of the lang/tcsds modules: are these specific to Fugaku or generic to other Fujitsu a64fx systems?
at Fugaku, we are using the lang module name (and one of the environment variables it sets, FJSVXTCLANGA, although this could be moved to the external module metadata file, using it to set prefix and then using get_software_root instead), in the toolchain definitions in framework, and as an external module dependency in the FCC easyconfig
response: "language environment is Fugaku specific, it cannot be used in other Fujitsu machines". So it does seem this is a "Fugaku" toolchain, not a "Fujitsu" toolchain
permanence of the lang/tcsds modules: will they always be available?
at Fugaku, old modules that were only present in compute nodes have been removed; not sure if the ones that were also published in login nodes (as is the case of the tcsds-1.2.31 version that we are using for 4.5.0/21.05) ever will be
response: "The language environment (...) is retained for three versions including the latest version". Suggested that older versions are archived instead of deleted, i.e. not immediately visible but still available after some extra step, e.g. module use .... Otherwise, we'll need to remove the version pinning and revert to FCC-21.05 instead of FCC-4.5.0, as a more recent module will change the compiler version...
Isambard
environment module is fujitsu-compiler/4.3.1 (after a module use), this needs to be changed in the FCC easyconfig and in the toolchain definition...
the module doesn't set FJSVXTCLANGA, so it needs to be set manually
this path is actually all that's needed, so since the environments differ, maybe we should simply rely on a single environment variable?
large page allocation doesn't seem to be enabled/supported ("libmpg BUG!! ... Assertion '0' failed."), so we are setting -Knolargepage, but again, we need a way of always injecting this without breaking scripts that expect CC to be only the executable...
the fujitsu-compiler module adds the top level include folder to C_INCLUDE_PATH and CPLUS_INCLUDE_PATH, but that breaks -Nclang mode, the wrong headers are included (in particular arm-sve.h)
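Relying on a single environment variable could look roughly like the sketch below. `find_fujitsu_root` is a hypothetical helper, and the fallback assumes that fcc lives in `<root>/bin`, which matches the module layouts described in these notes but is not guaranteed elsewhere.

```python
import os
import shutil


def find_fujitsu_root(environ=None, var="FJSVXTCLANGA", which=shutil.which):
    """Return the compiler root from `var`, else derive it from the fcc location."""
    environ = os.environ if environ is None else environ
    root = environ.get(var)
    if root:
        return root
    fcc = which("fcc")
    if fcc is None:
        return None
    # Assumed layout: <root>/bin/fcc
    return os.path.dirname(os.path.dirname(fcc))


# A module that sets the variable (as on Fugaku): the variable is used directly.
print(find_fujitsu_root({"FJSVXTCLANGA": "/path/to/lang"}))

# A module that only extends PATH: the root is derived from where fcc is found.
# The lookup is stubbed so the example runs without the compiler installed.
print(find_fujitsu_root({}, which=lambda name: "/opt/fujitsu/bin/fcc"))
```

Setting the variable manually in the external module metadata or in a site hook would then be the only per-system step.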
Update June 2024:
Deucalion
environment module is FJSVstclanga/1.0.21.02a (which simply adds /opt/FJSVstclanga/cp-1.0.21.02a/{bin,lib64,man} to $PATH, $LD_LIBRARY_PATH and $MANPATH, plus UCX_RNDV_THRESH=64k)
the module also doesn't set FJSVXTCLANGA, same as Isambard, so only the Fugaku module sets it
the root path is actually all that's needed, so since the environments differ, maybe we should simply rely on a single environment variable?
large page allocation (libmpg) is enabled, same as Fugaku, different from Isambard, so -Klargepage, the default, can be used
numpy (<1.26) + ssl2 works very well, including multi-threaded, as long as python itself is built with the fujitsu compiler and linked with fjomplib
using gcccore as a subtoolchain instead of building everything from fcc from scratch also works
currently exploring trade-offs between "bottom-up" approach (build everything with fcc, better performance everywhere, but in most cases not by a lot, and a lot more work supporting new versions) vs "top-down" approach (re-use gcccore (eventually from EESSI?) and only rebuild what really benefits from the fujitsu compiler and libraries)
possibility of adapting FlexiBLAS to support SSL2, so that even gofbf doesn't need to be rebuilt? (FFTW can easily be overridden with fujitsu's fork)
OpenMPI vs Fujitsu MPI may not be very relevant at Deucalion, since it has regular Infiniband, not TofuD like Fugaku
for things that benefit from multithreaded SSL2 called from Python (e.g. numpy/scipy/etc.), one might as well use the "bottom-up" approach, since Python itself is pretty far "down"
but for everything else, the "top-down" approach is currently looking more promising
Following up from https://github.com/easybuilders/easybuild/issues/701
Framework
which to find it? (draft at https://github.com/migueldiascosta/easybuild-framework/commit/b21987eac273cfd60c5084d20a797916f4dd0f18)
etc/fujitsu_external_modules_metadata.cfg?
Easyblocks
Easyconfigs
21.05 in FCC and ffmpi easyconfigs instead of 4.5.0, since it seems we won't be able to pin the compiler version?
"hidden symbol `__fixunstfsi' in /usr/lib/gcc/aarch64-redhat-linux/8/libgcc.a(fixunstfsi.o) is referenced by DSO"
--rtlib=compiler-rt to $LDFLAGS in M4 because of https://bugs.llvm.org/show_bug.cgi?id=16404, we need to do the same here (this will likely pop up again...)
--rtlib=compiler-rt -lgcc_s, because we still need other symbols from libgcc (e.g. unwind)
-bare python dependency to avoid Rust, which itself builds LLVM
toolchainopts = {'cstd': 'gnu++11'}: fcc only accepts gnuXX, FCC only accepts gnu++XX, need to parse CFLAGS accordingly in the framework toolchain definitions: https://github.com/easybuilders/easybuild-framework/pull/3731
CC=cc, needs CC="$CC" buildopt (added to all other UnZip easyconfigs in https://github.com/easybuilders/easybuild-easyconfigs/pull/12887)
fortranpythonpackage easyblock to pass --fcompiler=fujitsu: https://github.com/easybuilders/easybuild-easyblocks/pull/2434
berkeleygw easyblock: https://github.com/easybuilders/easybuild-easyblocks/pull/2428
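For the cstd item in the Easyconfigs notes (fcc only accepts gnuXX, FCC only accepts gnu++XX), the CFLAGS handling could be sketched as below. This is an illustration under assumptions, not the code from the linked framework PR: `strip_cxx_std_for_c` is a hypothetical name, and dropping the `++` for the C compiler is only one way to reconcile a single cstd value with both drivers.

```python
import re

# Matches C++ dialect selections such as -std=gnu++11 or -std=c++14.
STD_CXX = re.compile(r"-std=(gnu|c)\+\+(\w+)")


def strip_cxx_std_for_c(cflags):
    """Turn a C++ -std value into its C counterpart (gnu++11 -> gnu11)."""
    return STD_CXX.sub(r"-std=\1\2", cflags)


# Both variables start from the same toolchain option value.
flags = {"CFLAGS": "-O2 -std=gnu++11", "CXXFLAGS": "-O2 -std=gnu++11"}
flags["CFLAGS"] = strip_cxx_std_for_c(flags["CFLAGS"])
print(flags)
```

CXXFLAGS is left untouched because FCC expects the `++` form; flags without a `-std=` entry pass through unchanged.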