DNNL_aarch64 generates two types of JIT functions for FP32 operations on ARMv8-A processors with SVE, using Xbyak, Xbyak_aarch64, and Xbyak_Translator.
Operations not covered by the JIT functions, as well as unsupported parameter sets, fall back to reference implementations written in C++. These produce correct results but run more slowly.
Bfloat16 support: DNNL_aarch64 does not currently support bfloat16.
CPU | Fujitsu FX1000 / 700 |
---|---|
OS | Red Hat Enterprise Linux 8.1 / CentOS 8.1 |
Compiler | Fujitsu compiler / GCC 8.3.1 20190507 |
DNNL_aarch64 is currently intended to run only on ARMv8-A CPUs with SVE. If you run it on a CPU without SVE, the process aborts with an undefined instruction exception.
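Because a missing SVE unit only shows up as an abort at run time, it can help to check for it before building or running. The following is a minimal sketch, assuming a Linux system where the kernel lists CPU capabilities on the `Features` line of `/proc/cpuinfo`. The function name `has_sve` is a hypothetical helper, not part of DNNL_aarch64.

```shell
#!/bin/sh
# has_sve [cpuinfo-file]
# Hypothetical helper: succeeds if the "Features" line lists the "sve" flag.
# The optional argument exists only so the check can be tried against a
# saved copy of cpuinfo; it defaults to /proc/cpuinfo.
has_sve() {
    # -w matches "sve" as a whole word, so flags such as "sve2" alone
    # do not count as a match.
    grep -E '^Features' "${1:-/proc/cpuinfo}" | grep -qw 'sve'
}
```

A typical use would be `has_sve || { echo "SVE not available on this CPU" >&2; exit 1; }` at the top of a build or run script.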
```
git clone https://github.com/fujitsu/dnnl_aarch64.git
cd dnnl_aarch64/
git submodule update --init --recursive
mkdir third_party/build_xed_aarch64
pushd third_party/build_xed_aarch64/
../xbyak_translator_aarch64/translator/third_party/xed/mfile.py --shared examples install
cd kits/
ln -sf xed-install-base-* xed
popd
mkdir build_aarch64
cd build_aarch64/
cmake ..
make -j40
```
Using BLAS (Optional)
Make sure the directory containing the BLAS library is listed in LD_LIBRARY_PATH.
Then add one of the following options to the cmake command:
BLAS | Option |
---|---|
SSL2 | -DWITH_BLAS=ssl2 (Fujitsu compiler only) |
OpenBLAS | -DWITH_BLAS=openblas |
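As an illustration of how the option from the table could be passed, here is a small sketch. The helper `blas_flag` is hypothetical and only maps the two backend names listed above to their cmake option; it rejects anything else rather than guessing.

```shell
#!/bin/sh
# blas_flag <backend>
# Hypothetical helper: prints the cmake option for a BLAS backend from the
# table above. An empty argument means "no BLAS" and prints nothing.
blas_flag() {
    case "$1" in
        ssl2|openblas) printf '%s\n' "-DWITH_BLAS=$1" ;;
        "") ;;
        *) echo "unsupported BLAS backend: $1" >&2; return 1 ;;
    esac
}
```

From the `build_aarch64/` directory, this could be used as `cmake .. $(blas_flag openblas)` followed by the usual `make`.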
```
cd tests/gtests
MKLDNN_VERBOSE=1 MKLDNN_JIT_DUMP=1 ./test_reorder
```
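To run more than one test in a single pass, a loop along the following lines may be convenient. This is a sketch under two assumptions: that the other test binaries in `tests/gtests` follow the same `test_*` naming as `test_reorder`, and that each returns a non-zero exit status on failure. The function name `run_gtests` is hypothetical.

```shell
#!/bin/sh
# run_gtests [dir]
# Hypothetical helper: runs every executable named test_* in dir with
# MKLDNN_VERBOSE=1, writes each test's output to <test>.log, and reports
# how many tests failed. Returns non-zero if any test failed.
run_gtests() {
    dir=${1:-.}
    failed=0
    for t in "$dir"/test_*; do
        # Skip log files and anything else that is not an executable file.
        [ -f "$t" ] && [ -x "$t" ] || continue
        if ! MKLDNN_VERBOSE=1 "$t" >"$t.log" 2>&1; then
            echo "FAILED: $t"
            failed=$((failed + 1))
        fi
    done
    echo "failed=$failed"
    [ "$failed" -eq 0 ]
}
```

Calling `run_gtests .` from `tests/gtests` leaves one log file per test next to the binary, which keeps the verbose output available for inspection.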
Copyright FUJITSU LIMITED 2019-2020
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Date | Version | Remarks |
---|---|---|
December 11, 2019 | 0.9.0_base_0.19 | First public release version. |
May 31, 2020 | 1.0.0_base_0.21.2 | Update |