Summary
When we run the IVF_FASTSCAN benchmark on AIX, we get much lower recall values than on other distros.
For example, for IVF_FASTSCAN we get:

On AIX:
======IVF1024,PQ32x4fs
nprobe 1, Recall@1: 0.041200, speed: 0.048774 ms/query
nprobe 2, Recall@1: 0.039400, speed: 0.090845 ms/query
nprobe 4, Recall@1: 0.037600, speed: 0.175046 ms/query
nprobe 6, Recall@1: 0.036500, speed: 0.255538 ms/query
nprobe 8, Recall@1: 0.035000, speed: 0.336469 ms/query

On Linux:
======IVF1024,PQ32x4fs
nprobe 1, Recall@1: 0.922900, speed: 0.064736 ms/query
nprobe 2, Recall@1: 0.943900, speed: 0.122122 ms/query
nprobe 4, Recall@1: 0.955800, speed: 0.237951 ms/query
nprobe 6, Recall@1: 0.958000, speed: 0.350556 ms/query
nprobe 8, Recall@1: 0.960000, speed: 0.461979 ms/query
AIX is a big-endian system. When codes are packed into uint8 arrays and those arrays are then cast to uint16, the two bytes of each 16-bit word are read in the opposite order from a little-endian system, so the permuted codes no longer arrive in the order the search functions expect.
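The byte-order effect described here can be illustrated with a small, self-contained model. This is only a sketch: the helper names (read_u16_little, read_u16_big, read_u16_native, host_is_little_endian) are made up for this example and do not exist in Faiss.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value a little-endian CPU sees when two consecutive bytes are read as uint16.
inline uint16_t read_u16_little(const uint8_t* p) {
    assert(p != nullptr);
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

// Value a big-endian CPU (such as AIX on POWER) sees for the same two bytes.
inline uint16_t read_u16_big(const uint8_t* p) {
    assert(p != nullptr);
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

// What the host actually does; memcpy sidesteps aliasing and alignment issues.
inline uint16_t read_u16_native(const uint8_t* p) {
    assert(p != nullptr);
    uint16_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Runtime probe of the host byte order.
inline bool host_is_little_endian() {
    const uint8_t probe[2] = {0x01, 0x00};
    return read_u16_native(probe) == 0x0001;
}
```

For the bytes {0x12, 0x34}, the little-endian read gives 0x3412 and the big-endian read gives 0x1234, which is the kind of swap that would scramble packed 4-bit codes once they are reinterpreted as uint16.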
Our proposed fix is to change the array const uint8_t perm0[16] = {0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15} in impl/pq4_fast_scan.cpp to a big-endian ordering when the target system is big-endian. The functions in this file pack codes into uint8 arrays. The search routines in impl/pq4_fast_scan_search_1.cpp and impl/pq4_fast_scan_search_qbs.cpp then cast these packed code arrays to uint16. Because of the endianness difference, the resulting code ordering differs between Linux and AIX. Changing the permutation array rearranges the packed codes on AIX into the order the search functions expect.
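As a rough sketch of what a big-endian ordering of perm0 could look like, the model below swaps the two entries of each consecutive pair and checks that a big-endian uint16 read of the repacked bytes matches a little-endian read of the original packing. It is a simplified model, not code from Faiss: the helpers swap_adjacent_pairs, pack_with_perm, and read_u16 are hypothetical, the packing is reduced to a plain byte gather, and whether pair-swapping is the right layout for the real search kernels is an assumption that would need to be confirmed against the benchmark.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

using Bytes16 = std::array<uint8_t, 16>;

// Permutation quoted in the issue (the layout used on little-endian systems).
const Bytes16 perm0_little = {
        {0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15}};

// Hypothetical helper: swap the two entries of every consecutive pair.
Bytes16 swap_adjacent_pairs(Bytes16 perm) {
    for (std::size_t i = 0; i + 1 < perm.size(); i += 2) {
        std::swap(perm[i], perm[i + 1]);
    }
    return perm;
}

// Simplified stand-in for packing: output byte i is taken from src[perm[i]].
Bytes16 pack_with_perm(const Bytes16& src, const Bytes16& perm) {
    Bytes16 out{};
    for (std::size_t i = 0; i < out.size(); i++) {
        out[i] = src[perm[i]];
    }
    return out;
}

// Read 16-bit word number `word` from buf using the requested byte order.
uint16_t read_u16(const Bytes16& buf, std::size_t word, bool big_endian) {
    assert(word < buf.size() / 2);
    const unsigned first = buf[2 * word];
    const unsigned second = buf[2 * word + 1];
    return static_cast<uint16_t>(
            big_endian ? (first << 8) | second : first | (second << 8));
}
```

In this model, swap_adjacent_pairs(perm0_little) yields {8, 0, 9, 1, 10, 2, 11, 3, 12, 4, 13, 5, 14, 6, 15, 7}. In real code the choice between the two arrays would presumably be made at compile time behind a big-endian macro.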
We would like your guidance on this issue and your feedback on the proposed fix. Please let us know what you think.
Platform
OS: AIX
Faiss version: Master branch
Installed from: Compiled myself
Faiss compilation options:
export CC="/opt/IBM/openxlC/17.1.2/bin/ibm-clang "
export CXX="/opt/IBM/openxlC/17.1.2/bin/ibm-clang++_r "
export CXXFLAGS="-m64 -fopenmp -I/usr/local/include -I/opt/freeware/include"
export CFLAGS="-m64 -fopenmp -I/usr/local/include -I/opt/freeware/include"
export LDFLAGS="-bbigtoc"
Reproduction instructions
On AIX, run the benchmark from the benchs directory. In our case, the command is python3 bench_ivf_fastscan.py