smarco / WFA2-lib

WFA-lib: Wavefront alignment algorithm library v2
Other
162 stars 36 forks source link

left shift of negative value #17

Open armintoepfer opened 2 years ago

armintoepfer commented 2 years ago

If you compile the latest https://github.com/armintoepfer/aligner-testbed with gcc10.3.0 with asan and ubsan

$ meson -Db_sanitize=address,undefined
$ ninja
$ ./at ../data/clr1.txt --rounds 1 --log-level INFO --miniwfa=false --ksw2=false
| 20220421 10:31:22.817 | INFO | Number of sequence pairs : 1
../subprojects/wfa/wavefront/wavefront_extend.c:116:20: runtime error: load of misaligned address 0x7f567e1c877e for type 'uint64_t', which requires 8 byte alignment
0x7f567e1c877e: note: pointer points here
 3f 3f 3f 3f 41 54  43 54 43 54 43 54 43 41  41 43 41 41 43 41 43 43  43 41 43 47 54 41 47 47  41 47
             ^
../subprojects/wfa/wavefront/wavefront_extend.c:116:38: runtime error: load of misaligned address 0x7f567e1d6462 for type 'uint64_t', which requires 8 byte alignment
0x7f567e1d6462: note: pointer points here
 21 21  21 21 41 41 54 43 54 43  41 43 47 43 54 43 41 41  43 43 41 43 41 41 43 41  41 43 47 47 41 47
              ^
../subprojects/wfa/wavefront/wavefront_extend.c:124:13: runtime error: load of misaligned address 0x7f567e1c87b3 for type 'uint64_t', which requires 8 byte alignment
0x7f567e1c87b3: note: pointer points here
 54  43 54 43 54 43 41 41 41  43 43 41 43 41 41 43 41  41 43 47 47 41 47 47 41  47 47 41 47 47 41 41
              ^
../subprojects/wfa/wavefront/wavefront_extend.c:124:31: runtime error: load of misaligned address 0x7f567e1d649d for type 'uint64_t', which requires 8 byte alignment
0x7f567e1d649d: note: pointer points here
 54 43 54 43 41 54 43  41 41 43 41 41 43 47 41  41 43 41 41 43 47 47 41  47 47 41 47 47 41 47 47  41
             ^
../subprojects/wfa/wavefront/wavefront_backtrace.c:217:12: runtime error: left shift of negative value -1073741823
../subprojects/wfa/wavefront/wavefront_backtrace.c:186:12: runtime error: left shift of negative value -1073741824
../subprojects/wfa/wavefront/wavefront_backtrace.c:245:12: runtime error: left shift of negative value -1073741822
| 20220421 10:31:37.533 | INFO | WFA2 C    : 0 / 14s 715ms

This is on a x86 AMD EPYC 7702 with no march.

smarco commented 2 years ago

Hi,

(Un)fortunately, these are both expected. Let me explain:

load of misaligned address 0x7f567e1c87b3 for type 'uint64_t', which requires 8 byte alignment

These are caused by legitimate misaligned memory access. The extend/LCP part of the WFA algorithm requires comparing matching characters along the diagonal. For that, instead of comparing char by char, we load memory blocks of 64bits (i.e., 8 characters) and compare them using a single XOR operation. As a result, the code performs 8-Bytes memory accesses misaligned.

We could avoid these misaligned accesses by replicating the input sequences in memory (8 times with different alignments). However, I honestly deem it unnecessary for two reasons. First, modern cores/architectures execute very efficiently these misaligned loads (even for wide SIMD loads). Second, the compiler understands the target platform/core it is compiling for and can avoid error-prone situations with this information. Nevertheless, we cannot get rid of the ASAN messages (as it is correct in the report).

runtime error: left shift of negative value -1073741822

This one is also expected. To avoid performing multiple comparisons during the backtrace, our implementation piggybacks the wavefront component (i.e., M, I1, I2, D1, D2) to the wavefront offset (signed integer) in the LSB 2 bits. Then, it performs a maximum operation and retrieves the next backtrace component to go in the LSB 2 bits. To perform that procedure, we need to left shift.

Now, I understand that this hits the "Undefined Behavior" of the standard. Thanks for pointing this out, I'll try to find a standard-compliant way of doing this operation/trick.

Thanks for the report. It's quite valuable.

Cheers,