llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.08k stars 11.99k forks source link

[LV] Loop vectorizer wrongly propagates 'nuw' flag #51453

Closed llvmbot closed 2 years ago

llvmbot commented 3 years ago
Bugzilla Link 52111
Resolution FIXED
Resolved on Nov 23, 2021 06:57
Version trunk
OS Linux
Attachments Reproducer
Reporter LLVM Bugzilla Contributor
CC @ayalz,@d0k,@fhahn,@aniragil,@rofirrim

Extended Description

Hello,

I think we found a correctness issue in the loop vectorizer related to the incorrect propagation of the ‘nuw’ flag for instructions guarded by a condition. The following pseudo-code illustrates the problem:

for (i = 0; i<4; ++i) { if(i > 0) { … = a[i-1] } }

In this example, the computation of i-1 could be flagged with ‘nuw’ because the guarding condition i > 0 guarantees that the unsigned result will always be within the range [0 - 2] and, therefore, will never wrap around. However, when the loop is vectorized and predication flattens the control flow, the previous property is no longer true:

for (i = 0; i<4; i+=4) { mask = {1, 1, 1, 0}; … = masked_load(&a[i-1], mask); }

In this case, the computation of i-1 is no longer guarded by the original condition and the unsigned result will wrap-around for i=0. The vectorizer currently preserves the original ‘nuw’ value for i-1, which leads to a poison value that is later optimized away, resulting in an incorrect vector code.

Please, find attached a test case. The problematic code before vectorization is:

concat_index_from_operand_id1: ; preds = %concatenate.pivot.1.4 %20 = add nsw i64 %fusion.indvar.dim.32, -1 %21 = mul nuw nsw i64 %fusion.indvar.dim.24, 3 %22 = add nuw nsw i64 %20, %21 %23 = getelementptr inbounds [1 x [6 x float]], [1 x [6 x float]]* %11, i64 0, i64 0, i64 %22

After vectorization (opt before_lv.ll -scoped-noalias-aa -loop-vectorize -ipsccp -mcpu=skx -S -o after_lv.ll):

vector.body: ; preds = %vector.ph %14 = mul nuw nsw i64 %fusion.indvar.dim.24, 3 %15 = add nuw nsw i64 -1, %14 // nuw & -1 -> Poison! %16 = getelementptr inbounds [1 x [6 x float]], [1 x [6 x float]]* %11, i64 0, i64 0, i64 %15

I haven’t thought too much about the solution. Maybe we could drop the ‘nuw’/’nsw’ flags of uniform operations that are not immediately nested in the loop body. Perhaps, during predication?

Thanks, Diego

llvmbot commented 2 years ago

Fixed by https://reviews.llvm.org/D111846