Hello, I noticed that the optimization of some expressions is inhibited by the shift optimization.
Let's look at a simple example: Example 1: https://godbolt.org/z/Wfno6jKEv
Clang16 -O3:
In this simple example, if `var_4 + var_5` were optimized as an available expression, the whole computation would need only two additions instead of the two additions and one shift operation shown in the result.

Here is a similar but different example: in this case, `var_4 + var_4` is not supposed to be evaluated first, but it is affected by ReassociatePass. Example 2: https://godbolt.org/z/8YTevzqbT
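To make the Example 1 pattern concrete, here is a minimal sketch of the two shapes being compared. It is a hypothetical stand-in, not the code behind the Godbolt link: the function names and the exact expression (`var_4 + var_5 + var_4`) are assumptions, and `unsigned` is used only so that the arithmetic wraps without undefined behavior.

```c
#include <assert.h>

/* Hypothetical stand-in for Example 1; not the code behind the Godbolt link. */

/* The candidate available expression: one addition. */
unsigned x_sum(unsigned var_4, unsigned var_5) {
    return var_4 + var_5;
}

/* Shape described in the post: the two uses of var_4 are folded into a
 * shift, so this no longer reuses x_sum. Counted together with x_sum,
 * that is two additions and one shift. */
unsigned y_with_shift(unsigned var_4, unsigned var_5) {
    return (var_4 << 1) + var_5;
}

/* Shape the post asks for: reuse var_4 + var_5 and add var_4 once more.
 * Counted together with x_sum, that is two additions and no shift. */
unsigned y_with_reuse(unsigned var_4, unsigned var_5) {
    unsigned x = x_sum(var_4, var_5);
    return x + var_4;
}

/* Both shapes compute the same value for any inputs (modulo 2^N). */
void check_forms_agree(unsigned var_4, unsigned var_5) {
    assert(y_with_shift(var_4, var_5) == y_with_reuse(var_4, var_5));
}
```

Both functions return the same value, so the difference is only in how many operations remain once `var_4 + var_5` is already available.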
Let's look at an example where the negative impact is most obvious: Example 3: https://godbolt.org/z/crxTxPd6r
Clang16 -O3:
In this example, both `var_4 + var_10` and `var_10 + var_12` are available expressions. But because they share the common operand `var_10`, they are not both optimized as available expressions at the same time. From the optimized IR, it seems that the compiler wants to treat `var_4 + var_10` as an available expression and forgo the optimization of `var_10 + var_12`. However, `var_4 + var_10` is not optimized either, because of the shift operation. As a result, neither `var_4 + var_10` nor `var_10 + var_12` is optimized, which does not look like a good result.
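The situation in Example 3 can be sketched the same way. Again this is a hypothetical reconstruction under stated assumptions, not the code from the Godbolt link: it assumes a third value that sums both available expressions, so that `var_10` appears twice and can be turned into a shift.

```c
#include <assert.h>

/* Hypothetical stand-in for Example 3; not the code behind the Godbolt link. */

/* Shape described in the post: the two uses of var_10 become a shift,
 * so neither var_4 + var_10 nor var_10 + var_12 is reused.
 * This function alone costs two additions and one shift. */
unsigned c_with_shift(unsigned var_4, unsigned var_10, unsigned var_12) {
    return var_4 + var_12 + (var_10 << 1);
}

/* Shape the post asks for: if a and b are already computed elsewhere,
 * c needs only one extra addition. */
unsigned c_with_reuse(unsigned var_4, unsigned var_10, unsigned var_12) {
    unsigned a = var_4 + var_10;   /* first available expression  */
    unsigned b = var_10 + var_12;  /* second available expression */
    return a + b;
}

/* Both shapes compute the same value for any inputs (modulo 2^N). */
void check_sum_forms_agree(unsigned var_4, unsigned var_10, unsigned var_12) {
    assert(c_with_shift(var_4, var_10, var_12) ==
           c_with_reuse(var_4, var_10, var_12));
}
```

In this sketch, reusing both sums saves work only when `a` and `b` are needed anyway, which is the case the post describes.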