Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

[hicpp-signed-bitwise] seems to ignore constant suffices appended with ## #50459

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR51492
Status NEW
Importance P normal
Reported by Alejandro Colomar (alx.manpages@gmail.com)
Reported on 2021-08-16 05:36:24 -0700
Last modified on 2021-11-11 13:13:44 -0800
Version unspecified
Hardware PC Linux
CC alexfh@google.com, development@jonas-toth.eu, djasper@google.com, eugene.zelenko@gmail.com, fabian.wolff@alumni.ethz.ch, klimek@google.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
[[
foo.c:928:28: error: use of a signed integer operand with a binary bitwise
operator [hicpp-signed-bitwise,-warnings-as-errors]
        lutbit_rdy[tab_m / 64] |= UINT64_C(1) << (tab_m % 64);
                                  ^
/usr/include/stdint.h:262:23: note: expanded from macro 'UINT64_C'
#  define UINT64_C(c)   c ## UL
                        ^
note: expanded from here
]]

As you can see above, the constant is unsigned, due to UINT64_C(), which
appends UL with ## to the constant 1.  It seems that clang-tidy is incorrectly
ignoring that suffix, as it considers the constant to be signed.

I'm using the clang-tidy shipping with Debian Bullseye:

$ clang-tidy --version
LLVM (http://llvm.org/):
  LLVM version 11.0.1

  Optimized build.
  Default target: x86_64-pc-linux-gnu
  Host CPU: skylake
Quuxplusone commented 3 years ago
Is tab_m a signed integer? Because hicpp-signed-bitwise intentionally [1] warns
about signed integers on *either* side of a bitwise operator, which includes
the right-hand side of <<. This behavior can be controlled with the
IgnorePositiveIntegerLiterals option [2].

[1] See the discussion at https://reviews.llvm.org/D68694
[2] https://clang.llvm.org/extra/clang-tidy/checks/hicpp-signed-bitwise.html
Quuxplusone commented 3 years ago
Yes, tab_m is signed.

If that's the reason, IMO this warning is incorrect.

---

The language specifies that in the case of the shift operator, the type of the
right hand side does not intervene in the type of the resulting expression:

C17:
{
 6.5.7 Bitwise shift operators

[...]

 Semantics

 3. The integer promotions are performed on each of the operands.
    The type of the result is that of the promoted left operand.
    If the value of the right operand is negative or is greater than or equal to
    the width of the promoted left operand, the behavior is undefined.
}

---

So from that text, the standard specifies that any of the following are
undefined behavior:

1U << -1;

1U << UINT_MAX;

So, if I'm going to define a function, using a signed or an unsigned integer
makes absolutely no difference:

unsigned int foo(int x)
{
        return 1U << x;
}

and call it 'n = foo(-1);'

unsigned int bar(unsigned int x)
{
        return 1U << x;
}

and call it 'n = bar(-1);'

Both are Undefined Behavior.  So I see no benefit in warning only in one of
them.  Of course if you find a negative literal (or a huge literal), a warning
would be merited, but that will already be caught by a sane compiler.

Other than that, there's nothing a static analyzer can catch (without making it
really insane to program), IMO.

---

unsigned types should not be used as assertions that a value will never be
negative:

<http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p1428r0.pdf>
<https://google.github.io/styleguide/cppguide.html#Integer_Types>

They are only suitable for bit patterns (or modulo 2-arithmetics), which in the
case of << and >> means only the left-hand side of the expression.

---

HIC++ doesn't make it clear that the right-hand side of << and >> should be
unsigned:

<https://www.perforce.com/resources/qac/high-integrity-cpp-coding-standard-expressions>

HIC++:
{
 Use of signed operands with bitwise operators is in some cases subject to
 undefined or implementation defined behavior.
 Therefore, bitwise operators should only be used with operands of
 unsigned integral types.
}

That is clearly meant for bit patterns, which are all operands to any bitwise
operator except for the right-hand side operators to << and >>, which are not
bit patterns, but positions.  A whole different class of values.  The undefined
behavior related to them is for completely different reasons too.

The only example that refers to this case is also ambiguous:

[
 void foo (int32_t i)
 {
       int32_t r = i << -1;    // @@- Non-Compliant: undefined behavior -@@
]

Since i is already signed, it can be undefined behavior for more than one
reason, and it's not so clear to me which is the more relevant one to this rule.