Open Quuxplusone opened 3 years ago
Bugzilla Link | PR49381 |
Status | NEW |
Importance | P enhancement |
Reported by | Balazs Benics (balazs.benics@sigmatechnology.se) |
Reported on | 2021-03-01 08:55:31 -0800 |
Last modified on | 2021-03-04 02:12:13 -0800 |
Version | 11.0 |
Hardware | PC All |
CC | dcoughlin@apple.com, endre.fulop@sigmatechnology.se, llvm-bugs@lists.llvm.org, whisperity@gmail.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
Thanks for reaching out!
IMHO in C++ this is definitely the expected warning.
I have found this as a motivating example with a valid workaround for combining
bitmasks:
https://stackoverflow.com/questions/4920783/combining-enum-value-using-bitmask
In C this could be debatable, and if there are indeed multiple real code
examples of this sloppy enum use, then at least the language mode could be
taken into account.
I am not sure about C, but in C++, the TestEnum (with the value range [0, 4])
is ***NOT*** required to be able to represent "7".
As per http://eel.is/c++draft/dcl.enum#7
> For an enumeration whose underlying type is not fixed, the underlying type is
an integral type that can represent all the enumerator values defined in the
enumeration. [...]
7 is not defined in the enumeration. So the compiler is not required to present
you with an underlying integral type that might be able to represent 7. The
only requirement is that up to 4 shall be representable.
----
As a side-note, if you consider the bitmask enum helper functions in LLVM's own
codebase (include/llvm/ADT/BitmaskEnum.h), the idiomatic usage of bitmasks in
LLVM's code is:
enum MyFlags {
A = 0,
B = 1,
C = 2,
D = 4,
LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue =*/ D)
};
LLVM_MARK_AS_BITMASK_ENUM expands to
LLVM_BITMASK_LARGEST_ENUMERATOR = LargestValue
I am not sure what the benefit of this extra declaration is from the user's
PoV. As reading this "LARGEST_ENUMERATOR" will still give you the same value, 4
in our case. So it doesn't become the next power of 2, but there are some hint
inside about calculating powers of 2.
Then just imagine that the underlying type is fixed. The issue is the same.
This is a clear and easy bug in this checker.
I guess, if you are right then this code is suspicious:
https://github.com/llvm/llvm-project/blob/53298b8c8d74d0064c673db18addfe973b544dbb/clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h#L1529-L1545
(In reply to Balazs Benics from comment #4)
> I guess, if you are right then this code is suspicious:
> https://github.com/llvm/llvm-project/blob/
> 53298b8c8d74d0064c673db18addfe973b544dbb/clang/include/clang/StaticAnalyzer/
> Core/PathSensitive/MemRegion.h#L1529-L1545
Indeed, I've fixated on a single rule, my bad. The problem is, you have to look
at the law from above, and not just the individual case decisions.
Basically, it **is** true, that if you say an enum is between 0 and 4, then a
type shall be selected that must be able to represent [0, 4]. However, the
"unfortunate" reality is that the representation and requirements against the
selected integral type (whether a standard one or an implementation-defined
one) is that it **must** be able to represent -2^(N-1) to 2^N - 1.
[http://eel.is/c++draft/basic.fundamental].
The other requirement is that **if** you would otherwise fit into an int, the
implementation is prohibited from choosing a type larger than int (let's say
uint4096_t or something like that).
So in the case of a required range [0, 4], we need at least 3 bits. The
smallest possible integral type is char, which is **at least** 8 bits wide.
However, because you've written "0", it will be **signed** value
[http://eel.is/c++draft/dcl.enum#5.1]. Signed integrals take precedence in the
type list over their unsigned counterparts.
So what the compiler is required to **at least** select as the implicit
underlying type is "signed char", giving us the valid range [-128, 127]. And
all of these values will be valid, even though the enum did not define it. (But
not [0, 255]!)
I believe Clang should *really* be able to tell us what real range it selected
for the enum's underlying type. Although I fear Clang is lazy, and even for
such a simple enum of elements [0, 4], it will just go with "int". Which is
wider than what's *truly* required.
(In reply to Balazs Benics from comment #3)
> Then just imagine that the underlying type is fixed. The issue is the same.
> This is a clear and easy bug in this checker.
Do you mean that even in the case of
enum POSIXACLBits : unsigned char
the checker still warns for the cast of 7 ?