danmar / simplecpp

C++ preprocessor
BSD Zero Clause License
205 stars 81 forks source link

Wrong evaluation of ((0x7fffffffffffffffL * 2UL) < 0) #207

Open danmar opened 3 years ago

danmar commented 3 years ago

Reported here: https://sourceforge.net/p/cppcheck/discussion/development/thread/c101b0d84d/

expressions in #if might be evaluated wrongly.

Example

#if ((0x7fffffffffffffffL * 2UL) < 0)
# error LOL
#endif

simplecpp output:

$ ./simplecpp 1.c

1.c:2: #error: #error LOL
keinflue commented 3 years ago

I have seen a similar issue and would like to fix it. My idea is to replace all uses of long long for numeric values by a tagged union of long long and unsigned long long with operators overloaded to adhere to the usual arithmetic conversion rules by using the behavior of the host. This should be sufficient to correctly implement the arithmetics, assuming that the target platform's largest integer types share the size and representation with (unsigned) long long on the host.

Is that a direction you would approve of?

danmar commented 3 years ago

My idea is to replace all uses of long long for numeric values by a tagged union of long long and unsigned long long with operators overloaded to adhere to the usual arithmetic conversion rules by using the behavior of the host.

I think that is a good idea. We have discussed adding such union/class in Cppcheck also to replace the Mathlib::bigint. So well if it could be reused that would be good.

usual arithmetic conversion rules by using the behavior of the host.

ideally I think the typesizes should be configurable. to start with feel free to put some constants in the code that match the host i.e. const int sizeofInt = sizeof(int). and then we can look at how to configure those values later..

if you detect ub feel free to throw exceptions.

Is that a direction you would approve of?

The direction sounds good. :+1:

keinflue commented 3 years ago

@danmar

We have discussed adding such union/class in Cppcheck also to replace the Mathlib::bigint. So well if it could be reused that would be good.

ideally I think the typesizes should be configurable. to start with feel free to put some constants in the code that match the host i.e. const int sizeofInt = sizeof(int). and then we can look at how to configure those values later..

For the preprocessor there doesn't need to be any specification for the size of int, only for the size of intmax_t (i.e. the largest integral type according to the standard definition). As far as I can tell the standards say that the preprocessor operates only in this type plus it's unsigned variant. I am not sure that there is any platform where this is not 64bits wide. Most platforms also share the other implementation-defined behaviors such as two's complement and arithmetic right-shift of negative values etc., so I think for most platforms there is no configuration required for the preprocessor arithmetic.

For the use in the actual compilation phases the different types need to be distinguished to get the correct conversion behavior, but that makes it much more complicated. During compilation compilers also offer types larger than intmax_t, e.g. GCC's __int128, which you might want to support. So I think it might be better to use arbitrary-length integers in cppcheck instead, either using one of the specialized libraries out there or implementing a simple/naive one in cppcheck itself.

For now, my intention was to prepare a PR that only differentiates long long and unsigned long long and none of the other types since the latter would be much more work.

danmar commented 3 years ago

For the preprocessor there doesn't need to be any specification for the size of int, only for the size of intmax_t (i.e. the largest integral type according to the standard definition). As far as I can tell the standards say that the preprocessor operates only in this type plus it's unsigned variant.

ok