lanl / vpic

Vector Particle-In-Cell (VPIC) Project

Integer overflow in dumpmacros.h #126

Closed. agseaton closed this issue 3 years ago.

agseaton commented 3 years ago

In the WRITE_HEADER_V0 macro below, the values 0xcafe and 0xdeadbeef are assigned to a short int and int respectively: https://github.com/lanl/vpic/blob/4e4f7c4c1ce4aa21323c8ad909c635b4604aa855/src/vpic/dumpmacros.h#L7-L16

But the guaranteed minimum upper limits of short int and int are only 0x7fff and 0x7fffffff, so both constants can exceed the range of their target types. From what I've seen online, signed integer overflow is undefined behaviour, so this could technically give different values with different compilers. Presumably all compilers handle it the same way, which is why it's never been a problem.
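For illustration, the pattern in isolation boils down to something like this (a standalone sketch, not the actual macro code):

#include <cstdio>

int main() {
    short int magic16 = (short int)0xcafe; // 51966 exceeds the guaranteed short int maximum of 0x7fff
    int magic32 = 0xdeadbeef;              // 0xdeadbeef exceeds the guaranteed int maximum of 0x7fffffff
    // In practice the bit patterns are preserved, so this prints -13570 -559038737,
    // i.e. 0xcafe and 0xdeadbeef reinterpreted as signed values.
    printf("%d %d\n", magic16, magic32);
    return 0;
}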

Two possible solutions that don't break people's postprocessing tools:

rfbird commented 3 years ago

I think the way this works is that it's setting it to a negative number, by setting the signed bit (the perceived overflow). I'll look into it

Did you find that using some tool? Or realize it yourself?

agseaton commented 3 years ago

So gcc (9.3) warned me about it when I was checking my postprocessing tool by compiling with warning options. For some reason I don't get that warning when I compile VPIC itself, though the assignment is identical (and I made sure I was using -std=c++11 in both cases). The warning I get is:

warning: overflow in conversion from ‘int’ to ‘short int’ changes value from ‘51966’ to ‘-13570’ [-Woverflow]

Also I think you're right about how it treats the assignment, though again I don't think the standard dictates what should happen. In Python, struct.unpack('h', b'\xCA\xFE'[::-1]) gives -13570, and struct.unpack follows the C conventions; that's just the 16-bit pattern 0xCAFE read as two's complement (51966 - 65536 = -13570).

agseaton commented 3 years ago

...which I think means that we can safely replace short int with unsigned short int?

Actually yes, we definitely can: https://godbolt.org/z/rv16fc
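Concretely, the change would just switch the types at the point of assignment, along these lines (illustrative names, not the actual code in dumpmacros.h):

unsigned short int magic16 = 0xcafe;      // was: short int magic16 = (short int)0xcafe;
unsigned int       magic32 = 0xdeadbeef;  // was: int magic32 = 0xdeadbeef;

The bytes written to the dump header stay identical either way, so existing postprocessing tools shouldn't notice any difference.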

rfbird commented 3 years ago

One reason GCC may warn you on the post-processing side, and not on the VPIC side, is that VPIC does some pretty delicate casting around this stuff.

I chatted with Patrick and he also made a really interesting point: even if it is undefined behavior (UB), keeping it as-is would at least make sure the two tools rely on the same UB.

If we're confident that going to unsigned is safe and works in practice, I'm OK with it. Have you tried it in an actual run?

agseaton commented 3 years ago

I guess the thing with undefined behaviour is that... it's undefined, so there's no guarantee it will stay that way in the future. In this case, though, there seems to be a pretty solid consensus among compilers on what should happen. I looked at the assembly output from a bunch of different compilers (gcc, icc, clang, msvc), on x86 and ARM, and across different versions of those, using a test function:

bool test() {
    short int a = (short int)0xcafe; // Cast here because that's what VPIC does
    unsigned short int b = 0xcafe;
    return *(unsigned short int*)(&a) == b;
}

I also tried an int a = 0xdeadbeef version (roughly the sketch below). They all treat it the same way and the return value is always true; I couldn't find any that treat it differently, and there are some fairly obscure compilers on Compiler Explorer.
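That variant amounts to something like this (a reconstruction, not the exact snippet I put on Compiler Explorer):

bool test_int() {
    int a = (int)0xdeadbeef;           // same out-of-range conversion as the 0xcafe case
    unsigned int b = 0xdeadbeef;
    return *(unsigned int*)(&a) == b;  // true if the bit pattern is preserved
}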

So I think it's safe to say that it makes no difference in practice, which means we can either replace it or leave it as is and things should be fine either way. Personally, my (mild) preference would be to replace it, since unsigned short (and unsigned int) makes the intention clearer and means the warning won't show up either in VPIC under some future compiler or in people's postprocessing tools.

If that sounds reasonable I'll go ahead and make the change at some point.

rfbird commented 3 years ago

Well, if it's safe to change, I guess let's do it.

I normally err on the side of "let's not change things that work", but I agree that in this case it's a bit scary.