YasserAsmi / jvar

JS inspired Variants and JSON parsing for C++
MIT License
24 stars 11 forks source link

Undefined behavior: Unaligned access to VarData #30

Open PerGraa opened 7 years ago

PerGraa commented 7 years ago

Using the UB sanitizer gives us a number of errors:

graa@luvm:~/json/jvar$ uname -a
Linux luvm 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
graa@luvm:~/json/jvar$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
graa@luvm:~/json/jvar$ cmake -DCMAKE_CXX_FLAGS="-fsanitize=undefined" .
-- CXX11 = OFF
-- CMAKE_CXX_FLAGS = -fsanitize=undefined -O3 -Wall -Wextra -Wno-unused-parameter -Wno-variadic-macros -Wno-ignored-qualifiers -D_GLIBCXX_USE_CXX11_ABI=0
-- AUTOADDPROP = OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /home/graa/json/jvar
graa@luvm:~/json/jvar$ make
<snip>
graa@luvm:~/json/jvar$ ./bin/ex_arrays 
/home/graa/json/jvar/src/var.cpp:1192:52: runtime error: constructor call on misaligned address 0x000000f2bc9c for type 'struct string', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 32 00 00 00  00 00 00 00 02 00 01 00  ce 0b 00 00 00 00 00 00  02 00 01 00 2c 00 00 00
              ^ 
0 10
1 234
/home/graa/json/jvar/src/var.cpp:1333:33: runtime error: reference binding to misaligned address 0x000000f2bc9c for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 02 00 01 00  ce 0b 00 00 00 00 00 00  02 00 01 00 2c 00 00 00
              ^ 
2 Hello world
3 3022
4 44
Pop 44
Pop 3022
/home/graa/json/jvar/src/var.cpp:1120:77: runtime error: reference binding to misaligned address 0x000000f2bc9c for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 00 00 01 00  ce 0b 00 00 00 00 00 00  00 00 01 00 2c 00 00 00
              ^ 
/home/graa/json/jvar/src/var.cpp:1120:76: runtime error: constructor call on misaligned address 0x7ffc94188944 for type 'struct string', which requires 8 byte alignment
0x7ffc94188944: note: pointer points here
  05 00 01 00 00 00 00 00  8c 99 0c 02 5d 7f 00 00  80 b7 40 02 5d 7f 00 00  00 a6 7c fa 2b e4 cd 5b
              ^ 
/home/graa/json/jvar/src/var.cpp:1049:35: runtime error: member call on misaligned address 0x000000f2bc9c for type 'struct string', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 00 00 01 00  ce 0b 00 00 00 00 00 00  00 00 01 00 2c 00 00 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2943:15: runtime error: member call on misaligned address 0x000000f2bc9c for type 'struct basic_string', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 00 00 01 00  ce 0b 00 00 00 00 00 00  00 00 01 00 2c 00 00 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2698:51: runtime error: member call on misaligned address 0x000000f2bc9c for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 00 00 01 00  ce 0b 00 00 00 00 00 00  00 00 01 00 2c 00 00 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2690:29: runtime error: member access within misaligned address 0x000000f2bc9c for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 00 00 01 00  ce 0b 00 00 00 00 00 00  00 00 01 00 2c 00 00 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2943:54: runtime error: member call on misaligned address 0x000000f2bc9c for type 'struct _Alloc_hider', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 00 00 01 00  ce 0b 00 00 00 00 00 00  00 00 01 00 2c 00 00 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2669:14: runtime error: reference binding to misaligned address 0x000000f2bc9c for type 'struct <unknown>', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 00 00 01 00  ce 0b 00 00 00 00 00 00  00 00 01 00 2c 00 00 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2943:54: runtime error: reference binding to misaligned address 0x000000f2bc9c for type 'struct <unknown>', which requires 8 byte alignment
0x000000f2bc9c: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 00 00 01 00  ce 0b 00 00 00 00 00 00  00 00 01 00 2c 00 00 00
              ^ 
/home/graa/json/jvar/src/var.cpp:1049:35: runtime error: member call on misaligned address 0x7ffc94188944 for type 'struct string', which requires 8 byte alignment
0x7ffc94188944: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 5d 7f 00 00  80 b7 40 02 5d 7f 00 00  00 a6 7c fa 2b e4 cd 5b
              ^ 
/usr/include/c++/5/bits/basic_string.h:2943:15: runtime error: member call on misaligned address 0x7ffc94188944 for type 'struct basic_string', which requires 8 byte alignment
0x7ffc94188944: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 5d 7f 00 00  80 b7 40 02 5d 7f 00 00  00 a6 7c fa 2b e4 cd 5b
              ^ 
/usr/include/c++/5/bits/basic_string.h:2698:51: runtime error: member call on misaligned address 0x7ffc94188944 for type 'const struct basic_string', which requires 8 byte alignment
0x7ffc94188944: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 5d 7f 00 00  80 b7 40 02 5d 7f 00 00  00 a6 7c fa 2b e4 cd 5b
              ^ 
/usr/include/c++/5/bits/basic_string.h:2690:29: runtime error: member access within misaligned address 0x7ffc94188944 for type 'const struct basic_string', which requires 8 byte alignment
0x7ffc94188944: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 5d 7f 00 00  80 b7 40 02 5d 7f 00 00  00 a6 7c fa 2b e4 cd 5b
              ^ 
/usr/include/c++/5/bits/basic_string.h:2943:54: runtime error: member call on misaligned address 0x7ffc94188944 for type 'struct _Alloc_hider', which requires 8 byte alignment
0x7ffc94188944: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 5d 7f 00 00  80 b7 40 02 5d 7f 00 00  00 a6 7c fa 2b e4 cd 5b
              ^ 
/usr/include/c++/5/bits/basic_string.h:2669:14: runtime error: reference binding to misaligned address 0x7ffc94188944 for type 'struct <unknown>', which requires 8 byte alignment
0x7ffc94188944: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 5d 7f 00 00  80 b7 40 02 5d 7f 00 00  00 a6 7c fa 2b e4 cd 5b
              ^ 
/usr/include/c++/5/bits/basic_string.h:2943:54: runtime error: reference binding to misaligned address 0x7ffc94188944 for type 'struct <unknown>', which requires 8 byte alignment
0x7ffc94188944: note: pointer points here
  05 00 01 00 08 bd f2 00  00 00 00 00 5d 7f 00 00  80 b7 40 02 5d 7f 00 00  00 a6 7c fa 2b e4 cd 5b
              ^ 
Pop Hello world
Pop 234
Pop 10
/home/graa/json/jvar/src/var.cpp:1215:52: runtime error: constructor call on misaligned address 0x000000f2c25c for type 'struct string', which requires 8 byte alignment
0x000000f2c25c: note: pointer points here
  05 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^ 
0 123
1 23
2 can be string
3 233.2
4 false
5 -20
6 -120
/home/graa/json/jvar/src/var.cpp:175:21: runtime error: reference binding to misaligned address 0x000000f2c2cc for type 'const struct string', which requires 8 byte alignment
0x000000f2c2cc: note: pointer points here
  05 00 01 00 d8 bc f2 00  00 00 00 00 06 00 01 00  b0 c1 f2 00 00 00 00 00  05 00 01 00 88 bd f2 00
              ^ 
/home/graa/json/jvar/include/str.h:509:35: runtime error: member call on misaligned address 0x000000f2c2cc for type 'const struct string', which requires 8 byte alignment
0x000000f2c2cc: note: pointer points here
  05 00 01 00 d8 bc f2 00  00 00 00 00 06 00 01 00  b0 c1 f2 00 00 00 00 00  05 00 01 00 88 bd f2 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:3127:22: runtime error: member call on misaligned address 0x000000f2c2cc for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2c2cc: note: pointer points here
  05 00 01 00 d8 bc f2 00  00 00 00 00 06 00 01 00  b0 c1 f2 00 00 00 00 00  05 00 01 00 88 bd f2 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2698:51: runtime error: member call on misaligned address 0x000000f2c2cc for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2c2cc: note: pointer points here
  05 00 01 00 d8 bc f2 00  00 00 00 00 06 00 01 00  b0 c1 f2 00 00 00 00 00  05 00 01 00 88 bd f2 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2690:29: runtime error: member access within misaligned address 0x000000f2c2cc for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2c2cc: note: pointer points here
  05 00 01 00 d8 bc f2 00  00 00 00 00 06 00 01 00  b0 c1 f2 00 00 00 00 00  05 00 01 00 88 bd f2 00
              ^ 
/home/graa/json/jvar/include/str.h:509:23: runtime error: member call on misaligned address 0x000000f2c2cc for type 'const struct string', which requires 8 byte alignment
0x000000f2c2cc: note: pointer points here
  05 00 01 00 d8 bc f2 00  00 00 00 00 06 00 01 00  b0 c1 f2 00 00 00 00 00  05 00 01 00 88 bd f2 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:4218:23: runtime error: member call on misaligned address 0x000000f2c2cc for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2c2cc: note: pointer points here
  05 00 01 00 d8 bc f2 00  00 00 00 00 06 00 01 00  b0 c1 f2 00 00 00 00 00  05 00 01 00 88 bd f2 00
              ^ 
/usr/include/c++/5/bits/basic_string.h:2690:29: runtime error: member access within misaligned address 0x000000f2c2cc for type 'const struct basic_string', which requires 8 byte alignment
0x000000f2c2cc: note: pointer points here
  05 00 01 00 d8 bc f2 00  00 00 00 00 06 00 01 00  b0 c1 f2 00 00 00 00 00  05 00 01 00 88 bd f2 00
              ^ 
[0,"one","2.0",[0,1,[999,9999,99999],3],"four"]

The offending code seems to be in var.h:

    #pragma pack(push, 4)
    struct VarData
    {
        ushortint type;
        shortint flags;
        union
        {
            longint intData;
            double dblData;
            bool boolData;
            char strMemData[sizeof(std::string)];
            ObjArray<Variant>* arrayData;
            PropArray<Variant>* objectData;
            VarFuncObj* funcData;
            Variant* vptrData;
        };

        std::string* strData() const
        {
            assert(type == V_STRING);
            return (std::string*)&strMemData;
        }
    };
    #pragma pack(pop)

If I change the pragma push argument to 8 (bytes) on my 64-bit architecture, then all is well and the sanitizer errors disappear. But this may cause trouble on non-64-bit. Also the sizeof(std::string) is implementation/compiler dependent, so it might be better just to remove the pragma push and pop altogether?

YasserAsmi commented 7 years ago

I always run on 64-bit and don't have an issue... but I have always wondered if the pragma pack is a good idea. The only reason I had put it there was so that the Variant will be more space optimized--but it may not be worth it. Thoughts?

PerGraa commented 7 years ago

Well, there are many thoughts to be had on the subject of undefined behaviour and (un)aligned access :)

In my view it comes down to portability. Should users be able to use jvar at any given architecture/compiler setup without silent undefined bombs waiting to explode? Or do you support x86-family CPUs only?

Unaligned access will typically work fine on the x86-family CPUs, but there might be a penalty in CPU cycles. So there you have a space/time trade-off to consider: Save space using #pragma pack versus possible slowdown in unaligned access?

As a counter example, unaligned access will typically make ARM MCUs crash.

In any case the above also changes over time, since Intel and ARM engineers are working to limit the implications of unaligned access. There is a lot of material available, I like this as a starting point: http://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/