gnudatalanguage / gdl

GDL - GNU Data Language
GNU General Public License v2.0

GDL crashes on PowerPC 64-bit little-endian with Coyote #559

Open olebole opened 5 years ago

olebole commented 5 years ago

On Ubuntu, integration tests run on several platforms, and they uncovered an issue with GDL and my Coyote unit tests:

% Compiled module: TEST_COYOTE.
% Compiled module: CGDEMODATA.
% Compiled module: CGSCALEVECTOR.
% Compiled module: FPUFIX.
% Compiled module: CGDISPLAY.
% Compiled module: CGQUERY.
% Compiled module: CGERASE.
% Compiled module: CGSETCOLORSTATE.
% Compiled module: CGGETCOLORSTATE.
% Compiled module: CGCOLOR.
% Compiled module: CGCOLOR24.
% Compiled module: CGPLOT.
% Compiled module: SETDEFAULTVALUE.
% Compiled module: CGCHECKFORSYMBOLS.
% Compiled module: CGDEFAULTCOLOR.
% Compiled module: COLORSAREIDENTICAL.
% Compiled module: CGDEFCHARSIZE.
% Compiled module: CGSYMCAT.
gdl: /build/gnudatalanguage-ZgUSeT/gnudatalanguage-0.9.9/src/gdlarray.hpp:209: T& GDLArray<T, IsPOD>::operator[](SizeT) [with T = float; bool IsPOD = true; SizeT = long long unsigned int]: Assertion `ix < sz' failed.
bash: line 1:  2888 Aborted                 (core dumped) gdl -e test_coyote

The test script is here. I will provide a minimal test script soon (and hopefully a stack trace), but maybe the assertion already rings a bell? This is a regression; 0.9.8.7 does not show this problem.
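
For readers unfamiliar with the message: the failed assertion is a bounds check in the array's index operator, so something asked for an element at or beyond the end of a float array. Below is a minimal sketch of that kind of check. `CheckedArray` and `in_range` are hypothetical names used for illustration only; this is not GDL's actual `GDLArray` code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an array type with a debug-mode bounds check.
template <typename T>
class CheckedArray {
public:
    explicit CheckedArray(std::size_t n) : buf_(n) {}

    std::size_t size() const { return buf_.size(); }

    // True when ix is a valid index, i.e. the condition the assertion tests.
    bool in_range(std::size_t ix) const { return ix < buf_.size(); }

    // Aborts through assert() when ix is out of range and assertions are
    // enabled (NDEBUG not defined); otherwise returns the element.
    T& operator[](std::size_t ix) {
        assert(ix < buf_.size());
        return buf_[ix];
    }

private:
    std::vector<T> buf_;
};
```

With a four-element `CheckedArray<float>`, index 3 is valid while index 4 would trigger the abort in a build with assertions enabled.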

GillesDuvert commented 5 years ago

Hi @olebole, your test_coyote rocks so well that on Linux x86_64 I get a few errors (trapped by Coyote) AND a crash at the end. The errors are worth correcting (absent keywords, absent variables), but the crash is more serious. So this is going to be more than just a PPC problem.

olebole commented 5 years ago

I didn't have problems on x86_64 with 0.9.9 (test was run when I updated), so it may be a regression.

GillesDuvert commented 5 years ago

My fault. I was using an old "coyote". I have no errors with the latest Coyote library on an Intel machine. This is a PPC/little-endian problem. Coyote generates its test data using random numbers. I suspect a problem with the dSFMT code now used by default; that is, there are some nasty #ifdefs related to endianness etc. in it. Running the test with --no-dSFMT on a PPC will not use dSFMT. If it works, this is a sure sign that dSFMT (or, rather, the way it is used in GDL) is indeed the culprit. In which case I'm willing to disable dSFMT for little-endian machines.
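
To illustrate what endianness-dependent code has to get right, here is a minimal sketch, not taken from dSFMT or GDL, that reports the byte order at run time and shows which 32-bit half of a `double` holds the sign and exponent. dSFMT produces doubles by writing IEEE 754 bit patterns directly, so code that assumed the wrong half could yield values outside the expected range. Both function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Returns true when the least significant byte is stored first in memory.
inline bool is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Returns the index (0 or 1) of the 32-bit word that holds the sign and
// exponent of a double when the double is viewed as two uint32_t values
// in memory order.
inline int high_word_index() {
    const double one = 1.0;  // IEEE 754 bit pattern 0x3FF0000000000000
    std::uint32_t words[2] = {0, 0};
    static_assert(sizeof(one) == sizeof(words), "expects a 64-bit double");
    std::memcpy(words, &one, sizeof(one));
    return words[0] == 0x3FF00000u ? 0 : 1;
}
```

On a little-endian machine such as ppc64le or x86_64 the high word is expected at index 1, and at index 0 on a big-endian one. Comparing the two functions on the target platform is a cheap sanity check before digging into the #ifdefs.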

olebole commented 3 years ago

For reference: this was now reported as Debian#976912 (and still open with RC3).

GillesDuvert commented 3 years ago

Meaning? That we need to disable dSFMT for little-endian machines? There was no confirmation to my query of Feb 16.

olebole commented 3 years ago

Oh, I didn't interpret that as a request. Sorry.

olebole commented 3 years ago

On RC3, I can't test it at the moment, since that version does not compile due to dSFMT errors:

[ 87%] Building CXX object src/CMakeFiles/gnudatalanguage.dir/read.cpp.o
cd "/<<PKGBUILDDIR>>/obj-powerpc64le-linux-gnu/src" && /usr/bin/c++  -DHAVE_CONFIG_H -DWXUSINGDLL -D_FILE_OFFSET_BITS=64 -D__WXGTK__ -Dgnudatalanguage_EXPORTS -I/usr/include/GraphicsMagick -I/usr/lib/powerpc64le-linux-gnu/wx/include/gtk3-unicode-3.0 -I/usr/include/wx-3.0 -I/usr/include/geotiff -I/usr/include/hdf5/serial -I/usr/include/hdf -I/usr/include/python3.8 -I/usr/lib/python3/dist-packages/numpy/core/include -I/usr/include/eigen3 -I"/<<PKGBUILDDIR>>" -I"/<<PKGBUILDDIR>>/obj-powerpc64le-linux-gnu"  -g -O2 -fdebug-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -DBUILD_DATE="\"Jun 20 2020\"" -std=gnu++11 -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC   -fopenmp -std=gnu++11 -o CMakeFiles/gnudatalanguage.dir/read.cpp.o -c "/<<PKGBUILDDIR>>/src/read.cpp"
In file included from /<<PKGBUILDDIR>>/src/randomgenerators.cpp:61:
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT.h:146:3: error: ‘vector’ does not name a type; did you mean ‘vec_or’?
  146 |   vector unsigned int s;
      |   ^~~~~~
      |   vec_or

[...more undefined vector stmts follow…]

In file included from /usr/include/c++/9/vector:67,
                 from /<<PKGBUILDDIR>>/src/typedefs.hpp:76,
                 from /<<PKGBUILDDIR>>/src/datatypes.hpp:23,
                 from /<<PKGBUILDDIR>>/src/randomgenerators.cpp:21:
/usr/include/c++/9/bits/stl_vector.h:386:11: note: ‘std::vector’ declared here
  386 |     class vector : protected _Vector_base<_Tp, _Alloc>
      |           ^~~~~~
In file included from /<<PKGBUILDDIR>>/src/randomgenerators.cpp:63:
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:57:5: error: ‘z’ was not declared in this scope
   57 |     z = a->s;
      |     ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:57:12: error: ‘lib::w128_t’ {aka ‘union lib::W128_T’} has no member named ‘s’
   57 |     z = a->s;
      |            ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:58:5: error: ‘w’ was not declared in this scope
   58 |     w = lung->s;
      |     ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:58:15: error: ‘lib::w128_t’ {aka ‘union lib::W128_T’} has no member named ‘s’
   58 |     w = lung->s;
      |               ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:59:5: error: ‘x’ was not declared in this scope
   59 |     x = vec_perm(w, (vector unsigned int)perm, perm);
      |     ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:59:28: error: expected ‘)’ before ‘unsigned’
   59 |     x = vec_perm(w, (vector unsigned int)perm, perm);
      |                     ~      ^~~~~~~~~
      |                            )
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:59:53: error: expected ‘)’ before ‘;’ token
   59 |     x = vec_perm(w, (vector unsigned int)perm, perm);
      |                 ~                                   ^
      |                                                     )
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:60:5: error: ‘y’ was not declared in this scope
   60 |     y = vec_perm(z, (vector unsigned int)sl1_perm, sl1_perm);
      |     ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:60:28: error: expected ‘)’ before ‘unsigned’
   60 |     y = vec_perm(z, (vector unsigned int)sl1_perm, sl1_perm);
      |                     ~      ^~~~~~~~~
      |                            )
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:60:61: error: expected ‘)’ before ‘;’ token
   60 |     y = vec_perm(z, (vector unsigned int)sl1_perm, sl1_perm);
      |                 ~                                           ^
      |                                                             )
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:61:20: error: ‘sl1’ was not declared in this scope
   61 |     y = vec_sll(y, sl1);
      |                    ^~~
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:62:20: error: ‘sl1_msk’ was not declared in this scope
   62 |     y = vec_and(y, sl1_msk);
      |                    ^~~~~~~
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:63:23: error: ‘lib::w128_t’ {aka ‘union lib::W128_T’} has no member named ‘s’
   63 |     w = vec_xor(x, b->s);
      |                       ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:64:21: error: invalid parameter combination for AltiVec intrinsic ‘__builtin_vec_xor’
   64 |     w = vec_xor(w, y);
      |                     ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:65:28: error: expected ‘)’ before ‘unsigned’
   65 |     x = vec_perm(w, (vector unsigned int)sr1_perm, sr1_perm);
      |                     ~      ^~~~~~~~~
      |                            )
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:65:61: error: expected ‘)’ before ‘;’ token
   65 |     x = vec_perm(w, (vector unsigned int)sr1_perm, sr1_perm);
      |                 ~                                           ^
      |                                                             )
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:66:20: error: ‘sr1’ was not declared in this scope
   66 |     x = vec_srl(x, sr1);
      |                    ^~~
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:67:20: error: ‘sr1_msk’ was not declared in this scope
   67 |     x = vec_and(x, sr1_msk);
      |                    ^~~~~~~
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:68:20: error: ‘msk1’ was not declared in this scope
   68 |     y = vec_and(w, msk1);
      |                    ^~~~
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:69:21: error: invalid parameter combination for AltiVec intrinsic ‘__builtin_vec_xor’
   69 |     z = vec_xor(z, y);
      |                     ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:70:8: error: ‘lib::w128_t’ {aka ‘union lib::W128_T’} has no member named ‘s’
   70 |     r->s = vec_xor(z, x);
      |        ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:70:24: error: invalid parameter combination for AltiVec intrinsic ‘__builtin_vec_xor’
   70 |     r->s = vec_xor(z, x);
      |                        ^
/<<PKGBUILDDIR>>/src/dSFMT/dSFMT-common.h:71:11: error: ‘lib::w128_t’ {aka ‘union lib::W128_T’} has no member named ‘s’
   71 |     lung->s = w;

That is actually a regression from RC2 to RC3, which I somehow failed to report earlier (sorry).