gnudatalanguage / gdl

GDL - GNU Data Language
GNU General Public License v2.0
277 stars 61 forks source link

Build fails with strict-aliasing violations #1852

Open eli-schwartz opened 4 months ago

eli-schwartz commented 4 months ago

I tried to build with the following *FLAGS to optimize the build: -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing

The -Werror=* flags are important to detect cases where the compiler can try to optimize based on assuming UB cannot happen, and miscompile code that has UB in it. strict-aliasing issues are always bad but LTO can make them even worse.

I got this error:

[149/169] /usr/bin/x86_64-pc-linux-gnu-g++ -DHAVE_CONFIG_H -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src -I/usr/include/tirpc -I/usr/include/ImageMagick-7 -I/usr/include/eigen3 -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6 -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src/plplotdriver -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src/whereami/src -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6_build  -pipe -march=native -fstack-protector-all -O2 -fdiagnostics-color=always -frecord-gcc-switches -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing  -Wformat -Werror=format-security -fopenmp -std=gnu++11 -fPIE -MD -MT src/CMakeFiles/gdl.dir/saverestore.cpp.o -MF src/CMakeFiles/gdl.dir/saverestore.cpp.o.d -o src/CMakeFiles/gdl.dir/saverestore.cpp.o -c /var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src/saverestore.cpp
FAILED: src/CMakeFiles/gdl.dir/saverestore.cpp.o 
/usr/bin/x86_64-pc-linux-gnu-g++ -DHAVE_CONFIG_H -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src -I/usr/include/tirpc -I/usr/include/ImageMagick-7 -I/usr/include/eigen3 -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6 -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src/plplotdriver -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src/whereami/src -I/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6_build  -pipe -march=native -fstack-protector-all -O2 -fdiagnostics-color=always -frecord-gcc-switches -flto=4 -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing  -Wformat -Werror=format-security -fopenmp -std=gnu++11 -fPIE -MD -MT src/CMakeFiles/gdl.dir/saverestore.cpp.o -MF src/CMakeFiles/gdl.dir/saverestore.cpp.o.d -o src/CMakeFiles/gdl.dir/saverestore.cpp.o -c /var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src/saverestore.cpp
/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src/saverestore.cpp: In function ‘u_int64_t lib::updateNewRecordHeader(XDR*, u_int64_t)’:
/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6/src/saverestore.cpp:464:14: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
  464 |     first = ((u_int32_t *) &next)[0];
      |             ~^~~~~~~~~~~~~~~~~~~~
cc1plus: some warnings being treated as errors

Originally reported downstream: https://bugs.gentoo.org/930966 Full build log: build.log

GillesDuvert commented 4 months ago

@eli-schwartz thanks for the report. Will see to it.

GillesDuvert commented 4 months ago

@eli-schwartz When using your compile options, I get a few other error messages surprisingly related to lines such as #pragma omp parallel num_threads(GDL_NTHREADS) arguing that "type « struct .omp_data_s.525 » breaks C++ uniqueness rule. This is of course in the OpenMP magic part --- Did you see that or does it mean that gentoo builds GDL without OpenMP? (which would be a bad idea if I may add).

GillesDuvert commented 4 months ago

@alaingdl @slayoo @pjb7687 using cmake -DCMAKE_CXX_FLAGS:STRING="-flto=4" -DCMAKE_CXX_FLAGS_RELEASE:STRING="-flto=4 -O3 -DNDEBUG" ~/gdl I get a 10% speedup for time_test4. Interesting!! thanks @eli-schwartz for the suggestion

eli-schwartz commented 4 months ago

Gentoo builds gdl with a USE="openmp" option, and openmp is default-enabled for all packages. So it's technically possible to build without it, but you'd have to go out of your way to do so.

If you look in my attached build log, the metadata header shows:

 * USE:        abi_x86_64 amd64 eigen elibc_glibc imagemagick kernel_linux openmp python_single_target_python3_12

and the cmake configure command is:

cmake -C /var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6_build/gentoo_common_config.cmake -G Ninja -DCMAKE_INSTALL_PREFIX=/usr -DMPI=OFF -DREADLINE=ON -DX11=ON -DEXPAT=ON -DPNGLIB=ON -DEIGEN3=yes -DFFTW=no -DGRIB=OFF -DGLPK=no -DHDF=no -DHDF5=no -DLIBPROJ=no -DNETCDF=no -DOPENMP=yes -DPNGLIB=no -DUDUNITS2=no -DWXWIDGETS=no -DGRAPHICSMAGICK=no -DMAGICK=yes -DTIFF=no -DGEOTIFF=no -DPYTHON_MODULE=no -DPYTHON=no -DSHAPELIB=no -DQHULL=OFF -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_TOOLCHAIN_FILE=/var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6_build/gentoo_toolchain.cmake /var/tmp/portage/dev-lang/gdl-1.0.6/work/gdl-v1.0.6

The One Definition Rule would get verified at link time, but strict-aliasing errored out at compile time, so it never got that far. That's probably why I didn't notice it. And maybe I should have tested without -Werror=strict-aliasing, just to see if there were additional link errors, but I guess I forgot. :D

eli-schwartz commented 4 months ago

Hold on, I just tried this and despite building with -DOPENMP=yes, having #define USE_OPENMP 1 in config.h, passing -fopenmp when compiling and linking all the files -- I don't get that error.

GillesDuvert commented 4 months ago

@eli-schwartz I made changes in #1860, should make the initial error disappear. At least when using -Werror=strict-aliasing it does not show anymore. (this and another error appearing in hdf5_fun.cpp) Thanks for keeping us informed of such problems, they are really helpful.