pow tests fail with -march=native -mtune=native flags and g++

andrjohns commented 4 years ago

Description

When adding CXXFLAGS += -march=native -mtune=native, the mix tests for pow fail because every test yields zero gradients, but only when compiling with g++. The tests compile and pass under clang.

No other tests (so far) fail, so it looks like it might be some weird interplay between the binary vectorisation framework and the pow function specifically.

I've replicated the failures on two Linux machines, both with gcc 9.3.0 and clang 10.

Current Version:

v3.3.0

SteveBronder commented 4 years ago

Can confirm on Ubuntu 20.04 with g++-10. It looks like a near-zero problem?

./test/unit/math/expect_near_rel.hpp:33: Failure
The difference between x1 and x2 is 3.8049220956170699e-07, which exceeds tol_val, where
x1 evaluates to 3.8049220956170699e-07,
x2 evaluates to 0, and
tol_val evaluates to 1e-08.
expect_near_rel_finite in: expect_near_rel; require items x1(i) = x2(i): gradient() grad

./test/unit/math/expect_near_rel.hpp:33: Failure
The difference between x1 and x2 is 1.1190947340050207e-07, which exceeds tol_val, where
x1 evaluates to 1.1190947340050207e-07,
x2 evaluates to 0, and
tol_val evaluates to 1e-08.
expect_near_rel_finite in: expect_near_rel; require items x1(i) = x2(i): gradient() grad

./test/unit/math/expect_near_rel.hpp:33: Failure
The difference between x1 and x2 is 7.3171578761866729e-08, which exceeds tol_val, where

g++ info:

g++ -v
Using built-in specs.
COLLECT_GCC=g++-10
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 10-20200411-0ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none,amdgcn-amdhsa,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.1 20200411 (experimental) [master revision bb87d5cc77d:75961caccb7:f883c46b4877f637e0fa5025b4d6b5c9040ec566] (Ubuntu 10-20200411-0ubuntu1)

andrjohns commented 4 years ago

Do they pass for you under Clang or was that just a quirk of my system?

andrjohns commented 4 years ago

Figured this out (near-zero problem like Steve suspected). In the tests, the failures are caused by the very last pair of variables. This means that errors can be reproduced with just:

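  // Size 1, so the comma initializer receives exactly one coefficient.
  Eigen::VectorXd in1(1);
  in1 << 5.2;
  Eigen::VectorXd in2(1);
  in2 << 6.7;
  // f is the binary functor from the pow test (not shown here).
  stan::test::expect_ad_vectorized_binary(f, in1, in2);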

To test whether this was an issue with the framework, I put together a simple test with var and fvar for this combination of variables (code at end), and all tests passed. Based on this, I would assume that the issue is actually in the finite-differencing not having a great resolution around zero (or a great resolution for arguments that result in near-zero). There is likely some math implementation difference between clang++ and g++ that gives slightly better resolution in this case (did not dig down that far though).
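To illustrate the finite-difference point, here is a minimal standalone sketch. It is not Stan's test framework: central_diff and near_rel are hypothetical helpers, and near_rel's fall-back to an absolute comparison when either value is exactly zero is an assumption modelled on the log output above. It compares a central-difference estimate of the pow partials at (5.2, 6.7) with the analytic derivatives.

```cpp
#include <cmath>
#include <functional>

// Hypothetical helper: central finite difference of f at x with step h.
// The estimate carries truncation error (order h^2) plus floating-point
// rounding error (order eps * |f(x)| / h), so it is never exact.
double central_diff(const std::function<double(double)>& f, double x,
                    double h) {
  return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Analytic partial derivatives of pow(x, y), valid for x > 0.
double pow_dx(double x, double y) { return y * std::pow(x, y - 1.0); }
double pow_dy(double x, double y) { return std::pow(x, y) * std::log(x); }

// Hypothetical comparison, assumed for illustration only: relative
// difference in general, absolute difference when either value is
// exactly zero (a relative difference is meaningless against zero).
bool near_rel(double x1, double x2, double tol) {
  if (x1 == 0.0 || x2 == 0.0) {
    return std::fabs(x1 - x2) <= tol;
  }
  double avg = 0.5 * (std::fabs(x1) + std::fabs(x2));
  return std::fabs(x1 - x2) / avg <= tol;
}
```

For example, central_diff([](double u) { return std::pow(u, 6.7); }, 5.2, 1e-6) agrees with pow_dx(5.2, 6.7) in a relative sense, because the derivative is large. But near_rel(3.8049220956170699e-07, 0.0, 1e-8) is false: once one side is exactly zero, only the absolute tolerance applies, and finite-difference noise of order 1e-7 exceeds it.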

Code for testing values/adjoints:

#include <stan/math/fwd.hpp>
#include <stan/math.hpp>
#include <gtest/gtest.h>
#include <test/unit/util.hpp>

TEST(MathMatrixRevMat, EltPow) {
  using stan::math::vector_v;
  using stan::math::var;

  vector_v l1(1);
  l1 << 5.2;
  vector_v l2(1);
  l2 << 6.7;

  var com_l1 = l1[0];
  var com_l2 = l2[0];

  vector_v out = stan::math::pow(l1,l2);
  stan::math::var com_out = stan::math::pow(com_l1,com_l2);

  EXPECT_FLOAT_EQ(out[0].val(), com_out.val());

  com_out.grad();
  out[0].grad();

  EXPECT_FLOAT_EQ(l1[0].adj(), com_l1.adj());
  EXPECT_FLOAT_EQ(l2[0].adj(), com_l2.adj());
}

TEST(AgradFwdMatrix, EltPow) {
  using stan::math::vector_fd;

  vector_fd l1(1);
  l1 << 5.2;
  vector_fd l2(1);
  l2 << 6.7;

  stan::math::fvar<double> com_l1 = l1[0];
  stan::math::fvar<double> com_l2 = l2[0];

  vector_fd out = stan::math::pow(l1, l2);
  stan::math::fvar<double> com_out = stan::math::pow(com_l1, com_l2);

  EXPECT_FLOAT_EQ(out[0].val(), com_out.val());
  EXPECT_FLOAT_EQ(out[0].d_, com_out.d_);
}

andrjohns commented 4 years ago

Given this, I'll just open a PR to change these test values to something that the finite diffs can handle a bit better

SteveBronder commented 4 years ago

Word, much appreciated! I forgot about this; as I was about to fall asleep tonight I went, "omg does exponential still not work???". Lol, glad to see it was just a testing thing and not an actual scary thing.

andrjohns commented 4 years ago

Ha yeah I had the panic thought that the vectorisation framework had been broken all along, was relieved to find it wasn't the case

stan-dev / math

pow tests fail with -march=native -mtune=native flags and g++ #2086
