arpg / HAL

Hardware Abstraction Layer library.
Other
36 stars 24 forks source link

Calibu dependency problem on Ubuntu #130

Closed lucieb closed 8 years ago

lucieb commented 8 years ago

I am setting up a vicalib installation on Ubuntu 16.04 using gcc 5.3.1.

I keep running into this error whenever I try to build SensorViewer from arpg_apps

//usr/local/lib/libhal.so: undefined reference to `calibu::LinearCamera::NumParams'

The call is the following, I don't see anything wrong with it:

/usr/bin/c++ -std=c++11 -Wall -Wextra CMakeFiles/SensorViewer.dir/main.cpp.o -o SensorViewer -rdynamic -lGLU -lGL -lGLEW -lglut -lpng -lz -ljpeg -ltiff -lpangolin -ltinyxml2 -lopencv_core -lopencv_calib3d -lcalibu -ldc1394 -lraw1394 /usr/lib/x86_64-linux-gnu/libopencv_core.so.2.4.9 /usr/lib/x86_64-linux-gnu/libopencv_imgproc.so.2.4.9 -lusb-1.0 /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9 -lprotobuf -lpthread /usr/local/lib/libglog.so -lhal -ltinyxml2 -lopencv_core -lopencv_calib3d -lcalibu -lprotobuf -ldl -ldc1394 -lraw1394 /usr/lib/x86_64-linux-gnu/libopencv_core.so.2.4.9 /usr/lib/x86_64-linux-gnu/libopencv_imgproc.so.2.4.9 -lusb-1.0 /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4.9 -lpthread /usr/local/lib/libglog.so -lhal -ldl -Wl,-rpath,/usr/local/lib

Thanks

crheckman commented 8 years ago

I am guessing there are other undefined references, which would mean the calibu library is not being found, or if this is the only one error, then maybe the code needs to be pulled again? Are you on master branches of all the code?

https://github.com/arpg/Documentation/tree/master/Installation

Algomorph commented 8 years ago

Sure, but I was on Ubuntu 15.10 when I last did it. I've also made the rational 6-coefficient model contribution. What's up?

On 06/14/2016 05:14 PM, Christoffer Heckman wrote:

Have you build arpg/Calibu ?

https://github.com/arpg/Documentation/tree/master/Installation

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/arpg/HAL/issues/130#issuecomment-226018401, or mute the thread https://github.com/notifications/unsubscribe/ABAH7jie5tqKQZ2XqbStwA6aGVyI7Vkrks5qLxm8gaJpZM4IuB-X.

lucieb commented 8 years ago

I am on the master branches using the latest versions of all dependencies. Calibu, is installed in /usr/local/lib/libcalibu.so. The vicalib cmake configuration outputs:

-- Found required Ceres dependency: Eigen version 3.2.92 in /usr/include/eigen3 -- Found required Ceres dependency: Glog in /usr/local/include -- Ceres version 1.12.0 detected here: /usr/local was built with C++11. Ceres target will add C++11 flags to compile options for targets using it. -- Found Ceres version: 1.12.0 installed in: /usr/local with components: [EigenSparse, SparseLinearAlgebraLibrary, LAPACK, SuiteSparse, SchurSpecializations, C++11, OpenMP] -- Configuring done -- Generating done -- Build files have been written to: /home/perclab/Documents/Calibration/vicalib/build

and the Calibu_DIR variable is set to /usr/local/lib/cmake/Calibu.

Algomorph commented 8 years ago

I figured out the problem. This is due to an elusive bug introduced by yours truly. Here is the discussion of the general problem on SO. I had assumed that I can use the static constexpr in HAL the same way as it's used in vicalib. I was wrong. I guess GCC behavior changed a bit in version 5, but it's still performing to spec. I also guess the reason it hasn't come up until now is that other compilers do define these constants at compile-time and store them somewhere in the HAL binary.

The "hacky" solution is to simply change line 27 in HAL/Camera/Drivers/UndistortDriver.cpp from Eigen::VectorXd params_(calibu::LinearCamera<double>::NumParams); to Eigen::VectorXd params_(4);, recompile & reinstall HAL, then recompile vicalib & everything else using HAL. I have no idea how to fix this otherwise. A bit of a design issue with Calibu...

lucieb commented 8 years ago

It did the trick. Thanks.

Could you tell me which version of GCC and which version of Eigen you are using? I am running into standard declarations issues in Sophu when trying to re-compile vicalib.

crheckman commented 8 years ago

I will try and track down a more general solution than hard coding the number of parameters. Thanks for commenting, Greg!

Lucie, could you paste your errors?

On Tue, Jun 14, 2016, 8:50 PM Lucie Bélanger notifications@github.com wrote:

It did the trick. Thanks.

Could you tell me which version of GCC and which version of Eigen you are using? I am running into standard declarations issues in Sophu when trying to re-compile vicalib.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/arpg/HAL/issues/130#issuecomment-226074693, or mute the thread https://github.com/notifications/unsubscribe/AIl6Bc5ugyZBYvXynUa4SdvxEtWoWEthks5qL2hmgaJpZM4IuB-X .

lucieb commented 8 years ago

Attached is the whole compile output log. error_log_2016-06-14.txt

Algomorph commented 8 years ago

@lucieb, I'm on GCC 5.4.1 and eigen 3.3 beta 1, which are standard on most current Ubuntu.

lucieb commented 8 years ago

I made sure to use eigen 3.3-beta1 but I cannot use gcc 5.4.1 as the latest version on Ubuntu 16.04 (LTS) is gcc 5.3.1. I get the same errors as before.

Algomorph commented 8 years ago

@lucieb , I just tried it with GCC 5.3.1 and was unable to reproduce the issue. I'm on latest Sophus commit to date. Try make clean, deleting CMake cache from cmake-gui, re-running it, and making sure you have /usr/bin/c++ as your CMAKE_CXX_COMPILER, also in the cmake-gui, then build again. That's the only thing I can think of right now.

Algomorph commented 8 years ago

@lucieb , actually, what's your ceres version? I'm on 1.10.0, which is somewhat older than the latest.

lucieb commented 8 years ago

I reverted ceres-solver to version 1.10.0. I re-tested with eigen 3.3 beta 1 and the latest stable release 3.2.8. I still get the same errors:

~/Documents/Calibration/Sophus/sophus/so3.hpp:362:29: error: no matching function for call to ‘cos(const Scalar&)’
       real_factor = std::cos(half_theta);
                             ^
In file included from /usr/include/features.h:367:0,
                 from /usr/include/time.h:27,
                 from ~/Documents/Calibration/vicalib/src/vicalib-engine.cc:4:
/usr/include/x86_64-linux-gnu/bits/mathcalls.h:63:1: note: candidate: double cos(double)
 __MATHCALL_VEC (cos,, (_Mdouble_ __x));
~/Documents/Calibration/vicalib/src/vicalib-engine.cc:581:1:   required from here
/usr/include/c++/5/cmath:215:5: error: no type named ‘__type’ in ‘struct __gnu_cxx::__enable_if<false, double>’
In file included from /usr/local/include/eigen3/Eigen/Core:28:0,
                 from /usr/local/include/calibu/utils/StreamOperatorsEigen.h:5,
                 from /usr/local/include/calibu/cam/camera_xml.h:27,
                 from ~/Documents/Calibration/vicalib/src/vicalib-engine.cc:11:
~/Documents/Calibration/Sophus/sophus/so3.hpp: In instantiation of ‘static const Sophus::SO3Group<typename Eigen::internal::traits<T>::Scalar> Sophus::SO3GroupBase<Derived>::expAndTheta(const Tangent&, Sophus::SO3GroupBase<Derived>::Scalar*) [with Derived = Sophus::SO3Group<ceres::Jet<double, 35>, 0>; typename Eigen::internal::traits<T>::Scalar = ceres::Jet<double, 35>; Sophus::SO3GroupBase<Derived>::Tangent = Eigen::Matrix<ceres::Jet<double, 35>, 3, 1, 0, 3, 1>; Sophus::SO3GroupBase<Derived>::Scalar = ceres::Jet<double, 35>]’:
~/Documents/Calibration/Sophus/sophus/so3.hpp:330:23:   required from ‘static const Sophus::SO3Group<typename Eigen::internal::traits<T>::Scalar> Sophus::SO3GroupBase<Derived>::exp(const Tangent&) [with Derived = Sophus::SO3Group<ceres::Jet<double, 35>, 0>; typename Eigen::internal::traits<T>::Scalar = ceres::Jet<double, 35>; Sophus::SO3GroupBase<Derived>::Tangent = Eigen::Matrix<ceres::Jet<double, 35>, 3, 1, 0, 3, 1>]’
mcguire-steve commented 8 years ago

The Scalar type being used is a Jet to support auto-differentiation. The impact is that the stdc++ library does not know about Jets, and therefore the call to std::cos() fails type checking. Can you try removing the std:: namespace specifier? Ceres provides replacements for the standard math functions that are Jet-aware. Same thing for abs().

lucieb commented 8 years ago

Yes, removing the std:: namespace on sqrt, abs, sin and cos solved the problem.

I had to fix a typo in the Pangolin CMakeLists.txt files. It's using find_package(CVARS 2.3 QUIET) instead of find_package(CVars 2.3 QUIET).

Thank you for your help.

crheckman commented 8 years ago

Please submit a PR for your changes if you could.

On Tue, Jun 21, 2016, 5:39 PM Lucie Bélanger notifications@github.com wrote:

Yes, removing the std:: namespace on sqrt, abs, sin and cos solved the problem.

I had to fix a typo in the Pangolin CMakeLists.txt files. It's using find_package(CVARS 2.3 QUIET) instead of find_package(CVars 2.3 QUIET).

Thank you for your help.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/arpg/HAL/issues/130#issuecomment-227604131, or mute the thread https://github.com/notifications/unsubscribe/AIl6BUA4e0KP-J6gyuzo_4yywJuOxd8dks5qOHY2gaJpZM4IuB-X .

crheckman commented 8 years ago

@lucieb I don't think removing std:: from abs will work in general since the compiler may choose (and on my system, did choose) to use the integer abs rather than the floating point abs.

Sophus should now be compatible with jet-type autodiff. Pangolin's CMakeLists.txt file has been updated to address these issues. The only outstanding item is resolving GCC's behavior when it handles the static constexpr which defines how many params are used in the LinearCamera model.

JStech commented 8 years ago

Isn't that what fabs is for?

On Tue, Jul 26, 2016 at 7:51 AM Christoffer Heckman < notifications@github.com> wrote:

@lucieb https://github.com/lucieb I don't think removing std:: from abs will work in general since the compiler may choose (and on my system, did choose) to use the integer abs rather than the floating point abs.

Sophus should now be compatible with jet-type autodiff. Pangolin's CMakeLists.txt file has been updated to address these issues. The only outstanding item is resolving GCC's behavior when it handles the static constexpr which defines how many params are used in the LinearCamera model.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/arpg/HAL/issues/130#issuecomment-235292622, or mute the thread https://github.com/notifications/unsubscribe-auth/AAfsZMGnM7KrhPyceRxT2kYOeGZB9Hn3ks5qZh8CgaJpZM4IuB-X .

crheckman commented 8 years ago

"It's a mess." http://stackoverflow.com/questions/21392627/abs-vs-stdabs-what-does-the-reference-say

crheckman commented 8 years ago

I want to kill this bug and close the topic, which has already touched on a couple of issues.

@mcguire-steve when you get the chance could you please try building arpg_apps and checking out the bug that @Algomorph suggested a solution to on June 14? It seems the problem is gcc-specific. I would prefer trying to use the NumParams provided with LinearCamera and I am surprised that it is not available at compile time; I would expect the templates to be translated into binary in the HAL library when the methods are invoked in UndistortDriver.cpp.

mcguire-steve commented 8 years ago

Investigation reveals that (at least on my setup) GCC embeds the LinearCamera::NumParams as an immediate value and does not allocate memory for the const, which means that a symbol will not be defined for the value.

I added the following snippet within the UndistortDriver constructor as a marker to find the right place in the ~32k lines of assembly that comprise the UndistortDriver:

Eigen::Vector2i size_;
printf("Num params: %d\n", calibu::LinearCamera<double>::NumParams);   
Eigen::VectorXd params_(calibu::LinearCamera<double>::NumParams);

Stopping the compilation at assembly (on x86_64) yields:

.LEHB8:
    call    _ZN5Eigen6MatrixIiLi2ELi1ELi0ELi2ELi1EEC1Ev@PLT
    movl    $4, %esi
    leaq    .LC5(%rip), %rdi
    movl    $0, %eax
    call    printf@PLT
    leaq    -720(%rbp), %rax
    movl    $4, %esi
    movq    %rax, %rdi
    call    _ZN5Eigen6MatrixIdLin1ELi1ELi0ELin1ELi1EEC1El@PLT

The LC5 constant is the string 'Num Params%d\n' to be fed into the printf call. The thing to note is the movl $4, %esi instructions: the immediate value $4 corresponds to the constexpr LinearCamera<double>::NumParams in both cases.

This makes sense, since the constant is only ever used in a value form. However, if the constant were to be passed into a function that expected a reference, one would get the undefined symbol error as a result. I was able to force this error given the following minimum viable example:

#include <stdio.h>

class Oven
{
public:
  static int static_ovenType;
  Oven()
  {
    static_ovenType = 0;
  }
};

class CoffeeRoaster : public Oven
{
public:
  static constexpr int ovenType = 0xc0ffee;
};

class MeatRoaster : public Oven
{
public:
  static constexpr double ovenType = 0.1;
};

int changeParams(int *input)
{
  int myVal = 3;
  return myVal;
}

int main(int argc, char** argv)
{
  //Oven theOven;
  //printf("Oven type: 0x%x\n", theOven.static_ovenType);

  //CoffeeRoaster coffeeOven;
  printf("Coffee Oven type: 0x%x\n", CoffeeRoaster::ovenType);

  //printf("Address of coffee: 0x%x\n", &CoffeeRoaster::ovenType);

  //MeatRoaster meatOven;
  printf("Meat Oven type: %f\n", MeatRoaster::ovenType);

  //Check the reference thing

  printf("My function: %d\n", changeParams((int*)&CoffeeRoaster::ovenType));
  return 0;

}

Note that this result is compiler dependent - using gcc vs clang on a toy problem yields different results, particularly when the constexpr is a float: gcc uses the movabsq to embed an 8-byte immediate (i.e. a double) into the instruction, rather than allocate memory, fill memory, and use some form of indirect addressing to do the same thing (clang does this).

All of this being said, I haven't been able to replicate the error in SensorViewer using the Undistort driver that uses LinearCamera::NumParams instead of a deliberate constant. This issue exposes a failure mode in that the error is produced by not preserving the ability of the compiler to represent NumParams as an immediate value in code downstream of the Calibu library.