RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

MATLAB R2015a crash while running quadrotor example using drake-distro compiled from sources #1142

Closed edowson closed 9 years ago

edowson commented 9 years ago

Hi, I'm facing an issue where MATLAB R2015a inexplicably crashes while attempting to run the Drake quadrotor example, using the latest version of drake-distro (rigidbody branch) compiled from source.

The current head commit is:

commit 26697d890939dc009fb1953c6e53f4e470845029
Author: Russ Tedrake
Date: Wed Jun 24 06:40:29 2015 -0400

    makefile works for non-cygwin windows

My platform configuration is Mac OS X 10.10.4, Xcode 6.4, MATLAB R2015a, MacPorts 2.3.3, with all required dependencies installed.

If I try to run the same example using the precompiled binary package (drake-distro-0.9.8 for Mac) with MATLAB R2015a, it works as expected.

I have attached the steps that I follow for the installation below:

cd /project/robotics/library/
git clone https://github.com/RobotLocomotion/drake-distro.git --recursive -b rigidbody

Install required dependencies:

$ sudo port install \
    xorg \
    xorg-libX11 +quartz \
    fontconfig \
    xft2 \
    libiconv \
    gmp \
    gcc48 +gfortran \
    tcl \
    tk \
    graphviz +tcl \
    iverilog \
    gtkwave

Set locale.

Export the following variables to set the locale:

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

Update the symlinks to use the GCC compiler installed by MacPorts

Type the following command to configure the system to use the MacPorts GCC compiler suite:

$ sudo port select --set gcc mp-gcc48

Check the binaries pointed to by the c++ and g++ commands:

$ which c++
/opt/local/bin/c++

$ which g++
/opt/local/bin/g++

By default, this doesn't create a symlink for /opt/local/bin/cc, so you'll have to create one manually to point the C compiler at gcc:

$ cd /opt/local/bin
$ sudo ln -s /opt/local/bin/gcc-mp-4.8 /opt/local/bin/cc

Check the binary pointed to by the cc command:

$ which cc
/opt/local/bin/cc

$ sudo port install wget gsed gtk2 freeglut

Note 01: You might get the following error:

Package glu was not found in the pkg-config search path.
Perhaps you should add the directory containing `glu.pc'
to the PKG_CONFIG_PATH environment variable
Package 'glu', required by 'bot2-vis', not found
CMake Error at cmake/pods.cmake:300 (string):
  string sub-command STRIP requires two arguments.
Call Stack (most recent call first):
  src/wavefront-viewer/CMakeLists.txt:7 (pods_use_pkg_config_packages)

This is because freeglut needs to be installed as a dependency for bot2-vis.

Note 02: You might get the following error:

BUILD_PREFIX: /project/robotics/library/drake-distro/build

The Fortran compiler identification is unknown
CMake Error at CMakeLists.txt:4 (enable_language):
  No CMAKE_Fortran_COMPILER could be found.

Tell CMake where to find the compiler by setting either the environment variable "FC" or the CMake cache entry CMAKE_Fortran_COMPILER to the full path to the compiler, or to the compiler name if it is in the PATH.

This is because, even after installing gcc48 using MacPorts, you still need to configure the system to use the MacPorts GCC suite.
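One way to act on the CMake hint is to set FC explicitly before running make. The sketch below is only an illustration: `find_fortran` is a hypothetical helper (not part of Drake or MacPorts), and the candidate names `gfortran-mp-4.8` and `gfortran` are assumptions about what the gcc48 port puts on the PATH.

```shell
# find_fortran: hypothetical helper, not part of Drake or MacPorts.
# Prints the full path of the first candidate compiler found on PATH,
# or returns non-zero if none of the candidates exists.
find_fortran() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate names are assumptions; adjust them to your installation.
if fc_path=$(find_fortran gfortran-mp-4.8 gfortran); then
    export FC="$fc_path"
    echo "Using Fortran compiler: $FC"
else
    echo "No Fortran compiler found on PATH" >&2
fi
```

Exporting FC in the same shell that later runs make lets CMake pick it up, as the error message suggests.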

Download and install Java for OS X 2014-001 for Mac OS X https://support.apple.com/kb/DL1572?locale=en_US

Install pre-requisites

cd /project/robotics/library/drake-distro

sudo ./install_prereqs.sh macports

This will install gtk2 on Mac OS X 10.10.4, which will bring in a lot of packages and associated dependencies.

Build Drake and all dependent libraries:

make

Regards,

Elvis Dowson

edowson commented 9 years ago

I made the following changes and was able to get an X11 Drake Viewer pop-up. MATLAB R2015a didn't crash this time, but I couldn't see a quadrotor anywhere in the scene.

  1. For MATLAB R2015a, modify the /bin/mexopts.sh script, and change all MAC_OS_X deployment targets to 10.10.
  2. Modify the /bin/maci64/java.opts file to make the JVM default to IPv4 and to increase the heap space:

Set JVM to use IPV4 stack, instead of IPV6.

-Djava.net.preferIPv4Stack=true

Set the JVM heap size.

-Xms256m -Xmx1024m
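The java.opts edits above can be applied idempotently, so that repeating the setup does not duplicate lines. This is a minimal sketch: `add_java_opt` is a hypothetical helper, and the file path is passed in by the caller because the MATLAB install prefix is not given above.

```shell
# add_java_opt FILE OPTION
# Hypothetical helper: append OPTION to FILE as its own line unless an
# identical line is already present. Creates FILE if it does not exist.
add_java_opt() {
    file=$1
    opt=$2
    touch "$file" || return 1
    # -x: match the whole line, -F: fixed string,
    # -e: needed because the option itself starts with '-'
    if ! grep -qxF -e "$opt" "$file"; then
        printf '%s\n' "$opt" >> "$file"
    fi
}
```

Usage might look like `add_java_opt path/to/java.opts -Djava.net.preferIPv4Stack=true`, followed by the same call for `-Xms256m` and `-Xmx1024m`, which writes one option per line.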

peteflorence commented 9 years ago

Try running runDircol twice in a row. Can you see a quadrotor now?

edowson commented 9 years ago

I tried running runDircol twice, but each time the Drake Viewer briefly opened and closed. I ended up with a MATLAB window with sliders and buttons to control the animation, but no viewer.

peteflorence commented 9 years ago

Okay. If you search old issues in this GitHub repo, there is another issue where I posted instructions on JVM-related issues for Yosemite. It looks like you already modified java.opts, but that issue report may still be helpful. I can look into it more later.

edowson commented 9 years ago

There are a bunch of required pods missing from the rigidbody branch, e.g. snopt. Would this be relevant somehow?

Additionally, if I try to run the runMixedIntegerSimpleForest.m script, it complains with: Cannot find required dependency: lcmgl

peteflorence commented 9 years ago

What does matlab output after you run runDircol?

Is there an error message displayed in Matlab?

edowson commented 9 years ago

No, no errors in the MATLAB console. It just says launching drake_viewer, and outputs 4x1 PPTrajectory array with properties. A tool window with the title "Fig 89" remains open however.

psiorx commented 9 years ago

This is a long shot but did you try running

close all

before doing these?

I've fixed issues with figures not appearing or behaving strangely with that command.

edowson commented 9 years ago

I tried each test after restarting MATLAB.

RussTedrake commented 9 years ago

just read this. it sounds like your libbot viewer is crashing (closing), but matlab is fine. is that correct? (if so, I don't think it's the matlab side that's the problem)

try launching the drake_viewer from a terminal manually. e.g. with

linux-terminal% drake-distro/build/bin/drake_viewer 

then when it crashes, you'll be able to see any debug spew on the terminal window.

if that's still mysterious, then you can build the debug symbols and run the viewer from the terminal w/ e.g. gdb. it's a very simple app and should be easy to debug.
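The gdb step can be scripted so that the backtrace prints as soon as the viewer stops. This is a sketch, assuming gdb is on the PATH and the viewer sits at the path used above; `viewer_backtrace` and the `DEBUGGER` variable are hypothetical names introduced here.

```shell
# viewer_backtrace [VIEWER_PATH]
# Hypothetical helper: run the viewer under gdb in batch mode and print a
# full backtrace once the program stops (for example on a SIGSEGV).
# DEBUGGER can be overridden if gdb is installed under another name.
viewer_backtrace() {
    viewer=${1:-drake-distro/build/bin/drake_viewer}
    "${DEBUGGER:-gdb}" -batch -ex run -ex "backtrace full" --args "$viewer"
}
```

Call `viewer_backtrace` in one terminal, then rerun the example from MATLAB. If the viewer exits cleanly instead of crashing, gdb has no stack to print.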

edowson commented 9 years ago

drake_viewer appears to run normally and does not crash when I launch it separately from the command line.

./drake_viewer
BOT_VIEWER_PRETTIER: 0
Using X Visual 0xf6

[Screenshot: 201507071520-drake-viewer]

RussTedrake commented 9 years ago

sorry. after running drake_viewer manually, you should run the quadrotor example again in matlab. it will try to display in this viewer, and presumably crash it.

edowson commented 9 years ago

ok, drake_viewer crashed when I tried to run the quadrotor example from Matlab (runDircol.m).

How do I set up just the drake pod to build with debug symbols? I'm familiar with plain cmake and catkin, but not the pod way of doing things. I'd like to invoke cmake -DCMAKE_BUILD_TYPE=Debug to specify a debug build, but I'm not sure which folder (/drake-distro/build) to invoke it from.

edowson commented 9 years ago

ok, found the command from the makefile:

make BUILD_TYPE=Debug

edowson commented 9 years ago

This is the initial error that I get while running it with gdb:

Using X Visual 0xf6
[New Thread 0x172b of process 545]
loading new robot with 2 links

Program received signal SIGSEGV, Segmentation fault.
0x0000000101369240 in std::string::_Rep::_M_grab(std::allocator const&, std::allocator const&) () from /opt/local/lib/libgcc/libstdc++.6.dylib
(gdb) s
Single stepping until exit from function _ZNSs4_Rep7_MgrabERKSaIcES2, which has no line number information.

RussTedrake commented 9 years ago

In the drake folder, run

make clean

(just for good measure) then

make BUILD_TYPE=Debug

RussTedrake commented 9 years ago

Great. The viewer code is super simple. If you can give me a backtrace, we can fix

edowson commented 9 years ago

Program received signal SIGSEGV, Segmentation fault.
0x0000000101369240 in std::string::_Rep::_M_grab(std::allocator const&, std::allocator const&) () from /opt/local/lib/libgcc/libstdc++.6.dylib
(gdb) backtrace full
#0  0x0000000101369240 in std::string::_Rep::_M_grab(std::allocator const&, std::allocator const&) ()
   from /opt/local/lib/libgcc/libstdc++.6.dylib
No symbol table info available.
#1  0x0000000101369273 in std::string::_Rep::_M_grab(std::allocator const&, std::allocator const&) ()
   from /opt/local/lib/libgcc/libstdc++.6.dylib
No symbol table info available.
#2  0x000000010115e7cb in Mesh::Mesh(std::string, int, float_) ()
   from /project/robotics/library/drake-distro/build/lib/libdrake_urdf_renderer.dylib
No symbol table info available.
#3  0x000000010115ef64 in LinkGeometry::LinkGeometry(_drake_lcmt_viewer_geometrydata const) ()
   from /project/robotics/library/drake-distro/build/lib/libdrake_urdf_renderer.dylib
No symbol table info available.
#4  0x000000010115f457 in Link::Link(_drake_lcmt_viewer_linkdata const) ()
   from /project/robotics/library/drake-distro/build/lib/libdrake_urdf_renderer.dylib
No symbol table info available.
#5  0x000000010115d602 in handle_lcm_viewer_load_robot(_lcm_recv_buft const, char const_, _drake_lcmt_viewer_loadrobot const, void*) ()
   from /project/robotics/library/drake-distro/build/lib/libdrake_urdf_renderer.dylib
No symbol table info available.
#6  0x00000001011a120b in drake_lcmt_viewer_load_robot_handler_stub ()
   from /project/robotics/library/drake-distro/build/lib/libdrake_lcmtypes.dylib
No symbol table info available.
#7  0x0000000100bf5f01 in lcm_dispatch_handlers ()
   from /project/robotics/library/drake-distro/build/lib/liblcm.1.dylib
No symbol table info available.
#8  0x0000000100bf8266 in lcm_udpm_handle ()
   from /project/robotics/library/drake-distro/build/lib/liblcm.1.dylib
No symbol table info available.
#9  0x0000000100bf554f in lcm_handle ()
   from /project/robotics/library/drake-distro/build/lib/liblcm.1.dylib
No symbol table info available.
#10 0x0000000100b24aba in lcm_message_ready ()
   from /project/robotics/library/drake-distro/build/lib/libbot2-core.1.dylib
No symbol table info available.
#11 0x0000000100c49728 in g_main_context_dispatch ()
   from /opt/local/lib/libglib-2.0.0.dylib
No symbol table info available.
#12 0x0000000100c49a0b in g_main_context_iterate ()
   from /opt/local/lib/libglib-2.0.0.dylib
No symbol table info available.
#13 0x0000000100c49c55 in g_main_loop_run ()
   from /opt/local/lib/libglib-2.0.0.dylib
No symbol table info available.
#14 0x00000001001ffddd in gtk_main ()
   from /opt/local/lib/libgtk-x11-2.0.0.dylib
No symbol table info available.
#15 0x000000010000cc23 in main (argc=1, argv=0x7fff5fbff688)
    at /project/robotics/library/drake-distro/drake/systems/plants/viewer/main.cpp:57
        viewer = 0x105009090
        eye = {0, -4, 2}
        lookat = {0, 0, 0}
        fname = 0x10000cf12 ".viewer-prefs"
        viewer_title = {static npos = <optimized out>, _M_dataplus = {<allocator> = {<new_allocator> = {}, }, _M_p = 0x10360c728 "Drake Viewer"}}
        lcm = 0x104801fd0
        pwidget = 0x104076eb0
        up = {0, 0, 1}

RussTedrake commented 9 years ago

looks like this is probably related to the recent conversion to spruce in the mesh path handling. @psiorx -- can you take a look.

psiorx commented 9 years ago

Looking into this. It does look like it could be related to the spruce conversion but I can't reproduce it on my end.

Could you make sure that this file exists?

/project/robotics/library/drake-distro/drake/examples/Quadrotor/quadrotor_base.obj

The code in question is lines 100-110 in drake_urdf_renderer.cpp

On first inspection, it doesn't appear to be doing anything dangerous. It might have something to do with how spruce behaves on Mac OS.

I'll keep poking around.

edowson commented 9 years ago

I can confirm that the quadrotor_base.obj file exists. I even tried running it again, after copying over the obj file from the drake precompiled 0.9.8 release (which runs fine by the way), but the source build still crashes. So I guess the obj file is ok.

psiorx commented 9 years ago

I just pushed a potential fix for this. Can you please try this out to see if it resolves your issue?

If it's still occurring, it contains some useful printouts that should help us narrow the problem down further.

https://github.com/RobotLocomotion/drake/pull/1145

edowson commented 9 years ago

Could you push the changes to the rigidbody branch, please? As I understand, if I work on the master branch, I won't be able to access some libraries that are only available to the robotics locomotion group.

psiorx commented 9 years ago

You should only need to apply this change to the drake submodule in whichever drake-distro version you are using (rigidbody in your case). The following should do it:

cd drake
git remote add psiorx git@github.com:psiorx/drake.git
git fetch psiorx
git checkout jc-debug-spruce
git submodule update
make -j

edowson commented 9 years ago

I got the following error:

Permission denied (publickey).
fatal: Could not read from remote repository.

I'll try to patch the file manually.
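The `Permission denied (publickey)` error comes from the SSH-style remote URL, which requires an SSH key registered with GitHub. Fetching over HTTPS avoids that for a public repository. This is a minimal sketch, where `to_https_url` is a hypothetical helper that only handles the `git@host:owner/repo.git` form:

```shell
# to_https_url URL
# Hypothetical helper: rewrite an scp-style SSH remote
# (git@host:owner/repo.git) as an HTTPS URL.
# Inputs that do not match that form pass through unchanged.
to_https_url() {
    printf '%s\n' "$1" | sed -E 's#^git@([^:]+):(.+)$#https://\1/\2#'
}

# prints https://github.com/psiorx/drake.git
to_https_url git@github.com:psiorx/drake.git
```

With the remote already added, `git remote set-url psiorx https://github.com/psiorx/drake.git` followed by `git fetch psiorx` should then work without a key, assuming the fork is publicly readable.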

edowson commented 9 years ago

addpath_pods
addpath_drake
runDircol

Cannot find required pod snopt
Error using PPTrajectory
The specified superclass 'Trajectory' contains a parse error, cannot be found on MATLAB's search path, or is shadowed by another file with the same name.
Error in runDircol (line 29)
traj_init.x = PPTrajectory(foh([0,tf0],[double(x0),double(xf)]));

psiorx commented 9 years ago

That error seems unrelated to the visualizer issue above. Did the patch fix it? runLQR should be more forgiving in terms of dependencies since it doesn't require snopt.

edowson commented 9 years ago

Ok, the patch appears to have fixed the issue. I can now see a quadrotor in drake viewer after running the runLQR command.

However, runLQR also crashes with the following error:

addpath_pods
Adding /project/robotics/library/drake-distro/build/matlab to the matlab path
addpath_drake
runDircol
Cannot find required pod snopt
Error using PPTrajectory
The specified superclass 'Trajectory' contains a parse error, cannot be found on MATLAB's search path, or is shadowed by another file with the same name.
Error in runDircol (line 29)
traj_init.x = PPTrajectory(foh([0,tf0],[double(x0),double(xf)]));

A few new patches came in with my recent pull from the drake-distro repository, and they have caused this error.

psiorx commented 9 years ago

It looks like the paths are not getting set up correctly.

I would try the following:

  1. Restart matlab in the drake-distro folder
  2. addpath_pods
  3. cd drake
  4. addpath_drake
  5. cd examples/Quadrotor
  6. runLQR

edowson commented 9 years ago

I tried that but I still get the same error:

Cannot find required pod snopt
Error using PPTrajectory
The specified superclass 'Trajectory' contains a parse error, cannot be found on MATLAB's search path, or is shadowed by another file with the same name.
Error in DynamicalSystem/simulate/makeSubTrajectory (line 163)
traj = PPTrajectory(spline(t,y));
Error in DynamicalSystem/simulate (line 209)
ypptraj = {ypptraj{:},makeSubTrajectory(t([max(inds(1)-1,1),inds(2:end)]),y(:,inds))}; % use time of zc instead of 1e-10 past it.
Error in DrakeSystem/simulate (line 489)
[varargout{:}] = simulate@DynamicalSystem(obj,varargin{:});
Error in runLQR (line 26)
xtraj = simulate(sys,[0 4],double(x0)+[.5*randn(6,1);zeros(6,1)]);

psiorx commented 9 years ago

Ah, I see. It looks like snopt is required even for the runLQR example. That particular pod is not included in the rigidbody distro.

edowson commented 9 years ago

what do I do now?

psiorx commented 9 years ago

You can try installing the student version of snopt. It's free but has a limit on the size of the problems you can solve with it.

http://ccom.ucsd.edu/~optimizers/downloads.php

I believe there's also a way to configure NonlinearProgram to use MATLAB's 'fmincon' function instead of snopt. Take a look at the setSolver method.

RussTedrake commented 9 years ago

snopt should not be required. it’s getting called from NonlinearProgram which should fall back to fmincon if you have that.

RussTedrake commented 9 years ago

and this error is not an error with snopt. the optimization ran, but something is wrong with PPTrajectory. My guess is that there is a merge conflict or something corrupting that file?
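A quick way to test the merge-conflict guess is to scan the relevant MATLAB sources for leftover conflict markers. This is a sketch; `has_conflict_markers` is a hypothetical helper, and which files to check (for example the files defining PPTrajectory and Trajectory) is left to the caller.

```shell
# has_conflict_markers FILE
# Hypothetical helper: succeed, printing the offending lines with their line
# numbers, if FILE contains git conflict markers at the start of a line.
# Returns non-zero when no markers are found.
has_conflict_markers() {
    grep -nE '^(<<<<<<< |=======$|>>>>>>> )' "$1"
}
```

Run it against the file that defines each class. From inside MATLAB, `which -all Trajectory` additionally shows whether another file on the path shadows the class, which is the other cause named in the error message.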
