ARCC-RACE / deepracer-for-dummies

A quick way to get up and running with a local DeepRacer training environment

Gazebo shutting down at start #56

Open JustinGuese opened 4 years ago

JustinGuese commented 4 years ago

Hey guys, first things first: great work! Second, my Gazebo VNC screen crashes shortly after the start. The car seems to start at the corner of the track (off the street), drives for 1 second, and disappears, and then Gazebo crashes. I tried reinstalling the NVIDIA CUDA drivers, and this is a fresh setup of Ubuntu 18.04. Have you ever seen a similar error?

Attached are the logs of the container:

python3 -m markov.rollout_worker
/app/robomaker-deepracer/simulation_ws/install/deepracer_simulation/lib/deepracer_simulation/run_rollout_rl_agent.sh: line 8: 1660 Illegal instruction (core dumped) python3 -m markov.rollout_worker
================================================================================
REQUIRED process [agent-9] has died!
process has died [pid 1493, exit code 132, cmd /app/robomaker-deepracer/simulation_ws/install/deepracer_simulation/lib/deepracer_simulation/run_rollout_rl_agent.sh name:=agent log:=/root/.ros/log/dd04facc-e6fc-11e9-98a8-0242ac120004/agent-9.log].
log file: /root/.ros/log/dd04facc-e6fc-11e9-98a8-0242ac120004/agent-9*.log
Initiating shutdown!

[ INFO] [1570230936.370759356]: Finished loading Gazebo ROS API Plugin.
[ INFO] [1570230936.384992129]: waitForService: Service [/gazebo/set_physics_properties] has not been advertised, waiting...
[ INFO] [1570230938.103051125, 0.033000000]: waitForService: Service [/gazebo/set_physics_properties] is now available.
[ INFO] [1570230938.834061301, 0.748000000]: Physics dynamic reconfigure ready.
[racecar/controller_manager-5] escalating to SIGTERM
[WARN] [1570230962.581206, 5.927000]: Controller Spawner error while taking down controllers: transport error completing service call: receive_once[/racecar/controller_manager/switch_controller]: unexpected error [Errno 4] Interrupted system call
[gazebo-2] escalating to SIGTERM
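
A note on the "exit code 132" in the log above: shells conventionally report a process killed by a signal as 128 plus the signal number, so 132 would correspond to signal 4, which is SIGILL on Linux and matches the "Illegal instruction" message. A minimal Python sketch of that decoding, assuming a Linux host (the helper name describe_exit_code is made up for illustration):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit code into a signal name when it encodes one.

    Assumes the common convention that exit codes above 128 mean
    "terminated by signal (code - 128)".
    """
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            # The number does not map to a signal known on this platform.
            return "unknown signal"
    return "normal exit"


print(describe_exit_code(132))
```

If this reports SIGILL, the rollout worker was killed for executing an instruction the CPU does not support, rather than by a Gazebo or driver problem.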

JustinGuese commented 4 years ago

This might be related to an issue from Chris Rhodes' former project. Quoting:

@zibjdp: Digging around on the internet revealed that the prebuilt TensorFlow binaries, from v1.6.0 onward, use AVX instructions, which are available in modern CPUs.
https://stackoverflow.com/questions/49122044/illegal-instruction-when-import-tensorflow-in-python
https://github.com/tensorflow/tensorflow/releases/tag/v1.6.0

Checking my CPU features with 'cat /proc/cpuinfo' showed that my Intel i7 920 (released 2008) does not have the AVX instructions.
https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX
If your CPU model was released before 2011, it will not support AVX, and the version of TensorFlow (1.11.0) in the robomaker container will fail on it.
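
For anyone who wants to script the /proc/cpuinfo check described above, here is a rough Python sketch. It assumes an x86 Linux host, where the kernel lists CPU features on a line starting with "flags"; the function names cpu_flags and has_avx are made up for this example.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text: str) -> bool:
    """True if 'avx' appears as its own flag (a bare 'avx2' token does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not on Linux, or the file is unreadable: treat as "no information".
        text = ""
    print("AVX supported:", has_avx(text))
```

If this prints False on the Docker host, the prebuilt TensorFlow in the container would be expected to die with the same Illegal instruction error.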

It looks like there is a way to compile newer versions of TensorFlow without AVX support, but I have not tried that option yet.

Hope this helps other users getting the same Illegal instruction error.