Original comment by Sarah Kitchen (Bitbucket: snkitche).
Upgrading this to blocker. This is not always an issue, but is frequently an issue when launching multiple agents. Here is a screenshot of htop after killing rosmaster, closing and reopening terminal. No ignition or ros processes were listed with ps -X when I took this.
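(For reference, a quick way to check for leftovers like this from a shell; pgrep -af is standard procps, and the name pattern here is only an illustration.)
# List surviving ignition/ROS processes with their full command lines
pgrep -af 'ign|rosmaster|roslaunch|parameter_bridge'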
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Can you post the exact command you are using to launch these multiple agents?
Original comment by Sarah Kitchen (Bitbucket: snkitche).
Well, this has happened with many different launch commands. I don’t recall which one I used before getting that screenshot. Here is what I’ve been trying to run today (with the same problem):
ign launch -v 4 virtual_stix.ign robotName1:=X3 robotConfig1:=X1_SENSOR_CONFIG_1 robotName2:=X2 robotConfig2:=X1_SENSOR_CONFIG_1 robotName3:=X1 robotConfig3:=X1_SENSOR_CONFIG_4
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Sarah Kitchen (snkitche): We will look into this. In the meantime, can you use competition.ign instead of virtual_stix.ign?
ign launch -v 4 competition.ign robotName1:=X3 robotConfig1:=X1_SENSOR_CONFIG_1 robotName2:=X2 robotConfig2:=X1_SENSOR_CONFIG_1 robotName3:=X1 robotConfig3:=X1_SENSOR_CONFIG_4
Original comment by Sarah Kitchen (Bitbucket: snkitche).
This happens just as much with competition.ign. At first, I thought it was due to the logging and/or dynamic loading, but since it happens with virtual_stix.ign as well, that does not seem to be the issue. I’m having this issue today with the command
ign launch -v 4 competition.ign robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG_4
Original comment by Nate Koenig (Bitbucket: Nathan Koenig).
Can you post your ign-gazebo version using dpkg -l | grep ignition?
Original comment by Sarah Kitchen (Bitbucket: snkitche).
ii ignition-blueprint 1.0.0-1~bionic
ii ignition-gazebo2 2.2.0-1~bionic
ii ignition-tools:amd64 0.2.0-1~bionic
Let me know if you want to see any of the libignition versions.
Original comment by Nate Koenig (Bitbucket: Nathan Koenig).
Those look good. How about ros-melodic-ros1-ign-bridge?
Original comment by Derek Knowles (Bitbucket: dknowles-ssci).
This also frequently happens to me when I run competition.ign.
ignition-blueprint 1.0.0-1~bionic
ignition-gazebo2 2.2.0-1~bionic
ignition-tools:amd64 0.2.0-1~bionic
ros-melodic-ros1-ign-bridge 0.3.1-1bionic
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Sarah, Derek,
The same was happening to me last week, but I cannot reproduce it anymore after rebuilding my workspace. Can you run the commands below and try it again?
cd ~/.ignition/fuel/fuel.ignitionrobotics.org/openrobotics/models/
rm -rfv *
cd ~/subt_ws/src/tunnel_circuit
hg pull && hg up
cd ~/subt_ws
catkin_make install
ign launch -v 4 competition.ign robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG_2
NOTE: The GUI will present you with empty panels for a few minutes until the models are downloaded.
Original comment by Sarah Kitchen (Bitbucket: snkitche).
Nate:
ii ros-melodic-ros1-ign-bridge 0.3.1-1bionic
Alfredo, I’ve followed your update instructions. It will take a couple runs before I can see if I’m still having an issue. Will update this comment when I can tell.
The problem persists. New screenshot attached with a sorted tree. Before trying to kill anything, these processes are under a bash (which is under systemd).
Launch command:
ign launch -v 4 competition.ign robotName1:=X3 robotConfig1:=X1_SENSOR_CONFIG_1 robotName2:=X2 robotConfig2:=X1_SENSOR_CONFIG_1 robotName3:=X1 robotConfig3:=X1_SENSOR_CONFIG_4
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Thank you, Sarah, for testing this. Can you also please attach the .log file created in your home directory when this occurred?
Original comment by Sarah Kitchen (Bitbucket: snkitche).
I have attached the directory from /home/snkitche/.ros/log that I believe corresponds to the above info.
The log files created by competition.ign in /home/snkitche (starting with subt_tunnel_qual) appear to be empty (0 bytes). I have also saved off some console output from a different set of runs, but I’m not entirely sure what is in there. The issue of ImageDisplay causing a seg fault happened again in that set.
Original comment by Hector Escobar (Bitbucket: hector_escobar).
I get the same problem. If I use top to view the processes, there are usually several parameter_bridge processes that have not stopped. I’ve been using killall parameter_bridge to kill them, and then they all seem to die. Sometimes I also have to kill ign.
Original comment by Michael Carroll (Bitbucket: Michael Carroll).
I have managed to reproduce this behavior. Will work on a fix.
Original comment by Derek Knowles (Bitbucket: dknowles-ssci).
Has anybody found a good temporary solution until this issue is fixed? I’m currently having to run this after nearly every time I shut down competition.ign:
killall rosmaster roslaunch parameter_bridge ukf_localization_node roll_pitch_yawrate_thrust_controller_node
kill -9 $(pgrep ign)
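A hedged one-pass variant of the same cleanup (process names copied from the list above; matching the full command line with pkill -f 'ign launch' is an assumption, but it is narrower than pgrep ign, which also matches any process whose name merely contains “ign”):
# Quietly kill the known ROS leftovers, then force-kill any surviving ign launch
killall -q rosmaster roslaunch parameter_bridge ukf_localization_node roll_pitch_yawrate_thrust_controller_node
pkill -9 -f 'ign launch'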
Original comment by Neil Johnson (Bitbucket: realdealneil1980).
I generally have the same problem. I installed the catkin_ws version of Ignition Gazebo on Wednesday of this week. The simulator works in general, but I have to kill processes like Derek mentions above. Sometimes even that doesn’t seem to be enough, and I have to reboot the computer to get the vehicles to launch again. The problem seems worst when I close the Ignition window early on; if I’ve run the simulator for a few minutes, it sometimes doesn’t happen.
Original comment by Michael Carroll (Bitbucket: Michael Carroll).
Currently, closing the GUI window does not terminate the rest of the simulation. To stop the simulation, use ctrl-c in the terminal where ign launch was started.
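(If that terminal is already gone, an equivalent from another shell — an assumption, not part of Michael’s comment — is to deliver the same signal by name:)
# ctrl-c sends SIGINT; this targets the full 'ign launch' command line
pkill -INT -f 'ign launch'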
Original comment by Michael Carroll (Bitbucket: Michael Carroll).
At least one deadlock issue was introduced at shutdown via https://osrf-migration.github.io/subt-gh-pages/#!/osrf/subt/pull-requests/184 and was resolved via https://osrf-migration.github.io/subt-gh-pages/#!/osrf/subt/pull-requests/194/fixing-a-deadlock-i-introduced-in-the-base/diff
At this point, I can’t seem to reproduce this behavior locally, but since the issue was opened before #184 was merged, I can’t be confident that #184 was the only factor. If you continue to see issues after #194, please let me know, so that I can work on constructing a case that consistently reproduces the bug.
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Michael’s changes to fix the problem have been merged, so update your local repo to grab them.
cd ~/subt_ws/src/tunnel_circuit
hg pull && hg up
cd ~/subt_ws
catkin_make install
# To test:
ign launch -v 4 competition.ign robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG_2
Original comment by Derek Knowles (Bitbucket: dknowles-ssci).
Thanks for your work on this issue, Michael Carroll (Michael Carroll). I updated the subt repository as suggested and, for good measure, deleted the build/, devel/, and install/ folders before running catkin_make install.
I still occasionally have processes labeled /usr/bin/ruby /usr/bin/ign launch -v 4 competition.ign robotName1:=X4 robotConfig1:=X4_SENSOR_CONFIG_2 running even after I ctrl+C and close the terminal where that command was run. Let me know if a log file would help.
The ign processes are the only offenders I’ve noticed since your update; I haven’t seen any rosmaster, roslaunch, parameter_bridge, ukf_localization_node, or roll_pitch_yawrate_thrust_controller_node processes still running.
Original comment by Michael Carroll (Bitbucket: Michael Carroll).
Yes, in that case, logs would be very helpful. Full verbosity (which you already have). Feel free to make a gist and post them there so that we don’t flood the thread here.
Original comment by Derek Knowles (Bitbucket: dknowles-ssci).
This one actually had all those processes I mentioned except ign. I copied the contents from ~/.ros/log/latest/ here. Are there ignition-specific logs I should add?
https://gist.github.com/betaBison/28e9ee9cd7984696537484b85a68f636
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Derek,
Can you also try with tunnel_circuit_practice.ign instead of competition.ign, using one of the practice tunnels?
ign launch -v 4 tunnel_circuit_practice.ign worldName:=tunnel_circuit_practice_01 robotName1:=X4 robotConfig1:=X4_SENSOR_CONFIG_2
Original comment by Derek Knowles (Bitbucket: dknowles-ssci).
Yes, I will try on Monday.
Here’s another with the ign process still running.
https://gist.github.com/betaBison/f2abcfcd8bf0295ed4ca8fb5558dd0b2
Original comment by Michael Carroll (Bitbucket: Michael Carroll).
We found a second deadlock that can affect the bringup portion of the process. In this case, ign-launch will hang when launching processes, and can only be killed with SIGTERM or higher. This ends up leaving a few of the residual processes around that we’ve been seeing. This PR (https://bitbucket.org/ignitionrobotics/ign-launch/pull-requests/36/eliminate-potential-deadlock-from-sigchld/diff) should address the deadlock on the way up.
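Until that PR is in, a hedged escalation sequence for a hung bringup might look like this (the 5-second grace period is arbitrary):
# Ask politely with SIGTERM, then SIGKILL anything still deadlocked
pkill -TERM -f 'ign launch'
sleep 5
pgrep -f 'ign launch' > /dev/null && pkill -KILL -f 'ign launch'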
Original comment by Sarah Kitchen (Bitbucket: snkitche).
Can you change this issue back to Open at least until that PR is done?
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
You can either install the new Docker image by following the instructions in the link below:
`https://osrf-migration.github.io/subt-gh-pages/#!/osrf/subt/wiki/tutorials/SystemSetupDockerhub`
Or run the commands below to update your catkin environment.
sudo apt update && sudo apt upgrade -y
sudo reboot
cd ~/subt_ws/src/tunnel_circuit
hg pull && hg update tunnel_circuit
source /opt/ros/melodic/setup.bash
rm -rfv ~/.ignition/fuel/fuel.ignitionrobotics.org/openrobotics/models/*
cd ~/subt_ws
catkin_make install
. ~/subt_ws/install/setup.bash
ign launch -v 4 competition.ign robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG_2
Open another terminal and run these commands:
. /opt/ros/melodic/setup.bash
. ~/subt_ws/install/setup.bash
roslaunch subt_example teleop.launch
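As a sanity check after shutting down with ctrl-c (a verification sketch only; an empty result means everything exited cleanly):
# Should print nothing once shutdown works correctly
pgrep -af 'ign|rosmaster|parameter_bridge'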
Original comment by Zbyněk Winkler (Bitbucket: Zbyněk Winkler (robotika)).
Today I was not able to quit with ctrl+c either. I had to take down the Docker container (sha256:1145790d83973b37bd4851e6e329d24de062a99e19f898e22438c9d3882c00a6). I am glad I didn’t run it locally, as it would have been a pain to kill all that stuff by hand.
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Are you using the latest docker image from a couple of days ago?
Which Docker version are you running?
Any errors reported?
Did it happen only once with that image?
Were you running your controller in a different container when it happened?
Which launch configuration did you use and how many robots?
Original comment by Martin Dlouhy (Bitbucket: robotikacz).
Well, as Zbyněk mentioned in another post, “latest” is misleading (as it can change at any time), so until you start versioning releases he cannot be more precise than the digest (sha256:1145790d83973b37bd4851e6e329d24de062a99e19f898e22438c9d3882c00a6).
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Yes, I get the point about the version tag, and new images will have that. However, notice that when issues are reported we need more than just the version tag or the sha256 number.
Original comment by Zbyněk Winkler (Bitbucket: Zbyněk Winkler (robotika)).
I just ran the “./run.bash nkoenig/subt-virtual-testbed tunnel_circuit_practice.ign robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG_1” example and nothing else (no controller, no other container, just that one thing). I tried to cancel it while it was still starting (I think; it’s hard to tell when it is done).
Original comment by Alfredo Bencomo (Bitbucket: bencomo).
Hi Zbyněk,
Thank you for that info. Can you elaborate more on “it’s hard to tell when it is done”?
Original comment by Zbyněk Winkler (Bitbucket: Zbyněk Winkler (robotika)).
It starts a lot of things, it takes a lot of time (almost a minute), and there is no message in the output (or somewhere else?) saying something along the lines of “I am done loading and starting all the stuff, feel free to start up your controller”. Since I was just testing whether the X window appears, I killed it quite early.
I had a script for gazebo9 that would wait for a certain topic to appear to take a guess when it is done loading.
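A minimal sketch of that kind of readiness check, assuming a ROS bridge is up and that a topic such as /clock only appears once the world has finished loading:
# Poll until the chosen topic exists, then assume the sim is ready
until rostopic list 2>/dev/null | grep -q '^/clock$'; do
  sleep 1
done
echo "Simulation appears ready; starting controller."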
Original report (archived issue) by Sarah Kitchen (Bitbucket: snkitche).
The original report had attachments: hangingprocesslogs.tar.gz
Trying to run several configurations, different .ign files, etc., I’ve intermittently had trouble stopping processes. I Ctrl+C to exit, but find I have hanging processes when I run htop or ps -X. Sometimes if I try to killall via PID, “no process found” is returned. Sometimes killall rosmaster has worked, but not always.