osrf / srcsim

Space Robotics Challenge

Val can't stand up after update #98

Open osrf-migration opened 7 years ago

osrf-migration commented 7 years ago

Original report (archived issue) by Jedediyah Williams (Bitbucket: Jedediyah).


After updating, it looks like the controller does start and Val's joints go to the home position, but she doesn't stay up. I did a full reinstall but get the same behavior. I think I've tried all the solutions posted in the other issue threads.

Video of Val falling

Snippet of the output

This was happening before, but I could disable networking during launch and for some reason that avoided the problem. However, if I disable networking now, I get an error similar to Issue #92:

#!bash

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':runJavaDelegate'.
> Could not resolve all dependencies for configuration ':ros'.
   > Could not determine artifacts for xml-apis:xml-apis:2.0.2
      > Could not get resource 'https://jcenter.bintray.com/xml-apis/xml-apis/2.0.2/xml-apis-2.0.2.jar'.
         > Could not HEAD 'https://jcenter.bintray.com/xml-apis/xml-apis/2.0.2/xml-apis-2.0.2.jar'.
            > jcenter.bintray.com

I did the Pre-build ihmc_ros_java_adapter step and it completes in about 4 seconds (it used to take 8 minutes). So I tried the steps in Issue #86 but with no luck.
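To tell whether a jar named in the error is already in the local Gradle cache (and so should not need the network), one option is to search for it. This is a minimal sketch, assuming the cache lives under the default `~/.gradle` directory; `count_cached_jars` is a made-up helper name, not part of srcsim or the IHMC tooling.

```shell
# Count files with a given name under a Gradle cache directory.
# count_cached_jars is a hypothetical helper for illustration only.
count_cached_jars() {
    local cache_dir="$1"
    local jar_name="$2"
    # A missing directory simply means nothing is cached yet.
    [ -d "$cache_dir" ] || { echo 0; return 0; }
    find "$cache_dir" -type f -name "$jar_name" | wc -l | tr -d ' '
}

# Example: is the jar from the error message cached?
# count_cached_jars "$HOME/.gradle" 'xml-apis-2.0.2.jar'
```

A count of 0 would suggest the pre-build never fetched the artifact, which would be consistent with the 4-second run.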

osrf-migration commented 7 years ago

Original comment by Akihiko Honda (Bitbucket: akiHonda).


I got a similar error after the update. Gazebo dies after printing the error.

:runJavaDelegate FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':runJavaDelegate'.
> Could not resolve all dependencies for configuration ':ros'.
   > Could not find us.ihmc:RobotEnvironmentAwareness:0.2.2.
     Searched in the following locations:
         http://dl.bintray.com/ihmcrobotics/maven-vendor/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         http://dl.bintray.com/ihmcrobotics/maven-vendor/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
         http://dl.bintray.com/ihmcrobotics/maven-release/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         http://dl.bintray.com/ihmcrobotics/maven-release/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
         https://github.com/rosjava/rosjava_mvn_repo/raw/master/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         https://github.com/rosjava/rosjava_mvn_repo/raw/master/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
         http://clojars.org/repo/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         http://clojars.org/repo/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
         https://bengal.ihmc.us/nexus/content/repositories/thirdparty/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         https://bengal.ihmc.us/nexus/content/repositories/thirdparty/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
         http://updates.jmonkeyengine.org/maven/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         http://updates.jmonkeyengine.org/maven/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
         file:/home/user/.m2/repository/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         file:/home/user/.m2/repository/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
         https://jcenter.bintray.com/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         https://jcenter.bintray.com/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
         https://repo1.maven.org/maven2/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.pom
         https://repo1.maven.org/maven2/us/ihmc/RobotEnvironmentAwareness/0.2.2/RobotEnvironmentAwareness-0.2.2.jar
     Required by:
         :ihmc_ros_java_adapter:unspecified > us.ihmc:Valkyrie:0.9.0 > us.ihmc:IHMCAvatarInterfaces:0.9.0

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


@jedediyah thanks for the thorough video and snippet. It looks like the controller isn't switching into high-level control mode. Maybe for debugging, you can try running things manually instead of using init:=true?

roslaunch srcsim unique_task1.launch &
# wait for arms and legs to get in position
rostopic pub -1 /valkyrie/harness/velocity std_msgs/Float32 '{data: -0.05}'
# lowering harness, wait for feet to touch the platform
rostopic pub -1 /ihmc_ros/valkyrie/control/low_level_control_mode ihmc_valkyrie_ros/ValkyrieLowLevelControlModeRosMessage '{requested_control_mode: 2, unique_id: -1}'
# should be switching to high-level control mode and trying to stand

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


@akiHonda your error looks different actually and seems more similar to #92. Have you upgraded all the software packages?

osrf-migration commented 7 years ago

Original comment by Akihiko Honda (Bitbucket: akiHonda).


@scpeters Yes. Before updating, the sim worked fine with unique.world. After the error, I tried removing srcsim and reinstalling following the system setup tutorial, but no luck.

osrf-migration commented 7 years ago

Original comment by Jedediyah Williams (Bitbucket: Jedediyah).


I launched without init:='true' and after 7.5 minutes of sitting with just

:runJavaDelegate

it started downloading and building the controller (Starting here at line 193) and after 4 more minutes I was able to lower the harness and successfully start the high level controller with

#!bash
rostopic pub -1 /valkyrie/harness/velocity std_msgs/Float32 '{data: -0.05}'
rostopic pub -1 /ihmc_ros/valkyrie/control/low_level_control_mode ihmc_valkyrie_ros/ValkyrieLowLevelControlModeRosMessage '{requested_control_mode: 2, unique_id: -1}'

or just by running the init_robot.sh script. I believe the download and build should have occurred at step 13 in the setup instructions, but for some reason it isn't happening for me.

Unfortunately, I have to do this every time I launch, so it takes about 12 minutes to launch...

osrf-migration commented 7 years ago

Original comment by kapoor_amita@yahoo.com (Bitbucket: Amita94).


I think the issue I raised in [Issue #97](https://osrf-migration.github.io/srcsim-gh-pages/#!/osrf/srcsim/issues/97/val-falls) is related to this.

When I launch without init:='true', it stays at :runJavaDelegate for as long as 15 minutes. No downloading or building starts.

And if I lower the harness and start high-level control without that download, Val falls down.

osrf-migration commented 7 years ago

Original comment by kapoor_amita@yahoo.com (Bitbucket: Amita94).


Also, after line 94 of [the snippet](https://bitbucket.org/snippets/jedediyah/Ge9a5/controllerdownloads) I get an error that Thread-4 is already in use.

Screenshot _Val.png

osrf-migration commented 7 years ago

Original comment by Louise Poubel (Bitbucket: chapulina, GitHub: chapulina).


@Amita94, just a little bitbucket trick: image filenames can't have spaces or they don't show up.

osrf-migration commented 7 years ago

Original comment by kapoor_amita@yahoo.com (Bitbucket: Amita94).


Thanks @chapulina, I updated the image.

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


@jedediyah does it have to re-download jars every time? It should keep them cached after it's downloaded them once. I'll see if there's a way to run with more debugging info...

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


You can try adding --info or --debug to the java command-line arguments in the configuration file in the ihmc_valkyrie_ros package. Here's an example patch for adding --info

diff --git a/configurations/val_wholebody_gazebo_param.yaml b/configurations/val_wholebody_gazebo_param.yaml
index de118cc..268fbb6 100644
--- a/configurations/val_wholebody_gazebo_param.yaml
+++ b/configurations/val_wholebody_gazebo_param.yaml
@@ -3,7 +3,7 @@ joint_state_controller:
     publish_rate: 50

 ihmc_valkyrie_control_java_bridge:
-    jvm_args: "-Djava.class.path=ValkyrieController.jar -XX:+UseSerialGC -Xmx4g -Xms4g -XX:NewSize=3g -XX:MaxNewSize=3g -XX:CompileThreshold=1000 -verbosegc -Djava.library.path=lib/"
+    jvm_args: "--info -Djava.class.path=ValkyrieController.jar -XX:+UseSerialGC -Xmx4g -Xms4g -XX:NewSize=3g -XX:MaxNewSize=3g -XX:CompileThreshold=1000 -verbosegc -Djava.library.path=lib/"
     main_class: us.ihmc.valkyrieRosControl.ValkyrieRosControlController
     type: ihmc_ros_control/IHMCWholeRobotControlJavaBridge

I think it's located in /opt/ros/indigo/share/ihmc_valkyrie_ros/configurations/val_wholebody_gazebo_param.yaml

osrf-migration commented 7 years ago

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


Any more info on your end @jedediyah?

osrf-migration commented 7 years ago

Original comment by Jedediyah Williams (Bitbucket: Jedediyah).


@nkoenig, thanks for checking in. Unfortunately I still seem to have gradle issues. During the prebuild (setup step 13) the output says that the build succeeded, but it finishes in 3 seconds where it used to take 8 minutes, so I suspect it's not really building anything. I tried getting more info with the --info and --debug flags as Steve suggested, but there was no new output. A full reinstall gives the same behavior.

I have been able to get it to run if I

  1. Run roslaunch srcsim unique_task1.launch init:="false" and let it sit for about 14 minutes. This will download and build the controller.

  2. Close Gazebo, disable networking, and run roslaunch srcsim unique_task1.launch init:="true"

I wonder if it has something to do with the Java updates for the new controller, but I really don't have much insight yet.
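For anyone wanting to catch a prebuild that "succeeds" without doing any work, timing the warm-up is one option. The sketch below is only an illustration: the 60-second threshold is an arbitrary assumption (the thread reports about 3 seconds for a suspect run versus about 8 minutes for a full one), and both function names are hypothetical.

```shell
# Return success when a run finished faster than the threshold.
# The default of 60 seconds is an assumption, not a documented value.
is_suspiciously_fast() {
    local elapsed_seconds="$1"
    local threshold_seconds="${2:-60}"
    [ "$elapsed_seconds" -lt "$threshold_seconds" ]
}

# Hypothetical wrapper: time the warm-up launch from setup step 13.
timed_warmup() {
    local start end
    start=$(date +%s)
    roslaunch ihmc_valkyrie_ros valkyrie_warmup_gradle_cache.launch
    end=$(date +%s)
    if is_suspiciously_fast "$((end - start))"; then
        echo "warm-up took $((end - start))s; it may not have built anything" >&2
    fi
}
```

Calling `timed_warmup` after sourcing the ROS setup file would run the same launch file as step 13 and print a warning only when the run is implausibly short.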

osrf-migration commented 7 years ago

Original comment by Richard Isakson (Bitbucket: RichardIsakson).


I was eliminated in the qualification round but I'm still using the robot as a learning tool. Recently, I inadvertently let the system do an update and the robot would no longer stay up. It would fall every time. I had lost my new toy. Tonight I redid the entire system setup procedure from the tutorial page and the robot works great.

osrf-migration commented 7 years ago

Original comment by GoRobotGo (Bitbucket: GoRobotGo).


I allowed the system to update today (March 29, 2017) and am suffering the same issue. It takes ~14 minutes after launching before the controller is ready. The 15-minute-per-run workarounds described by Jedediyah do work.

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


@caguero are you seeing this same behavior with the 15 minute delay in loading?

osrf-migration commented 7 years ago

Original comment by Tom Tseemceeb Xiong (Bitbucket: Tom_Xiong).


Interestingly I also had this same issue but after completely reinstalling everything it's back to normal. I'm not really sure what happened but well... It's a solution if you can call that one.

osrf-migration commented 7 years ago

Original comment by Bener Suay (Bitbucket: benersuay).


I second @GoRobotGo

Also, something that might be related to this, when I try to run

#!bash

roslaunch ihmc_valkyrie_ros valkyrie_warmup_gradle_cache.launch

I get a bunch of UP-TO-DATEs, and then the execution gets stuck at

> Configuring > 0/1 projects > root project > Resolving dependencies ':ros'

osrf-migration commented 7 years ago

Original comment by Bener Suay (Bitbucket: benersuay).


@scpeters I tried @jedediyah's solution and it worked for me as well.

I really hope someone can find a fix for this.

osrf-migration commented 7 years ago

Original comment by Víctor López (Bitbucket: Victor Lopez).


Same here. 16 hours ago it was working fine. Today neither I nor any of my colleagues can get Valkyrie running.

Same behaviour as @benersuay , stuck on resolving dependencies.

I commented on the ticket I opened in the past that eventually got this issue fixed, maybe you can comment there so that they see it's not a localized problem on my setup.

https://github.com/ihmcrobotics/ihmc_valkyrie_ros/issues/9

osrf-migration commented 7 years ago

Original comment by Erica Tiberia (Bitbucket: T_AL).


Val started falling on me just now. I didn't do an update but was connected to the internet. Any solutions?

osrf-migration commented 7 years ago

Original comment by Jedediyah Williams (Bitbucket: Jedediyah).


@T_AL, I don't know if this will help, but if you haven't already, try re-running step 13,

#!bash
source /opt/nasa/indigo/setup.bash
roslaunch ihmc_valkyrie_ros valkyrie_warmup_gradle_cache.launch

I have noticed that the gradle prebuild seems to effectively "expire" in some way, but sometimes when I rerun that step it starts working again for a couple days.

osrf-migration commented 7 years ago

Original comment by Erica Tiberia (Bitbucket: T_AL).


Same issue as Bener Suay: it's stuck on

Configuring > 0/1 projects > root project > Resolving dependencies ':ros'

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


I hadn't been seeing this problem, but it just started happening in one of my docker containers. I added --debug to the gradle warmup launch file:

diff --git a/launch/valkyrie_warmup_gradle_cache.launch b/launch/valkyrie_warmup_gradle_cache.launch
index ad4e85b..478194a 100644
--- a/launch/valkyrie_warmup_gradle_cache.launch
+++ b/launch/valkyrie_warmup_gradle_cache.launch
@@ -1,6 +1,6 @@
 <launch>
   <arg name="use_local_build" default="false" />

-  <node name="IHMCValkyrieROSAPI" pkg="ihmc_ros_java_adapter" type="gradlew" args="warmUp -x runJavaDelegate -PuseLocal=$(arg use_local_build) -Pyaml=$(find ihmc_valkyrie_ros)/configurations/api.yaml" required="true" output="screen" cwd="node">
+  <node name="IHMCValkyrieROSAPI" pkg="ihmc_ros_java_adapter" type="gradlew" args="--debug warmUp -x runJavaDelegate -PuseLocal=$(arg use_local_build) -Pyaml=$(find ihmc_valkyrie_ros)/configurations/api.yaml" required="true" output="screen" cwd="node">
   </node>
 </launch>

and it appears to be waiting to download files and getting a 500 status. I'm still investigating...

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


Here's the full output that I see:

osrf-migration commented 7 years ago

Original comment by Jedediyah Williams (Bitbucket: Jedediyah).


Yeah, I see the same thing, but if I wait somewhere between 7 and 15 minutes it builds silently and eventually completes successfully. I usually start it and go make some tea. After that, Val will usually stay standing again.

osrf-migration commented 7 years ago

Original comment by Aric Stewart (Bitbucket: aricstewart).


I am dead in the water with this same error as well. Cannot get val to stand. Extremely frustrating as tonight was going to be my best block of time to get some work done.

When I try valkyrie_warmup_gradle_cache.launch, it gets stuck on the very same

Configuring > 0/1 projects > root project > Resolving dependencies ':ros'

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


osrf-migration commented 7 years ago

Original comment by Bener Suay (Bitbucket: benersuay).


@aricstewart we share your frustration. Until IHMC or OSRF pushes a fix, try to focus on the big picture and give @jedediyah's solution a try.

osrf-migration commented 7 years ago

Original comment by Bener Suay (Bitbucket: benersuay).


@scpeters thank you for marking this as a blocker.

osrf-migration commented 7 years ago

Original comment by Rud Merriam (Bitbucket: rmerriam).


Stopped working for me mid-day on Friday.

I'm running the warm up with --debug. It's hanging on every attempt to GET a file named /maven-metadata.xml. It times out with a 500 error. There is an earlier URL for downloading that file. Trying it in the browser generates the error:

#!c++
HTTP Status 500 - Could not execute search query for username='anonymous'
type Exception report
message Could not execute search query for username='anonymous'
description The server encountered an internal error that prevented it from fulfilling this request.

Maybe it's a security setting that got munged?
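To reproduce this kind of check from a terminal instead of a browser, curl can print just the HTTP status code. A minimal sketch follows; the URL in the example is a placeholder (the failing URL is not quoted in this thread), and the helper names are made up.

```shell
# Classify an HTTP status code into a short label.
classify_http_status() {
    case "$1" in
        2??) echo "ok" ;;
        3??) echo "redirect" ;;
        4??) echo "client error" ;;
        5??) echo "server error" ;;
        *)   echo "no response" ;;
    esac
}

# Fetch only the status code for a URL, giving up after 20 seconds.
# curl reports 000 when it gets no HTTP response at all.
http_status() {
    curl -s -o /dev/null -m 20 -w '%{http_code}' "$1"
}

# Example with a placeholder URL (substitute one from your --debug output):
# classify_http_status "$(http_status 'https://example.invalid/maven-metadata.xml')"
```

A "server error" result would point at the remote repository rather than the local install, which matches the 500 seen in the browser.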

osrf-migration commented 7 years ago

Original comment by Aric Stewart (Bitbucket: aricstewart).


Not that I expected things to change much overnight on a Friday night, but I am still in the same situation this morning.

osrf-migration commented 7 years ago

Original comment by sringer99 (Bitbucket: sringer99).


The gradle step did not work all day yesterday or Friday. It did work this morning. It appears to be an issue out on the Internet when it tries to get the files it needs.

I finally got my new machine working. I had a problem with the rtprio setting not sticking. Was finally forced to grant myself the rtprio in order to get it to work. Not sure why.
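A quick way to see whether the rtprio limit is actually in effect for the current session is `ulimit -r`, which prints the maximum real-time scheduling priority. The helper below is a hypothetical sketch that interprets that value; it does not change any settings, and a limit set in a configuration file may only show up after logging in again.

```shell
# Decide whether a reported real-time priority limit is usable.
# "unlimited" or any positive number means RT scheduling can be requested.
rtprio_granted() {
    case "$1" in
        unlimited) return 0 ;;
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as not granted
        *) [ "$1" -gt 0 ] ;;
    esac
}

# Example: check the current login session.
# rtprio_granted "$(ulimit -r)" && echo "rtprio is set" || echo "rtprio is not set"
```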

osrf-migration commented 7 years ago

Original comment by Rud Merriam (Bitbucket: rmerriam).


Was able to download a completely new gradle setup today. Deleted the .gradle folder and ran the warm up script. The R5 is now standing.
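A slightly more cautious variant of the same reset is to move the Gradle folder aside rather than delete it, so the old cache can be restored if the fresh download fails. This is a sketch under the assumption that the folder is `~/.gradle` in the home directory; `set_aside_gradle_home` is a made-up name.

```shell
# Move a Gradle home directory aside instead of deleting it outright.
# Prints the backup path on success; fails if the directory does not exist.
set_aside_gradle_home() {
    local gradle_home="${1:-$HOME/.gradle}"
    local backup
    backup="${gradle_home}.bak.$(date +%Y%m%d%H%M%S)"
    [ -d "$gradle_home" ] || { echo "nothing to move: $gradle_home" >&2; return 1; }
    mv "$gradle_home" "$backup" && echo "$backup"
}

# Example: reset the cache, then rerun the warm-up from this thread.
# set_aside_gradle_home && roslaunch ihmc_valkyrie_ros valkyrie_warmup_gradle_cache.launch
```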

osrf-migration commented 7 years ago

Original comment by Bener Suay (Bitbucket: benersuay).


It looks like this is working now; however, the /ihmc_ros/valkyrie/control/hand_trajectory listener is behaving really strangely. I don't know if this is related to robot controllers / IK solvers.

Can anybody confirm that hand trajectory controllers do not follow the requested transforms' orientation? I did not change my code and it was working before I had this issue just fine.

That said, I'm using an updated version of the URDF, which might be causing this behavior too, which is why I just wanted to ask and see if others are having any issues with hand trajectories.

osrf-migration commented 7 years ago

Original comment by Louise Poubel (Bitbucket: chapulina, GitHub: chapulina).


osrf-migration commented 7 years ago

Original comment by Brooks Paige (Bitbucket: brx).


Val hasn't been able to stand up for me since updating to srcsim 0.4.0. I've tried completely removing and reinstalling from scratch, but to no effect. Val simply collapses at the end of the init script, when released from the harness (/valkyrie/harness/detach).

How can I go about debugging this? If it helps, after the "Lower harness" step, Val has a visibly not-level pelvis; I don't recall this being the case before.

In the meantime, is there any supported way to roll back to an earlier version of the software?

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


@brx have you run the warmup script?

roslaunch ihmc_valkyrie_ros valkyrie_warmup_gradle_cache.launch

that is needed to download all the controller dependencies

osrf-migration commented 7 years ago

Original comment by Brooks Paige (Bitbucket: brx).


Yes, exactly like so. It seems to run cleanly (terminal output).

I did try deleting the ~/.gradle directory and re-running the warmup script, but with no change.

osrf-migration commented 7 years ago

Original comment by Aric Stewart (Bitbucket: aricstewart).


My laptop works; however, my backup VM shows the same problem. I can get Val to stand in my VM if I add grasping_init:=false to the roslaunch. Of course, that means that val_grasping is disabled.

osrf-migration commented 7 years ago

Original comment by Jedediyah Williams (Bitbucket: Jedediyah).


Similar here. Between my two machines, one works most of the time and one doesn't. The one that does tends to break overnight but I can usually get it back with something like:

  1. roslaunch ihmc_valkyrie_ros valkyrie_warmup_gradle_cache.launch
  2. Log out / Log in
  3. Disable Networking
  4. Launch something like: roslaunch srcsim unique_task1.launch init:="true"
  5. Enable Networking

osrf-migration commented 7 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


Hm... if grasping_init is the problem, perhaps you could try modifying the grasping_init_wait_time variable? Since it's on your VM, I'm guessing you'll need to increase it from the default of 20 seconds.

roslaunch srcsim unique_task1.launch grasping_init_wait_time:=30

Maybe that will help?

osrf-migration commented 7 years ago

Original comment by Aric Stewart (Bitbucket: aricstewart).


For my VM that does appear to be the case. I have to bump grasping_init_wait_time:=40 so my startup time is taking crazy forever now. But Val does stand.

-aric

osrf-migration commented 7 years ago

Original comment by kapoor_amita@yahoo.com (Bitbucket: Amita94).


After last update Val in unique task is again falling, problem this time

Val still works perfectly for the two qualifying tasks.

osrf-migration commented 7 years ago

Original comment by Brooks Paige (Bitbucket: brx).


Confirming that it works for me with grasping_init:="false" as well.

osrf-migration commented 7 years ago

Original comment by Jeremy White (Bitbucket: knitfoo).


As a note on this, I was having more 'fall at start' failures than I expected a while back, and I very deliberately added a user to my system and made that user the 'srcsim' user. So that user is in the real time group. I then removed the real time priority from my regular user which I use for web browsing and running our code. It's a mild chore, as you've got to repeat the setup steps, chown and all, for the new user, but after doing that, I've seen far fewer 'fall down on start' issues.

osrf-migration commented 7 years ago

Original comment by Jeremy White (Bitbucket: knitfoo).


I know this is an oldie, but a goodie. This fundamental issue - Val falling down at the start - still persists across all our systems. It's a nuisance; perhaps 20% of the time, but quite measurable. With the new harness logic, it now looks like Val is just slumping on the job, instead of sprawling on the floor, but the root cause is presumably the same. I do notice that if I have another process chewing cpu in the system when the sim starts, that seems to help trigger the failure. (But that's not a requirement).

osrf-migration commented 7 years ago

Original comment by Louise Poubel (Bitbucket: chapulina, GitHub: chapulina).


@knitfoo , do you know if the times when Val falls are due to issue #181?

osrf-migration commented 7 years ago

Original comment by Jeremy White (Bitbucket: knitfoo).


No, I don't see that error in the logs in these cases. In fact, maddeningly, I can't find any indication of trouble of any kind in the logs.

osrf-migration commented 7 years ago

Original comment by Jeremy White (Bitbucket: knitfoo).


I've been doing some Linux-style forensics on this, and I have a little bit more information. I run srcsim as a specific user, so I ran lsof -u for that user in the 'good' case and in the 'bad' case. (First I did a ps, but that showed no difference.)

But the lsof shows that a process like this: /opt/nasa/indigo/lib/val_controller_manager_rtt/controller_exec --rate 500 -s name:=controller_manager_jertop_32535_5316819501857004205 log:=/home/high/.ros/log/b8d9e6b2-4933-11e7-bd77-fcf8ae5ce333/controller_manager_jertop_32535_5316819501857004205-1.log

has, in the 'good' case an open handle to libjoint_state_controller.so and to libposition_controllers.so; the 'bad' case does not have those two files open. Which makes sense, as we're not seeing a controller. The log files in question are both empty.

A naive --help on the controller_exec binary doesn't show any additional debugging I can turn on; anything else I can do to investigate?
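The good/bad comparison above can be turned into a repeatable check by filtering `lsof` output for the two library names. The following is a rough sketch, assuming the process can be found by the name controller_exec; `missing_controller_libs` is a hypothetical helper.

```shell
# Read `lsof` output on stdin and report which of the two controller
# libraries named above are absent from it.
missing_controller_libs() {
    local listing lib missing=""
    listing=$(cat)
    for lib in libjoint_state_controller.so libposition_controllers.so; do
        case "$listing" in
            *"$lib"*) ;;                      # library is open, nothing to report
            *) missing="$missing $lib" ;;
        esac
    done
    # Print the missing names without the leading space; empty means all present.
    echo "${missing# }"
    [ -z "$missing" ]
}

# Example: inspect a running controller process (pgrep -o picks the oldest match).
# lsof -p "$(pgrep -o controller_exec)" | missing_controller_libs
```

Run right after launch, a non-empty result would flag the 'bad' case before Val is released from the harness.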