nasa / astrobee_android

NASA Astrobee Robot Software, Android
https://www.nasa.gov/astrobee
Apache License 2.0

Unable to Connect with Guest Science Manager #47

Closed erobinson-1997 closed 1 year ago

erobinson-1997 commented 1 year ago

I am using an Inforce 6640 HLP Development board and a Dell PC tower to develop a Guest Science application. The development board is running the correct Inforce-IFC6601-AndroidBSP-880457-Rel-v2.1.zip image, boots, and has been tested with a mouse and monitor. I am using Ubuntu 16.04 on the desktop PC with a cat5 cable going to the development board. The development board has been configured to use adb via IP rather than USB, and the Guest Science Manager has been installed. The Guest Science Manager runs via the gs_manager.sh script (confirmed with adb shell ps | grep gov), and with Android Studio v3.6.3 I am also able to attach a debugger to the guest_science_manager app (output shown later).

On both devices the /etc/hosts file has been updated to provide definitions for hlp and llp. The Dell Ubuntu computer has a "Manual" connection made with the GUI as shown below: [image]

Android /etc/hosts:

127.0.0.1       localhost
::1             ip6-localhost

10.42.0.31  llp
10.42.0.33  hlp

Dell PC /etc/hosts:

127.0.0.1   localhost
127.0.1.1   mei-Dell

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

10.42.0.31  llp
10.42.0.33  hlp
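
Since both machines need to agree on the llp and hlp addresses, the two files can also be compared programmatically. The following is a minimal Python sketch, not part of the Astrobee tooling. The function names `parse_hosts` and `host_mismatches` are hypothetical, and it assumes the contents of each /etc/hosts file have already been copied into strings (for example from the output of `adb shell cat /etc/hosts`).

```python
def parse_hosts(text):
    """Map each hostname in /etc/hosts-style text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first address listed for a name.
            mapping.setdefault(name, ip)
    return mapping


def host_mismatches(text_a, text_b, names=("llp", "hlp")):
    """Return {name: (ip_in_a, ip_in_b)} for names that are missing or differ."""
    a, b = parse_hosts(text_a), parse_hosts(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in names
        if a.get(name) is None or a.get(name) != b.get(name)
    }
```

For the two files listed above, `host_mismatches` would return an empty dictionary, since both map llp to 10.42.0.31 and hlp to 10.42.0.33.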

I pushed the text_file.sh described in the HLP installation instructions to /persist/eth0.sh as shown below:

#!/bin/sh

set -e

sleep 10

ip addr add 10.42.0.33/24 dev eth0
ip link set dev eth0 up

ip rule flush
ip rule add pref 32766 from all lookup main
ip rule add pref 32767 from all lookup default

In two terminals I ran the following commands to define ROS_MASTER_URI; the resulting output is included. Both terminals are able to ping each other: [image]
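
A quick sanity check on the ROS_MASTER_URI value itself can rule out a malformed setting before looking at the network. The sketch below is an illustration only: `master_endpoint` is a hypothetical helper, and it assumes the variable follows the `http://host:port` form seen in the roslaunch output later in this report.

```python
import os
from urllib.parse import urlparse


def master_endpoint(uri):
    """Split a ROS_MASTER_URI-style string into (host, port), or raise ValueError."""
    parsed = urlparse(uri)
    if parsed.scheme != "http" or not parsed.hostname or parsed.port is None:
        raise ValueError("expected http://host:port, got %r" % uri)
    return parsed.hostname, parsed.port


if __name__ == "__main__":
    # Reads the variable from the current shell environment, if it is set.
    uri = os.environ.get("ROS_MASTER_URI", "")
    try:
        print(master_endpoint(uri))
    except ValueError as err:
        print("ROS_MASTER_URI looks wrong:", err)
```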

From the Android adb shell to the development board, I am also able to ping the Dell PC as shown below: [image]

This is the command and output when I run the Astrobee simulator:

mei@mei-Dell:~/astrobee$ roslaunch astrobee sim.launch dds:=false robot:=sim_pub rviz:=true
... logging to /home/mei/.ros/log/b91cc490-e3c0-11ed-8836-e454e8d86199/roslaunch-mei-Dell-12295.log
Checking log directory for disk usage. This may take awhile.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://10.42.0.31:38651/

SUMMARY
========

PARAMETERS
 * /dock/robot_description: <?xml version="1....
 * /handrail_21_5/robot_description: <?xml version="1....
 * /handrail_30/robot_description: <?xml version="1....
 * /handrail_41_5/robot_description: <?xml version="1....
 * /handrail_8_5/robot_description: <?xml version="1....
 * /iss/robot_description: <?xml version="1....
 * /robot_description: <?xml version="1....
 * /rosdistro: kinetic
 * /rosversion: 1.12.17
 * /simulation_speed: 1
 * /use_sim_time: True

NODES
  /
    access_control (nodelet/nodelet)
    arm (nodelet/nodelet)
    astrobee_state_publisher (robot_state_publisher/robot_state_publisher)
    choreographer (nodelet/nodelet)
    ctl (nodelet/nodelet)
    data_bagger (nodelet/nodelet)
    depth_odometry_nodelet (nodelet/nodelet)
    dock (nodelet/nodelet)
    executive (nodelet/nodelet)
    fam (nodelet/nodelet)
    framestore (nodelet/nodelet)
    gazebo (astrobee_gazebo/start_server)
    global_transforms (framestore/global_transforms)
    graph_loc (nodelet/nodelet)
    handrail_detect (nodelet/nodelet)
    image_sampler (nodelet/nodelet)
    imu_aug (nodelet/nodelet)
    imu_calibration (rosservice/rosservice)
    llp_gnc (nodelet/nodelet)
    llp_i2c (nodelet/nodelet)
    llp_imu (nodelet/nodelet)
    llp_imu_aug (nodelet/nodelet)
    llp_lights (nodelet/nodelet)
    llp_monitors (nodelet/nodelet)
    llp_pmc (nodelet/nodelet)
    llp_serial (nodelet/nodelet)
    localization_manager (nodelet/nodelet)
    mapper (nodelet/nodelet)
    mlp_arm (nodelet/nodelet)
    mlp_communications (nodelet/nodelet)
    mlp_depth_cam (nodelet/nodelet)
    mlp_dock (nodelet/nodelet)
    mlp_graph_localization (nodelet/nodelet)
    mlp_localization (nodelet/nodelet)
    mlp_management (nodelet/nodelet)
    mlp_mapper (nodelet/nodelet)
    mlp_mobility (nodelet/nodelet)
    mlp_monitors (nodelet/nodelet)
    mlp_multibridge (nodelet/nodelet)
    mlp_perch (nodelet/nodelet)
    mlp_recording (nodelet/nodelet)
    mlp_serial (nodelet/nodelet)
    mlp_states (nodelet/nodelet)
    mlp_vision (nodelet/nodelet)
    mlp_vive (nodelet/nodelet)
    perch (nodelet/nodelet)
    planner_qp (nodelet/nodelet)
    planner_trapezoidal (nodelet/nodelet)
    rviz_node (rviz/rviz)
    spawn_astrobee (astrobee_gazebo/spawn_model)
    states (nodelet/nodelet)
    sys_monitor (nodelet/nodelet)

auto-starting new master
process[master]: started with pid [12308]
ROS_MASTER_URI=http://10.42.0.31:11311

setting /run_id to b91cc490-e3c0-11ed-8836-e454e8d86199
process[rosout-1]: started with pid [12321]
started core service [/rosout]
process[global_transforms-2]: started with pid [12345]
process[rviz_node-3]: started with pid [12351]
process[gazebo-4]: started with pid [12364]
process[astrobee_state_publisher-5]: started with pid [12369]
process[spawn_astrobee-6]: started with pid [12370]
process[llp_gnc-7]: started with pid [12376]
process[llp_imu_aug-8]: started with pid [12393]
process[llp_monitors-9]: started with pid [12405]
process[llp_i2c-10]: started with pid [12406]
process[llp_serial-11]: started with pid [12424]
process[llp_pmc-12]: started with pid [12453]
process[llp_imu-13]: started with pid [12467]
process[llp_lights-14]: started with pid [12476]
process[imu_aug-15]: started with pid [12560]
process[ctl-16]: started with pid [12576]
process[fam-17]: started with pid [12596]
process[mlp_localization-18]: started with pid [12627]
process[mlp_graph_localization-19]: started with pid [12651]
process[mlp_vision-20]: started with pid [12684]
process[mlp_depth_cam-21]: started with pid [12710]
process[mlp_mapper-22]: started with pid [12718]
process[mlp_management-23]: started with pid [12729]
process[mlp_recording-24]: started with pid [12740]
process[mlp_monitors-25]: started with pid [12746]
process[mlp_communications-26]: started with pid [12766]
process[mlp_multibridge-27]: started with pid [12774]
process[mlp_serial-28]: started with pid [12787]
process[mlp_mobility-29]: started with pid [12801]
process[mlp_arm-30]: started with pid [12823]
process[mlp_dock-31]: started with pid [12857]
process[mlp_perch-32]: started with pid [12886]
process[mlp_vive-33]: started with pid [12896]
process[mlp_states-34]: started with pid [12912]
process[localization_manager-35]: started with pid [12915]
process[handrail_detect-36]: started with pid [12935]
process[depth_odometry_nodelet-37]: started with pid [12988]
process[graph_loc-38]: started with pid [13000]
process[image_sampler-39]: started with pid [13010]
process[mapper-40]: started with pid [13026]
process[planner_qp-41]: started with pid [13039]
process[choreographer-42]: started with pid [13068]
process[planner_trapezoidal-43]: started with pid [13099]
process[framestore-44]: started with pid [13126]
process[dock-45]: started with pid [13154]
process[arm-46]: started with pid [13180]
process[perch-47]: started with pid [13192]
process[states-48]: started with pid [13218]
process[executive-49]: started with pid [13237]
process[access_control-50]: started with pid [13272]
process[data_bagger-51]: started with pid [13333]
process[sys_monitor-52]: started with pid [13370]
process[imu_calibration-53]: started with pid [13403]
[spawn_astrobee-6] process has finished cleanly
log file: /home/mei/.ros/log/b91cc490-e3c0-11ed-8836-e454e8d86199/spawn_astrobee-6*.log
[imu_calibration-53] process has finished cleanly
log file: /home/mei/.ros/log/b91cc490-e3c0-11ed-8836-e454e8d86199/imu_calibration-53*.log

When running python3 ~/astrobee/src/tools/gds_helper/src/gds_simulator.py I get the following output: [image]

Here is the output from Logcat in Android Studio while the gds_simulator.py script is trying to connect to the Guest Science Manager app:

1970-02-01 07:31:12.494 3065-3089/gov.nasa.arc.astrobee.android.gs.manager I/DefaultPublisher: Publisher registration failed: Publisher<PublisherDefinition<PublisherIdentifier<NodeIdentifier</guest_science_manager, http://hlp:39977/>, TopicIdentifier</rosout>>, Topic<TopicIdentifier</rosout>, TopicDescription<rosgraph_msgs/Log, acffd30cd6b6de30f120938c17c593fb>>>>
1970-02-01 07:31:27.520 3065-3091/gov.nasa.arc.astrobee.android.gs.manager E/Registrar: Exception caught while communicating with master.
    org.ros.internal.node.xmlrpc.XmlRpcTimeoutException: org.apache.xmlrpc.client.TimingOutCallback$TimeoutException: No response after waiting for 10000 milliseconds.
        at org.ros.internal.node.xmlrpc.XmlRpcClientFactory$1.invoke(XmlRpcClientFactory.java:140)
        at java.lang.reflect.Proxy.invoke(Proxy.java:813)
        at $Proxy0.registerPublisher(Unknown Source)
        at org.ros.internal.node.client.MasterClient.registerPublisher(MasterClient.java:145)
        at org.ros.internal.node.client.Registrar$1$1.call(Registrar.java:138)
        at org.ros.internal.node.client.Registrar$1$1.call(Registrar.java:135)
        at org.ros.internal.node.client.Registrar.callMaster(Registrar.java:111)
        at org.ros.internal.node.client.Registrar.access$100(Registrar.java:51)
        at org.ros.internal.node.client.Registrar$1.call(Registrar.java:135)
        at org.ros.internal.node.client.Registrar$1.call(Registrar.java:132)
        at java.util.concurrent.FutureTask.run(FutureTask.java:237)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:428)
        at java.util.concurrent.FutureTask.run(FutureTask.java:237)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
        at java.lang.Thread.run(Thread.java:761)
     Caused by: org.apache.xmlrpc.client.TimingOutCallback$TimeoutException: No response after waiting for 10000 milliseconds.
        at org.apache.xmlrpc.client.TimingOutCallback.waitForResponse(TimingOutCallback.java:77)
        at org.ros.internal.node.xmlrpc.XmlRpcClientFactory$1.invoke(XmlRpcClientFactory.java:138)
        at java.lang.reflect.Proxy.invoke(Proxy.java:813) 
        at $Proxy0.registerPublisher(Unknown Source) 
        at org.ros.internal.node.client.MasterClient.registerPublisher(MasterClient.java:145) 
        at org.ros.internal.node.client.Registrar$1$1.call(Registrar.java:138) 
        at org.ros.internal.node.client.Registrar$1$1.call(Registrar.java:135) 
        at org.ros.internal.node.client.Registrar.callMaster(Registrar.java:111) 
        at org.ros.internal.node.client.Registrar.access$100(Registrar.java:51) 
        at org.ros.internal.node.client.Registrar$1.call(Registrar.java:135) 
        at org.ros.internal.node.client.Registrar$1.call(Registrar.java:132) 
        at java.util.concurrent.FutureTask.run(FutureTask.java:237) 
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:428) 
        at java.util.concurrent.FutureTask.run(FutureTask.java:237) 
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133) 
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607) 
        at java.lang.Thread.run(Thread.java:761) 

jvbenavi commented 1 year ago

As a clarification that could help later, that Inforce Android OS image you reference, you download it from the Inforce website, right?

erobinson-1997 commented 1 year ago

Yes, I got it from the website. They don't host the file; I had to contact them directly for the iso mentioned in the readme. The OS seems to be working fine. Today I made an Android Studio app for the HLP board that uses rosjava to communicate with a separate ROS package I made.

erobinson-1997 commented 1 year ago

Something that I noticed while continuing to develop for my project was that the hostnames on the HLP development board don't resolve to IP addresses in the Android rosjava apps. I can ping the hostnames from the adb shell, but the hostnames are just regular strings within the context of my MainActivity.

I will try hard-coded IPs in the Guest Science Manager later this evening. I'll be sure to leave my notes and solutions here.

erobinson-1997 commented 1 year ago

Yep, I hard-coded the IP addresses and was able to connect to the Guest Science Manager. Not sure if I should close this issue. There might be a way to get the hostnames to work.

rgarciaruiz commented 1 year ago

Any chance you have this APK source code on a repo accessible to us, so we could try to replicate the issue?

erobinson-1997 commented 1 year ago

I took another look into the issue and I was able to get the guest science manager working with the hostnames in /etc/hosts.

I think the Guest Science Manager successfully connects with the LLP only once on startup, which means the LLP software always needs to be launched first. I did a test where I launched the Guest Science Manager first, and publisher registration fails until the ROS master (the ROS_MASTER_URI endpoint) starts running. We can see it come up on 10.42.0.31 in the output below, and a publisher is successfully registered; however, the GDS local simulator will not connect.

1970-03-10 07:22:58.828 28303-28327/gov.nasa.arc.astrobee.android.gs.manager I/DefaultPublisher: Publisher registration failed: Publisher<PublisherDefinition<PublisherIdentifier<NodeIdentifier</guest_science_manager, http://hlp:46897/>, TopicIdentifier</rosout>>, Topic<TopicIdentifier</rosout>, TopicDescription<rosgraph_msgs/Log, acffd30cd6b6de30f120938c17c593fb>>>>
1970-03-10 07:23:03.889 28303-28329/gov.nasa.arc.astrobee.android.gs.manager I/Registrar: Response<Success, Registered [/guest_science_manager] as publisher of [/rosout], [http://10.42.0.31:33921/]>
1970-03-10 07:23:03.891 28303-28327/gov.nasa.arc.astrobee.android.gs.manager I/DefaultPublisher: Publisher registered: Publisher<PublisherDefinition<PublisherIdentifier<NodeIdentifier</guest_science_manager, http://hlp:46897/>, TopicIdentifier</rosout>>, Topic<TopicIdentifier</rosout>, TopicDescription<rosgraph_msgs/Log, acffd30cd6b6de30f120938c17c593fb>>>>
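
To make the failed-then-registered sequence easier to spot in a long Logcat capture, the DefaultPublisher lines can be filtered with a small script. This is a rough sketch that assumes the line layout shown above; the regular expression and the name `registration_events` are my own and are not part of rosjava or the Guest Science Manager.

```python
import re

# Matches DefaultPublisher lines in the layout shown in the Logcat output above.
REGISTRATION = re.compile(
    r"^(?P<time>\S+ \S+) .*?DefaultPublisher: "
    r"Publisher (?P<status>registered|registration failed): "
    r".*?NodeIdentifier<(?P<node>[^,]+), (?P<uri>[^>]+)>"
)


def registration_events(lines):
    """Return (timestamp, status, node, node_uri) for each publisher registration line."""
    events = []
    for line in lines:
        match = REGISTRATION.match(line)
        if match:
            events.append(
                (match["time"], match["status"], match["node"], match["uri"])
            )
    return events
```

Run against the three lines above, this would yield one "registration failed" event followed by one "registered" event for /guest_science_manager; the Registrar line is skipped because it does not come from DefaultPublisher.
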

erobinson-1997 commented 1 year ago

I would like to add that the only changes I made to the code are shown in the screenshot below. I have a public forked repository here, but it doesn't include the hard-coded IP addresses.

Screenshot from 2023-06-01 16-21-33

rgarciaruiz commented 1 year ago

I want to make sure I understand: you are saying you were able to finally have the GS manager connect (using hostnames) to the LLP (based on the logcat output) but the gds_simulator.py still can't find it, correct?

Could you please check the output of /gs/gs_manager/config and /gs/gs_manager/state? If you see output there about your GS APK, then the problem is the gds_simulator.py or its environment. If you do not see output on those topics then it means the GS manager is still not really connecting. I am also assuming you have your GS APK installed to your HLP board.

Please make sure to always start the Astrobee ROS simulator first, then run gs_manager.sh restart. Run this command every time you restart the ROS sim, and/or when the network resets. Also, make sure the HLP screen is always on and awake. Android sometimes blocks network communications if that is not the case.
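
The ordering described above can be guarded with a small check that waits until the ROS master port accepts connections before the manager is restarted. This is only a sketch under a few assumptions: the master listens on port 11311 (as in the roslaunch output earlier in this thread), the helper names `master_reachable` and `wait_for_master` are hypothetical, and a successful TCP connection only shows that something is listening on that port, not that the master is fully initialized.

```python
import socket
import time


def master_reachable(host, port=11311, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_master(host, port=11311, attempts=30, delay=1.0):
    """Poll host:port up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if master_reachable(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, calling `wait_for_master("llp")` on the development PC and running `gs_manager.sh restart` only after it returns True would follow the order recommended above.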

Let me know how it goes.

erobinson-1997 commented 1 year ago

I did a test where I was able to connect to the guest science manager with the gds_simulator.py script. I then was presented with a list of guest science applications, and I launched a guest science app.

Everything seems to be working fine with no issue. I think I was just launching things out of order until I coincidentally launched things correctly when I changed the hostnames to hard-coded IPs. The Logcat output was also displaying the hostnames instead of showing the actual IP addresses, so I thought they were being interpreted as string literals. When I connect successfully while using hostnames I get a mix of raw IPs and hostnames in my Logcat output as shown below:

Screenshot from 2023-06-01 16-29-08

erobinson-1997 commented 1 year ago

I am going to re-clone, re-build, and re-install just to make sure. I'll close this issue if everything works fine.