sbrodeur / ros-icreate-bbb

ROS (Robotic Operating System) packages for the iRobot Create with onboard BeagleBone Black
BSD 3-Clause "New" or "Revised" License

Integrate language tagging for actions #40

Open sgcarrier opened 7 years ago

sgcarrier commented 7 years ago

We want to add language to our dataset so that the robot can convey its state in words. This will make it easier for humans to interpret and will provide a bridge to IGLU. We will focus on actions the robot performs rather than perceptions it has. In other words, it will say "Moving Forward", but not "I see an apple".

I propose that for each recording we generate a YAML (.yml) file, which will in turn be used to generate a rosbag containing only the language messages. After that, we fuse the language and original rosbags together. Here is an example of a YAML file in this context:

```yaml
--- !Example
Location: FLOOR3
Session: S3
Date: 20161212T163700
SystemTimeStart: 12345678.123456
actions:

  - name: "Move"
    rank: 0
    time: [0.0, 15.1, 65.7]
    duration: [14.0, 49.2, 10.5]

  - name: "Forward"
    rank: 1
    time: [0.0, 15.1, 65.7]
    duration: [14.0, 49.2, 10.5]
```

In this case, the robot moved forward at times 0.0, 15.1 and 65.7, for durations of 14.0, 49.2 and 10.5 respectively.
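A minimal sketch of how the YAML-to-rosbag and fusion steps could look, assuming PyYAML and the standard rosbag Python API. The file names (`language.yml`, `session.bag`, `merged.bag`), the `/language` topic, and the use of `std_msgs/String` are illustrative assumptions, not decisions; also note that `yaml.safe_load` rejects unknown tags, so the custom `!Example` document tag would need a registered constructor or would have to be dropped.

```python
# Sketch: generate a language-only rosbag from the annotation YAML,
# then fuse it with the original recording. Topic name, file names
# and message type are assumptions for illustration.
import yaml
import rospy
import rosbag
from std_msgs.msg import String

# Assumes the '!Example' tag was dropped or a constructor was
# registered for it (yaml.safe_load rejects unknown tags).
with open('language.yml') as f:
    annotation = yaml.safe_load(f)

start = annotation['SystemTimeStart']

with rosbag.Bag('language.bag', 'w') as bag:
    for action in annotation['actions']:
        for onset, duration in zip(action['time'], action['duration']):
            # One message at the action onset; the duration could
            # also be carried by a custom message type instead.
            stamp = rospy.Time.from_sec(start + onset)
            bag.write('/language', String(data=action['name']), stamp)

# Fuse the language bag with the original recording into one bag.
with rosbag.Bag('merged.bag', 'w') as out:
    for name in ('session.bag', 'language.bag'):
        with rosbag.Bag(name) as b:
            for topic, msg, stamp in b.read_messages():
                out.write(topic, msg, stamp)
```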

Some actions could be populated automatically in the YAML file (for example, forward, turn, move, stop, etc.) since they have implicit and clear associations with certain values (forward movement is implied when both wheels are driven with the same power). However, some higher-level actions will have to be added to the YAML file by humans. For example, "wandering" doesn't have an implicit associated value in the rosbags, so we will have to set it ourselves.
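As a rough sketch of the automatic case, forward segments could be extracted from the motor command topic of the original bag. The topic name `/irobot_create/cmd_raw` and the `left`/`right` wheel fields below are assumptions for illustration, not the actual interface.

```python
# Sketch: detect "Forward" segments in an existing rosbag from motor
# commands. Topic name and message fields are assumptions.
import rosbag

segments = []      # (onset, duration) pairs, in seconds
seg_start = None

with rosbag.Bag('session.bag') as bag:
    for _, msg, t in bag.read_messages(topics=['/irobot_create/cmd_raw']):
        # Forward motion is implied when both wheels are driven with
        # the same positive power.
        forward = msg.left > 0 and abs(msg.left - msg.right) < 1e-3
        if forward and seg_start is None:
            seg_start = t.to_sec()
        elif not forward and seg_start is not None:
            segments.append((seg_start, t.to_sec() - seg_start))
            seg_start = None

# 'segments' can then be written into the 'time'/'duration' lists of
# the YAML file for the "Forward" action.
```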

sbrodeur commented 7 years ago

For the manual annotations, we could use the rqt_bag tool (screenshot: rqt_bag_annotation).

We don't actually see the video (only thumbnails), but it is easy to navigate the timeline for useful sensors (e.g. gyroscope and motor velocities). We also have direct access to the timestamps.

sbrodeur commented 7 years ago

Since we can generate videos from sensors, we could also use the ELAN annotation tool: http://www.lrec-conf.org/proceedings/lrec2004/pdf/480.pdf

It is supposed to support multiple synchronized media sources, and some other projects are already using it.

sgcarrier commented 7 years ago

Tasks to complete:

Here is a graph of possible labels, their dependencies, and their possible attributes (image: roomba_actions_graph).