Human-friendly data collection system for object segmentation

jsk-ros-pkg / jsk_demos

JSK demo programs

https://github.com/jsk-ros-pkg/jsk_demos


Human-friendly data collection system for object segmentation #1218

Open wkentaro opened 7 years ago

wkentaro commented 7 years ago

The components of this system are:

1. Speech recognition
2. Finding the moving object at the pixel level
3. Tracking the object at the pixel level or ROI level

For 1. @furushchev Do you know a good ROS node for speech recognition?

For 3. @iKrishneel Do you know a good ROS node for pixel-wise tracking? I expect it would use point clouds, because ROI-level tracking is the common approach for 2D images, e.g. ConsensusTracking.

tl;dr

@inabajsk and I talked about the importance of a human-friendly data collection interface for object segmentation. For example:

Human says to Pepper "Now I teach you this object".
Pepper says "Ok, what is that object name?"
Human says "wallet"
Pepper says "Ok, please show me"
Human moves the object in front of Pepper.
Pepper then does the following:
- Find moving object
- Track it
- Record it
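
The exchange above could be modeled as a small state machine. The following is only a minimal sketch: the class name `TeachingDialogue`, the state names, and the `on_motion_detected` hook are hypothetical and not part of any existing jsk package. It assumes a speech recognition node delivers plain-text utterances and a separate motion detector reports when a moving object appears.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()             # waiting for the teaching trigger phrase
    ASK_NAME = auto()         # robot asked for the object name
    WAIT_FOR_OBJECT = auto()  # robot asked the human to show the object
    RECORDING = auto()        # moving object found; images are being recorded


class TeachingDialogue:
    """Hypothetical dialogue manager for the teaching scenario (illustration only)."""

    TRIGGER = "now i teach you this object"

    def __init__(self):
        self.state = State.IDLE
        self.label = None  # object name given by the human

    def on_speech(self, text):
        """Consume one recognized utterance and return the robot's reply, or None."""
        text = text.strip().lower()
        if self.state is State.IDLE and text == self.TRIGGER:
            self.state = State.ASK_NAME
            return "Ok, what is that object name?"
        if self.state is State.ASK_NAME and text:
            self.label = text
            self.state = State.WAIT_FOR_OBJECT
            return "Ok, please show me"
        return None

    def on_motion_detected(self):
        """Called by the (assumed) motion detector; returns True while recording."""
        if self.state is State.WAIT_FOR_OBJECT:
            self.state = State.RECORDING
        return self.state is State.RECORDING
```

Keeping the dialogue state separate from perception means the speech front end (a ROS node, a browser, or a phone app) can be swapped without touching the recording logic.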

furushchev commented 7 years ago

For speech recognition, I think Google speech recognition is the de facto standard. I have a simple speech recognition node for it. If you tell me where to commit it, I'll send a pull request. Or you can easily try rwt_speech_recognition (but it needs Chrome).

iKrishneel commented 7 years ago

@wkentaro Do you want to track in 3D or 2D?

wkentaro commented 7 years ago

@furushchev

> For speech recognition, I think Google speech recognition is the de facto standard. I have a simple speech recognition node for it. If you tell me where to commit it, I'll send a pull request. Or you can easily try rwt_speech_recognition (but it needs Chrome).

Could you please send a PR to jsk_recognition? (Probably to jsk_perception or a new package.)

Is this the Google speech recognition node? https://github.com/jsk-ros-pkg/jsk_recognition/pull/1249

@iKrishneel

Either is fine with me, but I prefer pixel-/point-wise tracking.

wkentaro commented 7 years ago

For 2. @iory Do you know of a program for finding moving objects? I expect you know a lot about finding an object held by a human, for imitation learning.

k-okada commented 7 years ago

2017-04-16 18:52 GMT+09:00 Kentaro Wada notifications@github.com:

> Could you please send PR to jsk_recognition? (probably jsk_perception or new package)

I would recommend using https://play.google.com/store/apps/details?id=org.ros.android.android_voice_message, since the mic on a mobile phone is highly optimized for recognizing the human voice.

-- ◉ Kei Okada

iKrishneel commented 7 years ago

@wkentaro Pixel-wise tracking is not common. Although there are methods that do it, they are not as competitive as region-based or part-based methods. It also suffers under harsh image conditions. I wrote a region-based tracking algorithm in which the object region is defined by a bounding box. I will put it on my GitHub soon.

wkentaro commented 7 years ago

@iKrishneel @k-okada @furushchev Thank you all.

For now, I have chosen to use optical flow. The demo movies are below:

HRP2 camera (RGB-D): https://drive.google.com/open?id=0B9P1L--7Wd2vaVlHRC10Z2dwQkk
Plain camera (RGB only): https://drive.google.com/open?id=0B9P1L--7Wd2vR3ktR0xUSFRMdkk

The collected images look like the following:

What I found:

Optical flow works.
Using depth is not effective, because it yields small object images (due to the depth range).
Tracking is not needed, because I want to collect an image whenever the object pose changes.
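
To make these findings concrete, here is a minimal sketch of how a dense flow field could be turned into a moving-object mask, a bounding box, and a decision on whether to record the frame. It is not the code used for the demos above. The function names and both thresholds (`min_magnitude`, `min_fraction`) are hypothetical, and the flow field is assumed to be already computed as per-pixel `(dx, dy)` displacements, for example by a dense method such as OpenCV's `calcOpticalFlowFarneback` (the thread does not say which optical flow method was actually used).

```python
import math


def moving_mask(flow, min_magnitude=1.0):
    """Return a 2D boolean mask: True where the flow vector exceeds min_magnitude pixels."""
    return [[math.hypot(dx, dy) > min_magnitude for dx, dy in row] for row in flow]


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the moving pixels, or None if nothing moves."""
    points = [(x, y) for y, row in enumerate(mask) for x, moving in enumerate(row) if moving]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def should_record(mask, min_fraction=0.02):
    """Record a frame only when enough pixels move, i.e. the object pose is likely changing."""
    total = sum(len(row) for row in mask)
    moving = sum(sum(row) for row in mask)  # True counts as 1
    return total > 0 and moving / total >= min_fraction
```

Because a frame is saved only while the object is moving, no separate tracker is required in this sketch: the mask of the current frame already localizes the object, and a static scene produces no recordings.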