Proposal of new 3D object segmentation pipeline

wkentaro commented 7 years ago

I think the current 3D object segmentation pipeline is messy, and it could be rebuilt by integrating nodes from the jsk_recognition package (currently there is a lot of apc-specific code in jsk_2016_01_baxter_apc). I have already worked on generalizing 3D object segmentation with FCN + PointCloudRegistration, and validated its efficiency in the Humanoids 2016 papers. Below is the proposed pipeline:

k-okada commented 7 years ago

My comments are:

- apply the same pipeline to the stow task
- create a general pipeline which does not depend on the apc environment, maybe from camera image to label image and/or pose & size of the target object
  - enable it to run when no yaml file is installed
  - enable it to run when no attention clipper is running
  - enable it to use the original vgg network so that we can run this pipeline on a different robot or a tabletop depth camera

◉ Kei Okada

On Thu, Aug 4, 2016 at 2:30 AM, Kentaro Wada notifications@github.com wrote:

I think the current 3D object segmentation pipeline is messy, and it could be rebuilt by integrating nodes from the jsk_recognition package (currently there is a lot of apc-specific code in jsk_2016_01_baxter_apc). I have already worked on generalizing 3D object segmentation with FCN + PointCloudRegistration, and validated its efficiency in the Humanoids 2016 papers. Below is the proposed pipeline:

[image: picture1] https://cloud.githubusercontent.com/assets/4310419/17374157/3e7943f0-59e6-11e6-873c-c01819ec8204.png

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/start-jsk/jsk_apc/issues/1865, or mute the thread https://github.com/notifications/unsubscribe-auth/AAeG3FVDQnGBMpzq3GEEcji9YiNRc7wWks5qcNAzgaJpZM4Jb3jh .

wkentaro commented 7 years ago

apply the same pipeline to the stow task

The same pipeline is already used in the tasks below:

- APC by HRP2
- Fetching a can from fridge by PR2
- Fetching a can from fridge by HRP2

enable it to run when no yaml file is installed

Well, in that case, how is the target object specified?

enable it to run when no attention clipper is running

Why is it important? I think it is reasonable that the shelf position is localized roughly.

enable it to use the original vgg network so that we can run this pipeline on a different robot or a tabletop depth camera

Actually, VGG network is for Object Recognition and FCN is for Object Segmentation, so they are different networks.

enable it to use the original vgg network so that we can run this pipeline on a different robot or a tabletop depth camera

What do you mean by "so that"? The current network can also be used on a different robot and with a tabletop depth camera. FYI, tabletop object segmentation for fetching objects is being tackled by @h-kamada with PR2.

k-okada commented 7 years ago

In the comments below, "vgg" means an FCN built on the vgg network, which I think of as general object recognition/segmentation, whereas the pipeline in this PR is for recognition/segmentation of the 40 apc objects. Is that correct?

We're also assuming four situations:

1) those who sit next to an apc shelf and can access both the shelf and the apc objects
2) those who have the apc objects but no apc shelf, and try to pick an apc object on a table
3) those who do not have any apc items, and pick a coke can on a table
4) those who do not have the apc objects but have a shelf, and pick a coke can in the shelf

and two conditions:

a) the robot knows what to pick (apc rule): before execution, the robot thinks "there must be a coke in this shelf, so I need to pick it"
b) the robot wants to know what is in the shelf and then decides what to pick: the system returns "there are a Cheez-It and a coke in the shelf", and the robot then decides "ok, pick the Cheez-It, then the coke"

Well, in that case, how is the target object specified?

I'm thinking about case b). I think it is possible to return object labels from all 40 possible objects if the target object is not given.
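As a rough illustration of the difference between conditions a) and b), here is a minimal sketch that works on a label image only. It is not code from jsk_apc: the function name `labels_to_report`, the `BACKGROUND` id, the `min_pixels` threshold and the label ids are all assumptions made for the example.

```python
import numpy as np

BACKGROUND = 0  # assumed id of the background label


def labels_to_report(label_img, target=None, min_pixels=50):
    """Return the label ids the robot should consider.

    With a target (condition a), report it only if enough pixels carry it.
    Without a target (condition b), report every non-background label found.
    """
    ids, counts = np.unique(label_img, return_counts=True)
    present = [int(i) for i, c in zip(ids, counts)
               if i != BACKGROUND and c >= min_pixels]
    if target is None:
        return present
    return [target] if target in present else []


# Toy label image with two hypothetical objects and one noisy pixel.
label_img = np.zeros((20, 20), dtype=np.int32)
label_img[2:12, 2:12] = 7     # 100 pixels of hypothetical label 7
label_img[14:18, 0:20] = 23   # 80 pixels of hypothetical label 23
label_img[0, 0] = 5           # a single noisy pixel, below min_pixels

all_labels = labels_to_report(label_img)         # condition b)
only_seven = labels_to_report(label_img, 7)      # condition a), target visible
missing = labels_to_report(label_img, 9)         # condition a), target absent
```

The pixel-count threshold is only there so that a few mislabeled pixels do not show up as an object in case b); a suitable value would depend on the camera and the scene.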

Why is it important? I think it is reasonable that the shelf position is localized roughly.

I'm assuming a scene without a shelf, so just look for objects in the whole scene if no attention clipper is given.

enable it to use the original vgg network so that we can run this pipeline on a different robot or a tabletop depth camera

Actually, VGG network is for Object Recognition and FCN is for Object Segmentation, so they are different networks.

As far as I understand, if we use the vgg network without fine-tuning on fcn, the network can detect any object, not only the 40 apc objects. I assume the result is not so good, but that's ok for people who are just starting to use this pipeline.

What do you mean by "so that"? The current network can also be used on a different robot and with a tabletop depth camera. FYI, tabletop object segmentation for fetching objects is being tackled by @h-kamada with PR2.

It seems you're assuming use case a-1) with baxter and hrp2, and a-2) with pr2, but both need an attention clipper. If we can run the pipeline for a-2) without an attention clipper, @h-kamada will be able to show tabletop segmentation within 1 hour, and if he spends a few more hours, he can use a-2) with an attention clipper. I think that's our goal.


k-okada commented 7 years ago

On Fri, Aug 5, 2016 at 10:12 AM, Kei Okada k-okada@jsk.t.u-tokyo.ac.jp wrote:

I'm assuming a scene without a shelf, so just look for objects in the whole scene if no attention clipper is given.

Or setting the default size of the attention clipper to 5 m x 5 m x 5 m may be ok.

◉ Kei Okada
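The idea of a large default clipping box can be sketched with plain NumPy. This is only an illustration of box clipping on an (N, 3) point array, not the attention clipper node from jsk_recognition, which works on ROS messages with a configured box pose. The function name `clip_to_box`, the axis-aligned box and the sample points are assumptions.

```python
import numpy as np


def clip_to_box(points, center=(0.0, 0.0, 0.0), size=(5.0, 5.0, 5.0)):
    """Keep the points of an (N, 3) array that lie inside an axis-aligned box."""
    points = np.asarray(points, dtype=float)
    half = np.asarray(size, dtype=float) / 2.0
    offset = np.abs(points - np.asarray(center, dtype=float))
    inside = np.all(offset <= half, axis=1)
    return points[inside]


cloud = np.array([
    [0.5, 0.2, 1.0],    # inside the default 5 m box
    [2.4, -2.4, 0.0],   # inside, close to the boundary
    [3.0, 0.0, 0.0],    # outside along x
    [0.0, 0.0, -4.0],   # outside along z
])

# Default: a 5 m x 5 m x 5 m box around the origin, i.e. almost no clipping.
kept = clip_to_box(cloud)

# A shelf- or table-specific setup would pass a much tighter box instead.
tight = clip_to_box(cloud, center=(0.5, 0.2, 1.0), size=(0.4, 0.4, 0.4))
```

With such a default, the rest of the pipeline can stay the same whether or not a scene-specific box has been configured.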

wkentaro commented 7 years ago

In the comments below, "vgg" means an FCN built on the vgg network, which I think of as general object recognition/segmentation, whereas the pipeline in this PR is for recognition/segmentation of the 40 apc objects. Is that correct?

Maybe you are assuming that the vgg network can recognize 1000 kinds of objects (general in terms of the number of kinds) in various situations like on a road, in a room, at an airport, and at a school (general in terms of the environment where the object is located); however, there are some misunderstandings.

The VGG network and the FCN network are totally different networks for different tasks: the former is for object recognition and the latter is for segmentation. So, in the fine-tuning process, the weights must be copied from VGG to FCN only for the layers with the same matrix size, and another dataset for object segmentation is required to transfer the object recognition network (VGG) to an object segmentation one (FCN). It would be desirable to have a dataset with a large number of object classes. The dataset used in the original paper has only 21 classes, and to my knowledge there is no segmentation dataset with 1000 object classes; the largest one has 80 categories.
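The weight transfer described above can be sketched as follows. This is a toy illustration with plain dictionaries of NumPy arrays, not the actual training code used for this pipeline. The function name `copy_matching_weights`, the layer names and the shapes are all made up for the example.

```python
import numpy as np


def copy_matching_weights(src, dst):
    """Copy arrays from src into dst where the layer name and shape agree.

    Returns the names of the layers that were copied.
    """
    copied = []
    for name, weights in src.items():
        if name in dst and dst[name].shape == weights.shape:
            dst[name] = weights.copy()
            copied.append(name)
    return copied


rng = np.random.default_rng(0)

# Toy stand-ins for the two networks; names and shapes are hypothetical.
recognition_net = {
    "conv1": rng.normal(size=(8, 3, 3, 3)),
    "conv2": rng.normal(size=(16, 8, 3, 3)),
    "fc_out": rng.normal(size=(1000, 64)),    # 1000-way classifier
}
segmentation_net = {
    "conv1": np.zeros((8, 3, 3, 3)),
    "conv2": np.zeros((16, 8, 3, 3)),
    "fc_out": np.zeros((21, 64)),             # 21-class output, shape differs
    "upsample": np.zeros((21, 21, 4, 4)),     # no counterpart in the source
}

copied = copy_matching_weights(recognition_net, segmentation_net)
```

The layers that are skipped (here the output layer and the upsampling layer) are the ones that still have to be learned from a segmentation dataset, which is why the number of classes in that dataset matters.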

k-okada commented 7 years ago

It would be desirable to have a dataset with a large number of object classes. The dataset used in the original paper has only 21 classes (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/), and to my knowledge there is no segmentation dataset with 1000 object classes; the largest one has 80 categories (http://mscoco.org/).

I see, and that's ok. I think it is still useful, and "object recognition" itself is useful too, because what you call "object segmentation" is "image segmentation" + "object recognition". Segmentation has been considered a difficult problem in computer vision; however, robot vision has been trying to solve this problem with active perception strategies, for example picking up an object and moving it in front of the eyes, or pushing an object and finding the pixels that moved.
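The "push an object and find the moved pixels" strategy can be illustrated with simple frame differencing. This is a minimal sketch under strong assumptions (static camera, constant lighting, two already aligned images), and the function name `moved_pixel_mask` and the threshold are hypothetical.

```python
import numpy as np


def moved_pixel_mask(before, after, threshold=10.0):
    """Boolean mask of pixels whose intensity changed by more than threshold."""
    # Convert to float first so that uint8 subtraction cannot wrap around.
    diff = np.abs(after.astype(float) - before.astype(float))
    if diff.ndim == 3:           # colour image: use the largest channel change
        diff = diff.max(axis=2)
    return diff > threshold


before = np.full((6, 6), 50, dtype=np.uint8)
before[1:3, 1:3] = 200           # object at its original position
after = np.full((6, 6), 50, dtype=np.uint8)
after[1:3, 3:5] = 200            # same object after being pushed to the right

mask = moved_pixel_mask(before, after)
```

The mask covers both the region the object left and the region it moved into, so a real system would still need a further step to decide which of the two is the object.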


wkentaro commented 7 years ago

I think it is still useful, and "object recognition" itself is useful too, because what you call "object segmentation" is "image segmentation" + "object recognition". Segmentation has been considered a difficult problem in computer vision; however, robot vision has been trying to solve this problem with active perception strategies, for example picking up an object and moving it in front of the eyes, or pushing an object and finding the pixels that moved.

I see the general idea. Maybe we should discuss it somewhere else, such as an issue in jsk_recognition, email, or in person.

wkentaro commented 7 years ago

I'm assuming a scene without a shelf, so just look for objects in the whole scene if no attention clipper is given.

It seems you're assuming use case a-1) with baxter and hrp2, and a-2) with pr2, but both need an attention clipper. If we can run the pipeline for a-2) without an attention clipper, @h-kamada will be able to show tabletop segmentation within 1 hour, and if he spends a few more hours, he can use a-2) with an attention clipper. I think that's our goal.

Well, tasks without an attention clipper already exist, as listed below:

- APC by Baxter (with attention clipper for bin boxes)
- APC by HRP2 (with attention clipper for bin boxes)
- Fetching a can from fridge by PR2 (without attention clipper)
- Fetching a can from fridge by HRP2 (without attention clipper)

(I'm not sure why you are paying attention so much to the attention clipper)

wkentaro commented 7 years ago

Or setting the default size of the attention clipper to 5 m x 5 m x 5 m may be ok.

Do you mean I should add a 3D object segmentation launch file for general situations somewhere like jsk_pcl_ros?

k-okada commented 7 years ago

Using an attention clipper is environment dependent: you have to know beforehand whether there is a shelf/table, unless you put the attention clipper on the gripper.

◉ Kei Okada


wkentaro commented 7 years ago

Well, I think the motion node and the recognition pipeline are also environment dependent, so I'm not sure what the benefit of removing the attention clipper is.

On Monday, August 8, 2016, Kei Okada notifications@github.com wrote:

Using an attention clipper is environment dependent: you have to know beforehand whether there is a shelf/table, unless you put the attention clipper on the gripper.


和田健太郎 / Kentaro Wada http://wkentaro.com

start-jsk / jsk_apc

Proposal of new 3D object segmentation pipeline #1865