google-research / pix2seq

Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
Apache License 2.0

About sequence formulation for instance segmentation #16

Open volgachen opened 2 years ago

volgachen commented 2 years ago

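Excuse me, I am interested in Pix2Seq and am trying to understand it better. I wonder how instance segmentation targets are formulated for training. To be more specific:

1. How to convert COCO annotations into target sequences? (especially those with more than one polygon)
2. Do the starting point and direction matter?
3. Is there any design handling the varying length of target sequences?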

It would be nice if you can provide these details at your convenience. Thank you!

chentingpc commented 2 years ago

How to convert COCO annotations into target sequences? (especially those with more than one polygon)

We use a separator token to indicate the boundary between different polygons.
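A minimal sketch of the separator idea, not the actual pix2seq code. It assumes each polygon is a flat COCO-style list `[x1, y1, x2, y2, ...]` whose coordinates are already quantized to integer tokens; the name `join_polygons` and the value of `SEPARATOR_TOKEN` are hypothetical.

```python
from typing import List

# Hypothetical token id, assumed to lie outside the coordinate range 0..999.
SEPARATOR_TOKEN = 1000


def join_polygons(polygons: List[List[int]],
                  separator: int = SEPARATOR_TOKEN) -> List[int]:
    """Flatten several polygons into one sequence, with a separator between them."""
    tokens: List[int] = []
    for index, polygon in enumerate(polygons):
        if index > 0:
            # Mark the boundary between the previous polygon and this one.
            tokens.append(separator)
        tokens.extend(polygon)
    return tokens


# One instance annotated with two polygons (for example, an occluded object).
polygons = [[10, 20, 30, 20, 30, 40], [50, 60, 70, 60, 70, 80]]
sequence = join_polygons(polygons)
```

An instance with a single polygon produces no separator at all, so the common case stays as short as possible.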

Do the starting point and direction matter?

We randomly pick the starting point and follow the same direction as in the annotation.
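One way to read this is as a random cyclic rotation of the vertex list, which changes the starting point but keeps the traversal direction. The sketch below assumes that reading and a flat `[x1, y1, x2, y2, ...]` polygon; `rotate_polygon` is a hypothetical helper, not a function from the repository.

```python
import random
from typing import List, Optional


def rotate_polygon(polygon: List[float],
                   rng: Optional[random.Random] = None) -> List[float]:
    """Start the polygon at a random vertex while keeping the vertex order."""
    rng = rng or random.Random()
    num_points = len(polygon) // 2
    if num_points == 0:
        return list(polygon)
    # Pick a starting vertex; each vertex occupies two entries (x, y).
    offset = 2 * rng.randrange(num_points)
    # A cyclic shift preserves the direction (clockwise or counter-clockwise).
    return polygon[offset:] + polygon[:offset]


square = [0, 0, 1, 0, 1, 1, 0, 1]
rotated = rotate_polygon(square, random.Random(0))
```

Passing a seeded `random.Random` makes the rotation reproducible, which is convenient when debugging data augmentation.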

Is there any design handling the varying length of target sequences?

We use an ending token to indicate the end of the sequence. We simply truncate the sequence if it turns out to be longer than the predefined max seq len (which is rare and happens only when the annotation is very fine-grained).
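A minimal sketch of the ending token plus truncation, under stated assumptions: `EOS_TOKEN`, `MAX_SEQ_LEN`, and `finalize_sequence` are hypothetical names and values, and truncation is applied after the ending token is appended, so an overlong sequence loses its ending token. The thread does not say which order the codebase uses.

```python
from typing import List

EOS_TOKEN = 1     # hypothetical id for the ending token
MAX_SEQ_LEN = 8   # hypothetical limit, kept tiny for illustration


def finalize_sequence(tokens: List[int],
                      max_seq_len: int = MAX_SEQ_LEN,
                      eos: int = EOS_TOKEN) -> List[int]:
    """Append the ending token, then cut the sequence at max_seq_len."""
    with_eos = list(tokens) + [eos]
    # Assumption: overlong sequences are simply cut off at the limit.
    return with_eos[:max_seq_len]


short = finalize_sequence([5, 6, 7])               # fits, ends with EOS_TOKEN
long = finalize_sequence(list(range(10, 30)))      # truncated to MAX_SEQ_LEN
```

Padding shorter sequences to a fixed length for batching is a separate step that is not covered here.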

Hope this helps.

volgachen commented 2 years ago

Thank you for the detailed response. I've got it now! It's really surprising that such a simple solution achieves such good results.

gg22mm commented 10 months ago

Could you explain how to do this in more detail?