SHI-Labs / OneFormer

OneFormer: One Transformer to Rule Universal Image Segmentation, arXiv 2022 / CVPR 2023
https://praeclarumjj3.github.io/oneformer
MIT License

Potential code bug, but still a good model #110

Open huydung179 opened 6 months ago

huydung179 commented 6 months ago

Hi,

I found what looks like a code bug. Could you verify it?

In the oneformer_transformer_decoder.py file, line 432:

feats = self.pe_layer(mask_features, None)  # positional encoding of mask_features

out_t, _ = self.class_transformer(feats, None,  # positional encoding passed as src
                                  self.query_embed.weight[:-1],
                                  self.class_input_proj(mask_features),  # projected features passed as pos_embed
                                  tasks if self.use_task_norm else None)

I think the positional embedding is being used as the source features here. The forward method of self.class_transformer is:

def forward(self, src, mask, query_embed, pos_embed, task_token=None):
    # src carries the content features; pos_embed carries the positional encoding
    ...
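For context, here is a minimal sketch (my own illustration with assumed names, not OneFormer's actual layer) of how a DETR-style attention layer typically consumes src versus pos_embed: the positional encoding is only added to queries and keys, while the values come from src alone, so swapping the two arguments changes what the attention actually mixes.

import torch
import torch.nn as nn

class SketchEncoderLayer(nn.Module):
    # Illustrative DETR-style self-attention layer; names are assumptions.
    def __init__(self, d_model=256, nhead=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead)

    def forward(self, src, pos_embed):
        # Position is added to queries and keys only; the values are pure
        # content, so passing positions as src feeds position into the values.
        q = k = src + pos_embed
        out, _ = self.self_attn(q, k, value=src)
        return out

layer = SketchEncoderLayer()
src = torch.randn(100, 2, 256)  # (seq_len, batch, d_model)
pos = torch.randn(100, 2, 256)
out = layer(src, pos)           # content-carrying output, (100, 2, 256)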

I think it should be:

feats = self.class_input_proj(mask_features)  # projected content features as src

out_t, _ = self.class_transformer(feats, None,
                                  self.query_embed.weight[:-1],
                                  self.pe_layer(mask_features, None),  # positional encoding as pos_embed
                                  tasks if self.use_task_norm else None)
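This might also explain why the model still performs well despite the mix-up: the two tensors are shape-compatible, so the swapped call never raises an error. A minimal sanity sketch, using hypothetical stand-ins for pe_layer and class_input_proj (my assumptions, shape-faithful only):

import torch
import torch.nn as nn

B, C, H, W = 2, 256, 32, 32
mask_features = torch.randn(B, C, H, W)

# Hypothetical stand-ins: a position-encoding-like map and a learned 1x1
# projection, both returning a tensor of shape (B, C, H, W).
pe_layer = lambda x, mask: torch.randn_like(x)
class_input_proj = nn.Conv2d(C, C, kernel_size=1)

pos = pe_layer(mask_features, None)
content = class_input_proj(mask_features)
assert pos.shape == content.shape  # same shape, so the swap fails silently

Since nothing crashes, training proceeds normally; the class transformer would just attend over position-derived values instead of the projected image features.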