facebookresearch / detr

End-to-End Object Detection with Transformers
Apache License 2.0

Sigmoid in bounding boxes prediction? #587

Open ahorlbeck opened 1 year ago

ahorlbeck commented 1 year ago

When I train with fake bounding box labels, e.g. negative values or values greater than 1, all predicted bounding box coordinates still fall in (0, 1).

Why is that?

Following the code for the FFN for the bounding boxes:

class MLP(nn.Module):
  """ Very simple multi-layer perceptron (also called FFN)"""

  def __init__(self, input_dim, hidden_dim, output_dim, num_layers):
      super().__init__()
      self.num_layers = num_layers
      h = [hidden_dim] * (num_layers - 1)
      self.layers = nn.ModuleList(nn.Linear(n, k) for n, k in zip([input_dim] + h, h + [output_dim]))

  def forward(self, x):
      for i, layer in enumerate(self.layers):
          x = F.relu(layer(x)) if i < self.num_layers - 1 else layer(x)
      return x

the last layer should be a plain linear layer with no activation, so its output should be unbounded?

Is there a sigmoid missing?

Thx!

JeavanCode commented 7 months ago

The sigmoid is applied in the DETR class, right after the MLP module is called (line 68 in models/detr).
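To illustrate why that explains the (0, 1) range, here is a minimal pure-Python sketch. It is not the repository's code: `sigmoid`, `raw_box`, and `pred_box` are hypothetical names, the raw values are made up, and the (cx, cy, w, h) layout is an assumption for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function: maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical raw outputs of the final linear layer for one box,
# assumed here to be laid out as (cx, cy, w, h). These are unbounded.
raw_box = [-3.0, 0.0, 2.5, 7.0]

# Applying the sigmoid afterwards squashes each coordinate into (0, 1).
pred_box = [sigmoid(v) for v in raw_box]
print(pred_box)

# Every predicted coordinate lies strictly between 0 and 1, so a label
# such as -0.5 or 1.7 can never be matched exactly.
assert all(0.0 < v < 1.0 for v in pred_box)
```

So the MLP itself does end in a plain linear layer, but labels outside [0, 1] would probably need to be normalized first, or the sigmoid replaced, for the predictions to be able to reach them.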

ahorlbeck commented 6 months ago

Thank you!