Open junxnone opened 3 years ago
class_embed = nn.Linear(hidden_dim, num_classes + 1)
bbox_embed = MLP(hidden_dim, hidden_dim, 4, 3)
junxnone/tech-io#913
DETR
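Prior knowledge
proposals/anchors/window centers: recast the problem as regression and classification
Autoregressive model -> Transformer with parallel decoding
A set of N predictions + no objects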
Arch
+ Fixed Positional Encoding
+ output positional encoding, i.e., object queries
Set Prediction loss
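A minimal sketch of the one-to-one matching behind a set prediction loss: each ground-truth object is assigned to exactly one of the N predictions so that the total cost is minimal, and the leftover predictions are treated as "no object". The cost used here (negative class probability plus an unweighted L1 box distance) is a simplifying assumption, and the brute-force search over permutations stands in for the Hungarian algorithm only because the example is tiny. The names `match_cost` and `bipartite_match` are hypothetical.

```python
from itertools import permutations

def match_cost(pred, target):
    """Cost of assigning one prediction to one ground-truth object."""
    probs, box = pred
    label, gt_box = target
    class_cost = -probs[label]  # higher probability of the true class = lower cost
    l1_cost = sum(abs(a - b) for a, b in zip(box, gt_box))
    return class_cost + l1_cost

def bipartite_match(preds, targets):
    """Return sorted (pred_index, target_index) pairs with minimal total cost."""
    best_cost, best_pairs = float("inf"), []
    # Brute force: try every way of picking one distinct prediction per target.
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(p, t) for t, p in enumerate(perm)]
    return sorted(best_pairs)

# N = 3 predictions: (probabilities over 2 classes + no object, box as cx, cy, w, h)
preds = [
    ([0.10, 0.80, 0.10], (0.5, 0.5, 0.2, 0.2)),
    ([0.70, 0.20, 0.10], (0.2, 0.3, 0.1, 0.1)),
    ([0.05, 0.05, 0.90], (0.9, 0.9, 0.1, 0.1)),
]
# 2 ground-truth objects: (class label, box)
targets = [(0, (0.2, 0.3, 0.1, 0.1)), (1, (0.5, 0.5, 0.2, 0.2))]

matches = bipartite_match(preds, targets)
unmatched = [i for i in range(len(preds)) if i not in {p for p, _ in matches}]
```

Here prediction 0 is matched to target 1 and prediction 1 to target 0; prediction 2 stays unmatched and would be supervised as "no object".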
Positional Encoding
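A rough sketch of a fixed (non-learned) sine positional encoding for a 2-D feature map location: half of the channels encode the row, the other half the column, with sine and cosine interleaved. The function names are hypothetical, and coordinate normalization and scaling are left out, so this only illustrates the idea and is not a reference implementation.

```python
import math

def sine_encoding_1d(position, dim, temperature=10000.0):
    """Encode a scalar position into `dim` interleaved sin/cos values."""
    enc = []
    for i in range(dim):
        # Channels 2k and 2k+1 share the same frequency.
        freq = temperature ** (2 * (i // 2) / dim)
        angle = position / freq
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc

def sine_encoding_2d(row, col, hidden_dim):
    """Concatenate row and column encodings, each using hidden_dim // 2 channels."""
    half = hidden_dim // 2
    return sine_encoding_1d(row, half) + sine_encoding_1d(col, half)

pos = sine_encoding_2d(3, 5, 256)  # one 256-d vector for feature-map cell (3, 5)
```

Because the encoding depends only on the position, it can be computed once per feature-map size and added to the image features.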
Detection Output/Prediction FFNs
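A small sketch of how the outputs of the two prediction heads for one object query could be turned into a detection, assuming the class head emits `num_classes + 1` logits with "no object" as the last entry and the box head emits a normalized `(cx, cy, w, h)` box. The helper names and the default threshold are hypothetical; the 0.9 default simply mirrors the "Prob > 0.9" note in this document.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cxcywh_to_xyxy(box):
    """Convert (center x, center y, width, height) to corner coordinates."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def decode_query(class_logits, box, threshold=0.9):
    """Return (label, score, xyxy_box), or None if the query is not confident enough."""
    probs = softmax(class_logits)
    object_probs = probs[:-1]  # drop the trailing "no object" class
    score = max(object_probs)
    if score <= threshold:
        return None
    return object_probs.index(score), score, cxcywh_to_xyxy(box)

detection = decode_query([0.0, 8.0, 0.0], (0.5, 0.5, 0.2, 0.4))
background = decode_query([0.0, 0.0, 5.0], (0.5, 0.5, 0.2, 0.4))
```

The first query is kept as class 1; the second is dominated by "no object" and is dropped.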
Segmentation
Decoder Outputs ==> Mask-Head ==> Binary mask for BBox
Mask Head: 2 methods, one of them faster
Prob > 0.9
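A toy sketch of turning the mask head's per-pixel logits for one box into a binary mask by applying a sigmoid and a cutoff. The function name `binarize_mask` and the 0.5 cutoff are assumptions for illustration.

```python
import math

def binarize_mask(mask_logits, threshold=0.5):
    """Apply a sigmoid to each logit and keep pixels whose probability exceeds the threshold."""
    return [
        [1 if 1.0 / (1.0 + math.exp(-v)) > threshold else 0 for v in row]
        for row in mask_logits
    ]

mask = binarize_mask([[2.0, -1.0], [0.0, 3.5]])
```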
Evaluation
Faster RCNN series
Reference