Given config file for SSISv2 leads to RuntimeError and discrepancy on evaluation results

Hi authors, great work on the update for SSISv2! I have two questions regarding the repo:

Q1. I followed all instructions on your README and I managed to run SSIS. When I tried running SSISv2 with the given config file, it leads to the following runtime error:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/home/lfiguero/detectron2/detectron2/engine/launch.py", line 126, in _distributed_worker
    main_func(*args)
  File "/home/lfiguero/SSIS/tools/train_net.py", line 261, in main
    res = Trainer.test(cfg, model) # d2 defaults.py
  File "/home/lfiguero/SSIS/tools/train_net.py", line 203, in test
    results_i,association_i = inference_on_dataset(model, data_loader, evaluator)
  File "/home/lfiguero/detectron2/detectron2/evaluation/evaluator.py", line 158, in inference_on_dataset
    outputs = model(inputs)
  File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lfiguero/SSIS/adet/modeling/ssis/condinst.py", line 110, in forward
    pred_instances_w_masks = self._forward_mask_heads_test(proposals, mask_feats)
  File "/home/lfiguero/SSIS/adet/modeling/ssis/condinst.py", line 164, in _forward_mask_heads_test
    pred_instances_w_masks = self.mask_head(
  File "/home/lfiguero/SSIS/adet/modeling/ssis/dynamic_mask_head.py", line 417, in __call__
    mask_scores,asso_mask_scores,  mask_iou, asso_mask_iou,_,_= self.mask_heads_forward_with_coords(
  File "/home/lfiguero/SSIS/adet/modeling/ssis/dynamic_mask_head.py", line 298, in mask_heads_forward_with_coords
    mask_iou = self.maskiou_head((mask_logits.sigmoid()>0.5).float(),mask_feats[im_inds].reshape(n_inst, self.in_channels, H , W))
  File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lfiguero/SSIS/adet/modeling/ssis/dynamic_mask_head.py", line 139, in forward
    x = self.conv1x1_1(x)
  File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 447, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [2, 9, 3, 3], expected input[3, 10, 136, 100] to have 9 channels, but got 10 channels instead

What should I change to run SSISv2?

Q2. When evaluating SSIS with your instructions on the updated SOBA val annotations, I get the following results:

loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
segmentaion:
Running per image evaluation...
Evaluate annotation type *segm*
DONE (t=2.93s).
Accumulating evaluation results...
DONE (t=0.01s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.299
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.620
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.247
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.156
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.372
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.372
bbox:
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=1.66s).
Accumulating evaluation results...
DONE (t=0.01s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.268
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.592
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.221
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.133
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.347
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.347
--------------
Running per image evaluation...
Evaluate annotation type *segm*
DONE (t=0.21s).
Accumulating evaluation results...
DONE (t=0.03s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.523
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.733
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.612
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.121
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.403
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.640
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.210
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.581
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.581
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.124
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.446
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.713
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.17s).
Accumulating evaluation results...
DONE (t=0.03s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.592
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.762
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.638
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.183
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.497
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.700
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.229
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.651
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.651
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.185
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.543
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.768

In your paper, you report 30.2 and 27.1 for SOAPsegm and SOAPbbox respectively for SSIS on the SOBA test set, but I can't replicate the results using your instructions. What may be the discrepancy?

stevewongv / SSIS

Given config file for SSISv2 leads to RuntimeError and discrepancy on evaluation results #5