facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0

Convert detectron2 to other frameworks - different outputs #5308

Closed lwenaeven closed 2 weeks ago

lwenaeven commented 2 weeks ago

I have a custom trained detectron2 model for instance segmentation that I load and use for inference as below:

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(config_path)
cfg.MODEL.WEIGHTS = weights_path
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7
cfg.MODEL.DEVICE = "cpu"

predictor = DefaultPredictor(cfg)
outputs = predictor(im)
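For context on what that call does under the hood: DefaultPredictor wraps the model with pre-processing, roughly as below (a condensed paraphrase of detectron2's engine/defaults.py; any converted model has to replicate these steps itself):

import torch
import detectron2.data.transforms as T

# Paraphrase of DefaultPredictor.__call__: resize the shortest edge,
# convert the HWC uint8 image (BGR, as read by cv2) to a float32 CHW
# tensor, and pass the original height/width for rescaling afterwards.
aug = T.ResizeShortestEdge(
    [cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MIN_SIZE_TEST], cfg.INPUT.MAX_SIZE_TEST
)
height, width = im.shape[:2]
image = aug.get_transform(im).apply_image(im)
image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
inputs = {"image": image, "height": height, "width": width}
predictions = predictor.model([inputs])[0]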

I have followed this tutorial from OpenVINO in order to convert it to an OV model:

from detectron2.checkpoint import DetectionCheckpointer
from detectron2.modeling import build_model
import openvino as ov

model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)
model.eval()

# convert_detectron2_model is the helper defined in the OpenVINO tutorial;
# the converted model is saved to disk and read back below.
ov_model = convert_detectron2_model(model, im)

core = ov.Core()
ov_model = core.read_model("../model/model.xml")
compiled_model = ov.compile_model(ov_model)
results = compiled_model(sample_input[0]["image"])
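A sanity check worth running at this point (a sketch, reusing the names from the snippet above): feed the same sample_input to the eager PyTorch model and compare. If the eager model finds the instance but the OV model does not, the conversion or the tensor fed to it is the culprit.

import torch

# Sketch: compare the eager detectron2 model and the compiled OV model
# on the identical input before digging into the conversion itself.
with torch.no_grad():
    torch_out = model(sample_input)[0]["instances"]
print(torch_out.scores)  # non-empty here but empty from OV points at conversion/input
print(results[0])        # first OV output: the [..100,4] boxes tensor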

However, I don't obtain the expected result from the compiled OV model, i.e. I should normally get one instance in the outputs, and this is the case with the plain detectron2 model, but with the compiled OV model no instance is detected.

Instances(num_instances=0, image_height=3024, image_width=4032, fields=[pred_boxes: Boxes(tensor([], size=(0, 4))), scores: [], pred_classes: [], pred_masks: tensor([], size=(0, 3024, 4032), dtype=torch.bool)])

Here is the information about the OV model:

<Model: 'Model4'
inputs[
<ConstOutput: names[args] shape[?,?,?] type: u8>
]
outputs[
<ConstOutput: names[tensor, 2193, 2195, 2190, 2188, 2172, 2167] shape[..100,4] type: f32>,
<ConstOutput: names[] shape[..100] type: i64>,
<ConstOutput: names[] shape[?,1,28,28] type: f32>,
<ConstOutput: names[] shape[..100] type: f32>,
<ConstOutput: names[image_size] shape[2] type: i64>
]>
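One detail that stands out in this printout: the input is a fully dynamic 3-D tensor of type u8. A quick check (a sketch, assuming sample_input comes from the tutorial's preprocessing helper) to confirm the dtype actually being fed:

# The IR expects a u8 input per the signature above; if this prints
# float32, the tensor fed at inference does not match the converted input.
inp = sample_input[0]["image"]
print(inp.dtype, inp.shape)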

UPDATE: As I really need to convert the model to another framework in order to use model-serving tools, I also tried converting it to TorchScript. However, here again I get an empty output.

This time I followed the tutorial in this issue:

import torch
from detectron2.export import TracingAdapter

def inference_func(model, image):
    inputs = [{"image": image}]
    return model.inference(inputs, do_postprocess=False)[0]

wrapper = TracingAdapter(model, sample_input[0]["image"], inference_func)
wrapper.eval()
traced_script_module = torch.jit.trace(wrapper, (sample_input[0]["image"],))
traced_script_module.save("torchscript.pt")
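To separate a tracing problem from an input problem, a check like the following may help (a sketch reusing the names above): run the eager wrapper and the traced module on the same tensor; identical empty outputs point at the input rather than at torch.jit.trace.

import torch

# Sketch: TracingAdapter flattens the Instances into a tuple of tensors
# (boxes, classes, masks, scores, image size), so the two outputs can be
# compared element-wise.
with torch.no_grad():
    eager_out = wrapper(sample_input[0]["image"])
    traced_out = traced_script_module(sample_input[0]["image"])
for e, t in zip(eager_out, traced_out):
    print(e.shape, t.shape)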

The conversion then completes after a LOT of warnings; here is a snippet of the resulting model:

RecursiveScriptModule(
  original_name=TracingAdapter
  (model): RecursiveScriptModule(
    original_name=GeneralizedRCNN
    (backbone): RecursiveScriptModule(
      original_name=FPN
      (fpn_lateral2): RecursiveScriptModule(original_name=Conv2d)
      (fpn_output2): RecursiveScriptModule(original_name=Conv2d)
      (fpn_lateral3): RecursiveScriptModule(original_name=Conv2d)
      (fpn_output3): RecursiveScriptModule(original_name=Conv2d)
      (fpn_lateral4): RecursiveScriptModule(original_name=Conv2d)
      (fpn_output4): RecursiveScriptModule(original_name=Conv2d)
      (fpn_lateral5): RecursiveScriptModule(original_name=Conv2d)
      (fpn_output5): RecursiveScriptModule(original_name=Conv2d)
      (top_block): RecursiveScriptModule(original_name=LastLevelMaxPool)
      (bottom_up): RecursiveScriptModule(
        original_name=ResNet
        (stem): RecursiveScriptModule(
          original_name=BasicStem
          (conv1): RecursiveScriptModule(
            original_name=Conv2d
            (norm): RecursiveScriptModule(original_name=FrozenBatchNorm2d)
          )
...

I then run the inference:

outputs = torchscript_model(sample_input[0]["image"])

And the resulting outputs are empty:

(tensor([], size=(0, 4), grad_fn=<ViewBackward0>),
 tensor([], dtype=torch.int64),
 tensor([], size=(0, 1, 28, 28), grad_fn=<SplitWithSizesBackward0>),
 tensor([], grad_fn=<IndexBackward0>),
 tensor([3024, 4032]))
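Given that the traced output is empty too, it may be worth confirming what the eager model does with this exact tensor before tracing at all (a sketch; GeneralizedRCNN normalizes with pixel_mean/pixel_std internally but still expects a float CHW tensor in cfg.INPUT.FORMAT channel order, so e.g. an RGB tensor fed to a BGR-trained model can silently yield zero detections):

import torch

img = sample_input[0]["image"]
print(img.dtype, img.shape, cfg.INPUT.FORMAT)
# If this eager call is already empty, the problem is the input itself
# (dtype, channel order, or scaling), not the tracing step.
with torch.no_grad():
    print(model.inference([{"image": img}], do_postprocess=False)[0])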

I have also tried exporting the model directly using the detectron2 deployment script documented here, but again I have the same issue with the resulting model's inference.

python ../detectron2-main/tools/deploy/export_model.py  --config-file ../detectron2-main/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --output torchscript_model --format torchscript --sample-image ../data/test.JPG --export-method tracing MODEL.DEVICE cpu MODEL.WEIGHTS ../model/custom_trained_model.pth
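For completeness, exercising the module written by export_model.py looks roughly like this (a sketch; with --export-method tracing the script saves a model.ts that takes a single CHW float tensor, mirroring the TracingAdapter usage above):

import torch

# Load the exported module and run it on the same sample tensor.
ts_model = torch.jit.load("torchscript_model/model.ts")
with torch.no_grad():
    out = ts_model(sample_input[0]["image"])
print([o.shape for o in out])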

I'll take any ideas!

github-actions[bot] commented 2 weeks ago

You've chosen to report an unexpected problem or bug. Unless you already know the root cause of it, please include details about it by filling the issue template. The following information is missing: "Instructions To Reproduce the Issue and Full Logs"; "Your Environment";

Programmer-RD-AI commented 2 weeks ago

Hi, the most probable cause of this is a mismatch in the conversion and inference processes: make sure the input pre-processing and output post-processing match the original model's expectations. This is something you could look into :)
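For example, on the post-processing side: with do_postprocess=False the raw results live in the resized image's coordinate space, and something like detectron2's detector_postprocess is needed to map them back to the original resolution (a sketch, with raw_instances/height/width standing in for your actual values):

from detectron2.modeling.postprocessing import detector_postprocess

# Rescale raw Instances from the network's input resolution back to the
# original image size, as GeneralizedRCNN does when do_postprocess=True.
instances = detector_postprocess(raw_instances, height, width)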

Hope this helps

Best regards, Ranuga Disansa