facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0

How to save Detectron model as Vanilla Pytorch model? #4589

Open deshwalmahesh opened 1 year ago

deshwalmahesh commented 1 year ago

I have a Faster-RCNN model trained with Detectron2. Model weights are saved as model.pth.

I have my config.yml file and there are a couple of ways to load this model:

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer

cfg = get_cfg()
config_name = "config.yml"
cfg.merge_from_file(config_name)

cfg.MODEL.WEIGHTS = './model.pth'
model = DefaultPredictor(cfg)

OR

model_ = build_model(cfg)
DetectionCheckpointer(model_).load("./model.pth") # loads the weights into model_ in place
model_.eval() # build_model returns the model in training mode
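Note the difference between the two: DefaultPredictor wraps the model with its own preprocessing (resizing, input format handling), so it takes a single BGR numpy image directly, while build_model gives you the bare nn.Module with Detectron2's list-of-dicts interface. A minimal sketch of the predictor route, assuming cv2 is installed and page4.jpg exists:

import cv2

predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("page4.jpg")) # BGR (H, W, C) ndarray in, dict out
print(outputs["instances"].pred_boxes)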

You can also get predictions from this model directly, as shown in the official documentation:

import numpy as np
import torch
from PIL import Image

image = np.array(Image.open('page4.jpg'))[:, :, ::-1] # RGB to BGR format
tensor_image = torch.from_numpy(image.copy()).permute(2, 0, 1) # (C, H, W)

with torch.no_grad():
    output = model_([{"image": tensor_image}]) # the model built above
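If I read GeneralizedRCNN's input format correctly, each dict can also carry the original "height" and "width", in which case the predictions are rescaled back to the original resolution:

inputs = [{"image": tensor_image, "height": image.shape[0], "width": image.shape[1]}]
with torch.no_grad():
    output = model_(inputs)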

Running the following commands:

print(type(model))
print(type(model.model))
print(type(model.model.backbone))

Gives you:

<class 'detectron2.engine.defaults.DefaultPredictor'>
<class 'detectron2.modeling.meta_arch.rcnn.GeneralizedRCNN'>
<class 'detectron2.modeling.backbone.fpn.FPN'>

Problem: I want to use GradCAM for model explainability, and it expects plain PyTorch models, as shown in this tutorial.

How can I turn a Detectron2 model into a vanilla PyTorch model?

I have tried:

torch.save(model.model.state_dict(), "torch_weights.pth")
torch.save(model.model, "torch_model.pth")

from torchvision.models.detection import fasterrcnn_resnet50_fpn

dummy = fasterrcnn_resnet50_fpn(pretrained=False, num_classes=1)
# dummy.load_state_dict(torch.load('./model.pth', map_location = 'cpu')) 
dummy.load_state_dict(torch.load('./torch_weights.pth', map_location = 'cpu')) 

but obviously I'm getting errors due to the different layer names, shapes, etc.
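The mismatch is easy to see by printing a few keys from each state dict. A small diagnostic sketch (the key names in the comment are illustrative):

import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

d2_state = torch.load('./torch_weights.pth', map_location='cpu')
tv_state = fasterrcnn_resnet50_fpn(pretrained=False, num_classes=1).state_dict()

# Detectron2 and torchvision name the same layers differently
# (e.g. 'backbone.bottom_up.stem.conv1.weight' vs 'backbone.body.conv1.weight'),
# so load_state_dict cannot match them without a manual key mapping.
print(list(d2_state)[:5])
print(list(tv_state)[:5])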

mehi64 commented 1 year ago

Hi Mahesh. I have the same problem. Did you find any solution for that?

deshwalmahesh commented 1 year ago

@mehi64 Yes, actually the model we save is already a PyTorch model; it just expects Detectron2's input format, and my wrapper below handles one image at a time. See the forward method:


import numpy as np
import torch

from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg

BUILD CONFIG FILE

def build_cfg(model_weights:str, thresh:float = 0.6):

    cfg = get_cfg()

    config_name = "config.yml" # using the pre-trained Layout Parser config
    cfg.merge_from_file(config_name)

    cfg.MODEL.DEVICE = "cpu" if not torch.cuda.is_available() else "cuda"

    cfg.DATALOADER.NUM_WORKERS = 2
    cfg.TEST.EVAL_PERIOD = 20 # Evaluate every N iterations

    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128 # default is 256
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1 # in the config file, this comes before the weights

    cfg.MODEL.WEIGHTS = model_weights # Layout Parser pre-trained weights

    cfg.SOLVER.IMS_PER_BATCH = 4 # Batch size
    cfg.SOLVER.BASE_LR = 0.0025
    cfg.SOLVER.WARMUP_ITERS = 50
    cfg.SOLVER.MAX_ITER = 1000 # adjust up if val mAP is still rising, adjust down if overfit
    cfg.SOLVER.STEPS = (300, 800) # must be less than MAX_ITER
    cfg.SOLVER.GAMMA = 0.05
    cfg.SOLVER.CHECKPOINT_PERIOD = 20 # Save a checkpoint every N iterations

    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = thresh
    return cfg

MODEL CLASS

class TorchModel(torch.nn.Module):
    def __init__(self, model_weights:str, thresh:float = 0.6, nms:float = 0.3) -> None:
        super().__init__()
        cfg = build_cfg(model_weights, thresh) # build the config (function defined above)
        self.model = build_model(cfg) # Build Model
        _ = DetectionCheckpointer(self.model).load(cfg.MODEL.WEIGHTS)  # Load weights
        self.model = self.model.eval() # In evaluation mode
        self.nms = nms

    def forward(self, INPUT):
        if isinstance(INPUT, (np.ndarray, torch.Tensor)): # wrap a single image in the list-of-dicts format the model expects
            INPUT = [{"image": INPUT}]

        with torch.no_grad():
            outputs = self.model(INPUT)[0]['instances']

        boxes, labels, scores = outputs.pred_boxes.tensor, outputs.pred_classes, outputs.scores.detach()

        nms_indices = nms(boxes.cpu().numpy(), scores.cpu().numpy(), self.nms) # move to CPU before converting to numpy

        return boxes[nms_indices], labels[nms_indices], scores[nms_indices]

def nms(dets, scores, thresh): # plain NumPy NMS, added because the original single-class NMS had an issue
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= thresh)[0]
        order = order[inds + 1]

    return keep

This worked for me.
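As an aside, torchvision also ships a built-in NMS that should behave almost the same (it omits the legacy +1 in the area computation), if you'd rather not maintain the NumPy version:

import torch
from torchvision.ops import nms as tv_nms

boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]]) # (N, 4) in xyxy format
scores = torch.tensor([0.9, 0.8])
keep = tv_nms(boxes, scores, iou_threshold=0.3) # indices of kept boxes, highest score first
print(keep) # tensor([0]) here, since the two boxes overlap heavily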

You can do the inference as:

import cv2

WRAPPER = TorchModel("../data/column_model.pth", 0.6, nms = 0.3)

image = cv2.imread(IMAGE_PATH) # BGR format
float_image = np.float32(image / 255.) # normalized BGR image, values in [0, 1]

tensor_image = torch.from_numpy(image.copy()).permute(2, 0, 1) # (C, H, W)
boxes, labels, scores = WRAPPER(tensor_image)

Explainability using GRADCAM

# EigenCAM, FasterRCNNBoxScoreTarget, show_cam_on_image, fasterrcnn_reshape_transform
# and draw_boxes come from the pytorch-grad-cam package and its FasterRCNN tutorial
targets = [FasterRCNNBoxScoreTarget(labels=labels, bounding_boxes=boxes)] # not actually executed by EigenCAM; only needed at init
target_layers = [WRAPPER.model.backbone]

cam = EigenCAM(WRAPPER, target_layers, use_cuda=torch.cuda.is_available(), reshape_transform=fasterrcnn_reshape_transform)

grayscale_cam = cam(tensor_image, targets=targets, eigen_smooth=True)

grayscale_cam = grayscale_cam[0, :] # Take the first image in the batch

cam_image = show_cam_on_image(float_image, grayscale_cam, use_rgb=False)

RGB_image_with_bounding_boxes = draw_boxes(cam_image, boxes, labels, classes, COLORS) # classes and COLORS as in the tutorial

Image.fromarray(RGB_image_with_bounding_boxes) # show the image in RGB format

judahkshitij commented 1 year ago

@deshwalmahesh @mehi64 I am in kind of the same boat. I have a couple of trained image segmentation models (MaskDINO and Mask2Former) that I trained using the Detectron2 framework, and I have the final trained model as a .pth file (model_final.pth). Now I want to run this model in a Docker container that has PyTorch installed but not Detectron2. Is there a way to take these models and run them in pure PyTorch, without requiring Detectron2? Any help on this is highly appreciated. Thanks.

deshwalmahesh commented 1 year ago

@judahkshitij You can load the Detectron2 .pth model just like any other model and run inference. Wrap it in an nn.Module like I did, which gives you access to the model.forward or model.predict method. Did you try the code I showed above?

rameshjes commented 11 months ago

Thanks, @deshwalmahesh, for sharing the code. I have a question: the class (TorchModel) you've created also imports functions from detectron2. Does this mean that we will also need detectron2 during inference?

RiccardoMaistri commented 10 months ago

Hello everyone, one year later.

I have the same question as @rameshjes and @deshwalmahesh: is it possible to load the model without the Detectron2 library? My objective is to train with Detectron2 and run inference on an edge device without it. Is that feasible?

bjpietrzak commented 2 weeks ago

Sup everyone.

I have also looked for a good solution to this problem. Your best bet is to convert the Detectron2 model with the script they created exactly for this case, detectron2/tools/deploy/export_model.py. Just convert the Detectron2 model to TorchScript.
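The invocation looks roughly like this (a sketch; the config and weights paths are placeholders, and the script also supports tracing as the export method):

python tools/deploy/export_model.py \
    --config-file config.yml \
    --output ./exported \
    --export-method scripting \
    --format torchscript \
    MODEL.WEIGHTS model.pth MODEL.DEVICE cpu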

After the conversion, the model outputs results in a different format than, for example, GeneralizedRCNN, so you will have to dig around in the source code and GitHub issues to format the output the same way as before the conversion.
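Once exported, loading needs nothing but plain PyTorch. Since the output format differs, run one forward pass and inspect what comes back before wiring it into your pipeline (a sketch, assuming the default model.ts output name; the expected input also depends on the export method, e.g. a single (C, H, W) tensor for a tracing export):

import torch

# no detectron2 import needed from here on
ts_model = torch.jit.load("./exported/model.ts")
ts_model.eval()

image = torch.rand(3, 800, 800) # verify the expected input format against your export
with torch.no_grad():
    outputs = ts_model(image)
print(outputs)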

Also, the resulting models can have some underlying issues; mine, for example, takes up twice as much VRAM as before the conversion.

Converting a Detectron2 model to ONNX is even more difficult.