pytorch / android-demo-app

PyTorch Android examples of usage in applications

I am making the .pt file for Mask R-CNN and the masks are always (1, 28, 28) #186

Open AhmedHessuin opened 2 years ago

AhmedHessuin commented 2 years ago

Using d2go, I modified the Wrapper to fit my training data. Everything is good (the boxes, the labels, the scores) except the masks: I get masks of size (1, 28, 28) no matter how I change the input size. How can I make the mask refer to the original image size? Also, the results of the wrapped model on the mobile application are not like on the PC: the result looks almost like noise on the PC but very good on the mobile application. What can be the reason?

ashutoshsoni891 commented 2 years ago

How did you export the model? Can you please share your Wrapper function, @AhmedHessuin?

tsubauaaa commented 2 years ago

I am in a similar situation. I also want to know how to resize the 28 x 28 masks to fit the original image size. Here is my Wrapper function.

import os
from typing import List, Dict

import torch


class Wrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model
        # Map the model's contiguous class ids back to COCO category ids.
        coco_idx_list = [1]
        self.coco_idx = torch.tensor(coco_idx_list)

    def forward(self, inputs: List[torch.Tensor]):
        # Scale pixel values from [0, 1] to [0, 255] and add a batch dim.
        x = inputs[0].unsqueeze(0) * 255
        # Resize so the shorter side is 320 pixels.
        scale = 320.0 / min(x.shape[-2], x.shape[-1])
        x = torch.nn.functional.interpolate(
            x, scale_factor=scale, mode="bilinear",
            align_corners=True, recompute_scale_factor=True,
        )
        out = self.model(x[0])
        res: Dict[str, torch.Tensor] = {}
        # Undo the resize so the boxes are in original-image coordinates.
        res["boxes"] = out[0] / scale
        res["labels"] = torch.index_select(self.coco_idx, 0, out[1])
        res["masks"] = out[2]
        res["scores"] = out[3]
        print(res["masks"])  # debug: each mask comes out as (1, 28, 28)
        return inputs, [res]


orig_model = torch.jit.load(os.path.join(predictor_path, "model.jit"))
wrapped_model = Wrapper(orig_model)
scripted_model = torch.jit.script(wrapped_model)
scripted_model.save("d2go.pt")
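
For reference, a quick sanity check of the scripted model's output shapes, as a sketch; it assumes the export above succeeded and feeds a random 320 x 320 RGB tensor, so the model may return zero detections, but the trailing (1, 28, 28) mask shape is still visible:

import torch

# Load the scripted wrapper and run a dummy image to inspect the outputs.
model = torch.jit.load("d2go.pt")
model.eval()
dummy = [torch.rand(3, 320, 320)]  # forward() expects a list of CHW tensors
with torch.no_grad():
    _, results = model(dummy)
res = results[0]
print(res["boxes"].shape, res["masks"].shape)  # masks: (N, 1, 28, 28)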
AhmedHessuin commented 2 years ago

@tsubauaaa I fixed this problem. The 28 x 28 x number_of_detected_labels masks refer to the bounding box itself, so you can post-process each mask to map it to its bounding box and then map it onto the image:

1. Resize the mask to the bounding-box size. Example: bbox width = 200, height = 300, so the mask is resized to 200 x 300.
2. Put the mask on the image at the coordinates of the bbox. Example: bbox x1 = 200, y1 = 200, x2 = 400, y2 = 500, so the mask is placed at those coordinates on the original image.
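
A minimal NumPy/OpenCV sketch of those two steps (my own illustration, not part of d2go), assuming boxes and masks have been pulled out of res and converted with .numpy(); the paste_masks name and the 0.5 threshold are arbitrary choices:

import numpy as np
import cv2

def paste_masks(image, boxes, masks, threshold=0.5):
    """Resize each (1, 28, 28) mask to its box and paint it onto the image."""
    h, w = image.shape[:2]
    full_mask = np.zeros((h, w), dtype=np.uint8)
    for box, mask in zip(boxes, masks):
        x1, y1, x2, y2 = [int(round(float(v))) for v in box]
        # Clip the box to the image so the paste stays in bounds.
        x1, y1 = max(x1, 0), max(y1, 0)
        x2, y2 = min(x2, w), min(y2, h)
        if x2 <= x1 or y2 <= y1:
            continue
        # Step 1: resize the 28x28 mask to the bounding-box size.
        m = cv2.resize(np.squeeze(mask), (x2 - x1, y2 - y1))
        # Step 2: binarize and place the mask at the box coordinates.
        full_mask[y1:y2, x1:x2] |= (m > threshold).astype(np.uint8)
    return full_mask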

AhmedHessuin commented 2 years ago

@ashutoshsoni891 Same method as in my reply to @tsubauaaa above.

tsubauaaa commented 2 years ago

@AhmedHessuin Thanks to that, I understand now!