Finetuning an object detection model requires slightly different training code, and the classification code that you used is not adapted for it.
Check the tutorial at https://colab.research.google.com/github/pytorch/vision/blob/temp-tutorial/tutorials/torchvision_finetuning_instance_segmentation.ipynb to learn how to finetune an instance segmentation model. An object detection model is very similar.
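For reference, the detection part of that tutorial essentially comes down to replacing the pre-trained box-predictor head with one sized for your dataset. A minimal sketch (the `num_classes` value here is just an example):

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# load a Faster R-CNN model pre-trained on COCO
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# replace the box predictor with one for our number of classes
# (num_classes includes the background class)
num_classes = 2  # example: 1 custom class + background
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
```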
@fmassa, the example is really helpful. I have ignored the masks and got the bounding boxes for the 'PennFudanDataset' dataset given in the example. This example detects a single class, 'person'. I want to extend this to multiple classes (custom dataset). I would appreciate pointers in that direction.
Note that I have accomplished this in TensorFlow using TFRecords, a model config file and a .pbtxt, albeit in a harder way. I am new to PyTorch and struggling to replicate the same. My impression is that PyTorch is much simpler.
In the torchvision example for multiclass classification, the dataset is organised into 'train' and 'val' splits. I expected a similar data organisation for object detection too:
data\train\class1\img1.jpg, img2.jpg, ...
data\train\class2\img1.jpg, img2.jpg, ...
data\val\class1\img1.jpg, img2.jpg, ...
data\val\class2\img1.jpg, img2.jpg, ...
```python
import os
import numpy as np
import torch
import torch.utils.data
from PIL import Image


class PennFudanDataset(torch.utils.data.Dataset):
    def __init__(self, root, transforms=None):
        self.root = root
        self.transforms = transforms
        # load all image files, sorting them to
        # ensure that they are aligned
        self.imgs = list(sorted(os.listdir(os.path.join(root, "PNGImages"))))
        self.masks = list(sorted(os.listdir(os.path.join(root, "PedMasks"))))

    def __getitem__(self, idx):
        # load images and masks
        img_path = os.path.join(self.root, "PNGImages", self.imgs[idx])
        mask_path = os.path.join(self.root, "PedMasks", self.masks[idx])
        img = Image.open(img_path).convert("RGB")
        # note that we haven't converted the mask to RGB,
        # because each color corresponds to a different instance
        # with 0 being background
        mask = Image.open(mask_path)
        mask = np.array(mask)
        # instances are encoded as different colors
        obj_ids = np.unique(mask)
        # first id is the background, so remove it
        obj_ids = obj_ids[1:]
        # split the color-encoded mask into a set
        # of binary masks
        masks = mask == obj_ids[:, None, None]
        # get bounding box coordinates for each mask
        num_objs = len(obj_ids)
        boxes = []
        for i in range(num_objs):
            pos = np.where(masks[i])
            xmin = np.min(pos[1])
            xmax = np.max(pos[1])
            ymin = np.min(pos[0])
            ymax = np.max(pos[0])
            boxes.append([xmin, ymin, xmax, ymax])
        boxes = torch.as_tensor(boxes, dtype=torch.float32)
        # there is only one class
        labels = torch.ones((num_objs,), dtype=torch.int64)
        masks = torch.as_tensor(masks, dtype=torch.uint8)
        image_id = torch.tensor([idx])
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
        # suppose all instances are not crowd
        iscrowd = torch.zeros((num_objs,), dtype=torch.int64)
        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["masks"] = masks
        target["image_id"] = image_id
        target["area"] = area
        target["iscrowd"] = iscrowd
        if self.transforms is not None:
            img, target = self.transforms(img, target)
        return img, target

    def __len__(self):
        return len(self.imgs)
```
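As a side note on how a dataset like this is usually consumed: detection batches are kept as tuples rather than stacked tensors, so the DataLoader needs a custom collate function. A minimal sketch, assuming the `PennFudanDataset` above and a dataset root of "PennFudanPed" (the transform here is a stand-in that only converts the PIL image to a tensor):

```python
import torch
import torchvision.transforms.functional as F


def to_tensor_transform(img, target):
    # minimal transform: convert the PIL image to a tensor, leave target as-is
    return F.to_tensor(img), target


def collate_fn(batch):
    # keep images and targets as tuples instead of stacking,
    # since detection images can have different sizes
    return tuple(zip(*batch))


dataset = PennFudanDataset("PennFudanPed", transforms=to_tensor_transform)
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=2, shuffle=True, collate_fn=collate_fn
)

images, targets = next(iter(data_loader))  # tuples of length batch_size
```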
Extending it to multiple classes should be a matter of changing the labels
in the dataset to represent the multiple classes you want (as numbers from 1 to the number of classes), and using a larger num_classes
when building the model.
This should be straightforward, but without further information it is hard to understand where you got blocked.
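A minimal sketch of both changes, with toy class ids (how you derive the per-object ids depends on your annotation format):

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# in the dataset's __getitem__, replace torch.ones(...) with the actual
# per-object class ids, one per box; ids run from 1..N (0 is background)
labels = torch.as_tensor([1, 3, 2], dtype=torch.int64)  # toy values

# when building the model, size the head for all classes plus background
num_classes = 4  # example: 3 custom classes + 1 background
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
```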
🐛 Bug
Retraining the 'fasterrcnn_resnet50_fpn' model on a custom dataset is failing.
To Reproduce
Steps to reproduce the behavior:
Replace `model_ft = models.resnet50(pretrained=True)` with `model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)`
Expected behavior
Object detection model retrained on a custom dataset, similar to https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html
Environment
Google Colab
```python
import torch
import torchvision
from torch import nn

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

# our dataset has two classes only
num_classes = 2
# number of input features of the existing box-classification head
in_features = model.roi_heads.box_predictor.cls_score.in_features

# move model to the right device
model.to(device)

# construct an optimizer
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)

# and a learning rate scheduler which decreases the learning rate by
# 10x every 3 epochs
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

loss_func = nn.NLLLoss()
```
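For reference on how this model is typically driven during training (a minimal sketch, assuming a `data_loader` that yields `(images, targets)` tuples as sketched earlier): torchvision detection models compute their losses internally when called with targets in training mode and return them as a dict, so a separate criterion such as `NLLLoss` does not enter this loop.

```python
# one simplified training step for a torchvision detection model
# (assumes model, optimizer, device and data_loader from above)
model.train()
for images, targets in data_loader:
    images = [img.to(device) for img in images]
    targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

    # in training mode the model returns a dict of losses
    # (classification, box regression, RPN losses)
    loss_dict = model(images, targets)
    losses = sum(loss for loss in loss_dict.values())

    optimizer.zero_grad()
    losses.backward()
    optimizer.step()
```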
```python
def train_and_validate(model, loss_criterion, optimizer, epochs=25):
    '''
    Function to train and validate
    Parameters
    :param model: Model to train and validate
    :param loss_criterion: Loss Criterion to minimize
    :param optimizer: Optimizer for computing gradients
    :param epochs: Number of epochs (default=25)
```