jwyang / faster-rcnn.pytorch

A faster pytorch implementation of faster r-cnn
MIT License

use faster_rcnn as module in own model #129

Closed ahmed-shariff closed 6 years ago

ahmed-shariff commented 6 years ago

I am trying to use the modules in faster-rcnn.pytorch as part of my own pipeline to train on a custom dataset. So far I have managed to train the model, and I am currently working on evaluation and demonstrations. I had to work through the code to figure out how some parts work, and I still have a few components to make sense of. Could you add some documentation on the inputs and outputs of the modules, to help anyone trying to use them as part of their own model?

jwyang commented 6 years ago

@ahmed-shariff thanks for your suggestions! Beyond the documentation, if you have any specific questions about the code, just post them here; I'd be glad to help you out.

ahmed-shariff commented 6 years ago

I don't have anything specific for now, will keep you posted. I am trying to implement mAP to use on my dataset, will let you know how it goes.

jwyang commented 6 years ago

I will close this issue for now. Post if you have any specific issue.

ahmed-shariff commented 6 years ago

I implemented mAP based on voc_eval.py so that, when I am testing on a custom dataset, there is no need to convert the dataset to VOC format. It works on my dataset, but I still cannot compare it with your implementation because I ran into issue #142. I'll post it here; perhaps someone can give some input on it.

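# Imports needed at module level (paths as used in the repo's test_net.py
# at the time; adjust them if your checkout differs):
#   import torch
#   from tqdm import tqdm
#   from model.rpn.bbox_transform import bbox_transform_inv, clip_boxes
#   from model.nms.nms_wrapper import nms
self.model.eval()
# Foreground class indexes to evaluate (index 0 is the background class)
class_idexes = [1]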
results = [None]+[[] for idx in class_idexes]

for idx, i in tqdm(enumerate(input_fn)):
    if self.use_cuda:
        i = [i_.cuda() for i_ in i]

    input_var = torch.autograd.Variable(i[0])
    nmbox_var = torch.autograd.Variable(i[2])
    iinfo_var = torch.autograd.Variable(i[3])
    gtbox_var = torch.autograd.Variable(i[4])

    rois, cls_prob, bbox_pred, \
      rpn_loss_cls, rpn_loss_bbox, \
      RCNN_loss_cls, RCNN_loss_bbox, \
      rois_label = self.model(input_var, iinfo_var, gtbox_var, nmbox_var)

    scores = cls_prob.data
    boxes = rois.data[:, :, 1:5]
    pred_boxes = bbox_transform_inv(boxes, bbox_pred.data,1)
    pred_boxes = clip_boxes(pred_boxes, iinfo_var.data, 1)

    scores = scores.squeeze()
    scores_max, scores_max_index = scores.max(1)
    pred_boxes = pred_boxes.squeeze()

    class_threshold = 0.05

    for class_idx in class_idexes:
        candidate_indexes = torch.nonzero(scores_max_index.eq(class_idx)).view(-1)

        if candidate_indexes.numel() > 0:
            candidate_scores = scores[:,class_idx][candidate_indexes]
            _, candidate_scores_order = torch.sort(candidate_scores, 0, True)

            candidate_bboxes = pred_boxes[candidate_indexes][:,
                                                             class_idx*4 :
                                                             (class_idx + 1)*4]
            candidate_entries = torch.cat((candidate_scores.unsqueeze(1),
                                           candidate_bboxes), 1)
            candidate_entries_sorted = candidate_entries[candidate_scores_order]
            post_nms_candidate_indexes = nms(candidate_entries_sorted, 0.3)
            final_candidate_entries = candidate_entries_sorted[
                post_nms_candidate_indexes.view(-1).long()]
            final_candidate_bboxes = final_candidate_entries[:,1:]
            iou_s = []
            gtbox_class = []
            for gtbox in gtbox_var[0]:
                # Skip ground-truth boxes that do not belong to the
                # class under consideration
                if not gtbox[4].eq(class_idx):
                    continue

                gtbox_class.append(1)
                intersection_x1 = torch.max(final_candidate_bboxes[:, 0], gtbox[0].data)
                intersection_y1 = torch.max(final_candidate_bboxes[:, 1], gtbox[1].data)
                intersection_x2 = torch.min(final_candidate_bboxes[:, 2], gtbox[2].data)
                intersection_y2 = torch.min(final_candidate_bboxes[:, 3], gtbox[3].data)
                intersection_width = intersection_x2 - intersection_x1
                intersection_height = intersection_y2 - intersection_y1
                intersection_valid_index = torch.nonzero(torch.mul(
                    intersection_width >= 0,
                    intersection_height >= 0)).view(-1)

                if intersection_valid_index.numel() > 0:
                    intersection = intersection_height[intersection_valid_index] * \
                                   intersection_width[intersection_valid_index]
                    pred_boxes_area = (
                        final_candidate_bboxes[:,2][intersection_valid_index] -
                        final_candidate_bboxes[:,0][intersection_valid_index]) * \
                        (final_candidate_bboxes[:,3][intersection_valid_index] -
                         final_candidate_bboxes[:,1][intersection_valid_index])
                    gtbox_area = (gtbox[2] - gtbox[0]) * (gtbox[3] - gtbox[1])
                    union = pred_boxes_area + gtbox_area.data - intersection
                    iou = intersection / union

                    # All final candidate bboxes must be considered, so scatter the IoU values
                    # of the bboxes that overlap this gtbox; every other entry stays 0
                    out = torch.zeros_like(final_candidate_entries[:,0])
                    iou_s.append(out.scatter_(0,
                                              intersection_valid_index,
                                              iou).unsqueeze(1))
                else:
                    iou_s.append(
                        torch.zeros_like(
                            final_candidate_entries[:,0]).unsqueeze(1))

            iou_s = torch.cat(iou_s, 1)
            iou_s_max_val, iou_s_max_idx = iou_s.max(1)
            tp_out = torch.zeros_like(final_candidate_entries[:,0])

            # For each gtbox only one predicted bbox can be counted as a true positive;
            # the others are counted as false positives
            for gt_idx in range(len(gtbox_class)):
                iou_s_gtbox_idx = torch.nonzero(iou_s_max_idx.eq(gt_idx)).view(-1)
                iou_s_gtbox_threshed_idx = torch.nonzero(
                    iou_s_max_val[iou_s_gtbox_idx].ge(0.5)).view(-1)
                if torch.numel(iou_s_gtbox_threshed_idx) > 0:
                    tp_out[iou_s_gtbox_idx[iou_s_gtbox_threshed_idx[0]]] = 1

            #results gets the tp and score of each bbox
            results[class_idx].append(
                torch.cat((final_candidate_entries[:,0].unsqueeze(1),
                           tp_out.unsqueeze(1)),
                          1).cpu())

ap_all = {}
for class_idx in class_idexes:
    try:
        # As each class in the results is going to have a list of results,
        # concat them to a single list
        results_concatenated = torch.cat(results[class_idx], 0)

        tp = results_concatenated[:, 1]
        fp = torch.abs(torch.neg(tp -1))
        _, sorted_order = torch.sort(results_concatenated[:,0], 0, True)
        tp = torch.cumsum(tp[sorted_order], 0)
        fp = torch.cumsum(fp[sorted_order], 0)

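        # Calculate recall and precision and append the sentinel values.
        # NOTE: recall is divided by the number of detections here, whereas
        # voc_eval.py divides by the number of ground-truth boxes of the class.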
        recall = torch.cat((torch.zeros(1),
                            tp/tp.size(0),
                            torch.ones(1)),
                           0)
        precision = torch.cat((torch.zeros(1),
                               tp/torch.max(tp+fp, torch.Tensor([0.00001])),
                               torch.zeros(1)),
                              0)

        ap_index = torch.cumsum(torch.ones(recall.size(0)-2),0).long()
        #print((recall[ap_index] - recall[ap_index-1])*precision[ap_index])
        ap = torch.sum((recall[ap_index] - recall[ap_index-1])*precision[ap_index])
    except Exception:
        # e.g. torch.cat fails when no detections were collected for this class
        ap = 0
    ap_all[class_idx] = ap
    #print(recall.numpy(), precision.numpy(), ap)
    #out_string_step = "Step: {}".format(idx)

    self.log("AP for class {}: {}".format(class_idx, ap))

# Return only after every class has been evaluated
return ap_all

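
For anyone who wants to sanity-check the numbers by hand, here is a minimal, framework-free sketch of the same idea (IoU, one true positive per ground-truth box, then area under the precision-recall curve). It is not the code posted above and not voc_eval.py: `box_iou`, `match_detections`, and `average_precision` are hypothetical helper names, boxes are assumed to be `(x1, y1, x2, y2)` tuples without the +1 pixel convention, the IoU test uses `>=` like the snippet above, and recall is divided by the number of ground-truth boxes.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def match_detections(detections, gt_boxes, iou_thresh=0.5):
    """Greedy matching for one image and one class.

    detections: list of (score, box). Returns a list of (score, is_tp).
    Each ground-truth box can be claimed by at most one detection.
    """
    claimed = [False] * len(gt_boxes)
    flags = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_gt = 0.0, -1
        for gt_idx, gt in enumerate(gt_boxes):
            overlap = box_iou(box, gt)
            if overlap > best_iou:
                best_iou, best_gt = overlap, gt_idx
        is_tp = best_gt >= 0 and best_iou >= iou_thresh and not claimed[best_gt]
        if is_tp:
            claimed[best_gt] = True
        flags.append((score, is_tp))
    return flags


def average_precision(scored_flags, num_gt):
    """All-points area under the precision-recall curve for one class."""
    if num_gt == 0:
        return 0.0
    tp = fp = 0
    recall, precision = [0.0], [0.0]  # leading sentinel values
    for _, is_tp in sorted(scored_flags, key=lambda d: d[0], reverse=True):
        tp += is_tp
        fp += not is_tp
        recall.append(tp / num_gt)
        precision.append(tp / (tp + fp))
    recall.append(1.0)  # trailing sentinel values
    precision.append(0.0)
    # Make precision non-increasing when read from left to right
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    return sum((recall[i] - recall[i - 1]) * precision[i]
               for i in range(1, len(recall)))


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (1, 1, 10, 10)),
        (0.7, (20, 20, 30, 30)), (0.6, (50, 50, 60, 60))]
flags = match_detections(dets, gts)
print(round(average_precision(flags, num_gt=len(gts)), 4))  # prints 0.8333
```

With several images, the `(score, is_tp)` pairs from every image would be concatenated per class before calling `average_precision`, and `num_gt` would be the total number of ground-truth boxes of that class.
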
kentaroy47 commented 6 years ago

@ahmed-shariff I'm having trouble evaluating my dataset too. Can you say more about how, and where, to use this code? Thanks!

ahmed-shariff commented 6 years ago

Plug this in anywhere you want to run the evaluation; just replace input_fn with the Dataset object containing the data you are evaluating on. The code I posted here worked the last time I checked, though I had to find a machine with more than 2GB of GPU memory. Also remember to check the format of the data being passed.
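
As a rough illustration of "check the format", here is a small sketch that inspects one batch before it reaches the evaluation loop. It assumes the positional layout the posted snippet reads (entry 0 the image, entry 2 the number of boxes, entry 3 the image info, entry 4 the ground-truth boxes as rows of `[x1, y1, x2, y2, class]`) and that `num_classes` counts the background class. `to_list` and `check_batch` are hypothetical helpers, not part of the repo.

```python
def to_list(value):
    """Convert tensors or arrays to nested lists; leave plain lists untouched."""
    return value.tolist() if hasattr(value, "tolist") else value


def check_batch(batch, num_classes):
    """Return a list of problems found in one batch (empty if none)."""
    if len(batch) < 5:
        return ["expected at least 5 entries, got {}".format(len(batch))]
    gt_boxes = to_list(batch[4])
    if len(gt_boxes) == 0:
        return ["batch[4] holds no images"]
    problems = []
    if len(gt_boxes) != 1:
        problems.append("batch size is {}, but the snippet only reads image 0"
                        .format(len(gt_boxes)))
    for row_idx, row in enumerate(gt_boxes[0]):
        if len(row) != 5:
            problems.append("gt box {} has {} values, expected 5"
                            .format(row_idx, len(row)))
            continue
        x1, y1, x2, y2, cls = row
        if x2 < x1 or y2 < y1:
            problems.append("gt box {} has inverted corners".format(row_idx))
        if not 0 <= cls < num_classes:
            problems.append("gt box {} has class {} outside [0, {})"
                            .format(row_idx, cls, num_classes))
    return problems


# Placeholder strings stand in for the image and image-info tensors.
example = ["image", None, [1], "im_info",
           [[[0, 0, 10, 10, 1], [0, 0, 0, 0, 0]]]]
print(check_batch(example, num_classes=2))  # prints []
```

The all-zero row in the example mimics padding; it passes the check because its class is 0, which the evaluation loop skips.
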

kentaroy47 commented 6 years ago

Thanks! I'll try it out.

Ken
