orrzohar / PROB

[CVPR 2023] Official Pytorch code for PROB: Probabilistic Objectness for Open World Object Detection
Apache License 2.0

Some questions about Visualization for known and unknown objects #21

Closed luomingshuang closed 1 year ago

luomingshuang commented 1 year ago

Hi @orrzohar, thanks for your great work. I have some questions about visualizing known and unknown objects.

My visualization code is as follows:


from torchvision.ops.boxes import batched_nms

# You can choose the confidence threshold; the default value is 0.7
def filter_boxes(scores, boxes, confidence=0.7, apply_nms=True, iou=0.5):
    keep = scores.max(-1).values > confidence
    scores, boxes = scores[keep], boxes[keep]

    if apply_nms:
        top_scores, labels = scores.max(-1)
        keep = batched_nms(boxes, top_scores, labels, iou)
        scores, boxes = scores[keep], boxes[keep]

    return scores, boxes
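
# Note (added for clarity): torchvision's batched_nms expects boxes in
# (x1, y1, x2, y2) format, so DETR-style normalized (cx, cy, w, h) outputs
# should be converted and rescaled to image coordinates before NMS.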

@torch.no_grad()
def viz(model, criterion, postprocessors, data_loader, base_ds, device, output_dir, args):
    dataset = args.dataset
    import numpy as np
    os.makedirs(output_dir, exist_ok=True)
    model.eval()
    criterion.eval()

    metric_logger = utils.MetricLogger(delimiter="  ")
    metric_logger.add_meter('class_error', utils.SmoothedValue(window_size=1, fmt='{value:.2f}'))

    use_topk = True
    num_obj = 20
    for batch_idx, (samples, targets) in enumerate(tqdm(data_loader)):
        if batch_idx >=10:
            break
        samples = samples.to(device)
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        top_k = len(targets[0]['boxes'])

        # outputs = model(samples)
        # indices = outputs['pred_logits'][0].softmax(-1)[..., 1].sort(descending=True)[1][:top_k]
        # predicted_boxes = torch.stack([outputs['pred_boxes'][0][i] for i in indices])
        # logits = torch.stack([outputs['pred_logits'][0][i] for i in indices])
        # scores_softmax = logits.softmax(-1)[:, :-1]
        # labels = scores_softmax.argmax(axis=1)
        # scores = scores_softmax.max(-1).values

        outputs = model(samples)
        # probas = outputs['pred_logits'].softmax(-1)[0, :, :-1].cpu()
        probas = outputs['pred_logits'].softmax(-1)[0, :, :].cpu()
        pred_objs = outputs['pred_obj'].softmax(-1)[0, :].cpu()
        predicted_boxes = outputs['pred_boxes'][0,].cpu()
        scores, predicted_boxes = filter_boxes(probas, predicted_boxes)
        labels = scores.argmax(axis=1)
        scores = scores.max(-1).values

        fig, ax = plt.subplots(1, 3, figsize=(10,3), dpi=200)

        # Ori Picture
        plot_ori_image(
            samples.tensors[0:1],
            ax[0], 
            plot_prob=False,
        )
        ax[0].set_title('Original Image')

        # Pred results
        # if not control the number of labels
        if not use_topk:
            plot_prediction(
                samples.tensors[0:1], 
                scores[-num_obj:], 
                predicted_boxes[-num_obj:], 
                labels[-num_obj:], 
                ax[1], 
                plot_prob=False,
                dataset=dataset,
            )
        # if control the number of labels
        if use_topk:
            plot_prediction(
                samples.tensors[0:1], 
                scores[-top_k:], 
                predicted_boxes[-top_k:], 
                labels[-top_k:], 
                ax[1], 
                plot_prob=False,
                dataset=dataset,
            )
        ax[1].set_title('Prediction (Ours)')

        # GT Results
        plot_prediction(
            samples.tensors[0:1], 
            torch.ones(targets[0]['boxes'].shape[0]), 
            targets[0]['boxes'], 
            targets[0]['labels'],
            ax[2], 
            plot_prob=False,
            dataset=dataset,
        )
        ax[2].set_title('GT')

        for i in range(3):
            ax[i].set_aspect('equal')
            ax[i].set_axis_off()

        plt.savefig(os.path.join(output_dir, f'img_{int(targets[0]["image_id"][0])}.jpg'))

Here are some visualization results. When I set probas = outputs['pred_logits'].softmax(-1)[0, :, :-1].cpu(), only known objects are shown (because only the known classes, indices 0-79, are considered): [images]

When I set probas = outputs['pred_logits'].softmax(-1)[0, :, :].cpu(), both known and unknown objects are shown (because all classes, indices 0-80, including the unknown class, are considered): [images]

When I draw the boxes for both known and unknown objects on the image, there are many overlapping boxes on the known objects and many boxes that do not correspond to real objects. Can you tell me how to modify the code to make the results normal?

ae208gpu commented 1 year ago

@luomingshuang what is the plot_prediction function here?

ae208gpu commented 1 year ago

Could you please share your inference script?

luomingshuang commented 1 year ago

def plot_prediction(image, scores, boxes, labels, ax=None, plot_prob=True, dataset='OWOD'):
    if ax is None:
        ax = plt.gca()
    plot_results(image[0].permute(1, 2, 0).detach().cpu().numpy(), scores, boxes, labels, ax, plot_prob=plot_prob, dataset=dataset)

def plot_results(pil_img, scores, boxes, labels, ax, plot_prob=True, norm=True, dataset='OWOD'):
    from matplotlib import pyplot as plt
    h, w = pil_img.shape[:-1]
    # w, h = pil_img.shape[:-1]
    image = plot_image(ax, pil_img, norm)
    colors = COLORS * 100
    if boxes is not None:
        # boxes = [rescale_bboxes(boxes[i], [w, h]).cpu() for i in range(len(boxes))]
        for sc, cl, (xmin, ymin, xmax, ymax), c in zip(scores, labels, boxes, colors):
            ax.add_patch(plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                                       fill=False, color=c, linewidth=2))

            text = f'{CLASSES[str(dataset)][cl]}: {sc:0.2f}'
            ax.text(xmin, ymin, text, fontsize=5, bbox=dict(facecolor='yellow', alpha=0.5))
    ax.grid(False)

luomingshuang commented 1 year ago

Add rescale_bboxes to viz:

        probas = outputs['pred_logits'].softmax(-1)[0, :, :].cpu()
        predicted_boxes = outputs['pred_boxes'][0,].cpu()
        predicted_boxes = rescale_bboxes(predicted_boxes.cpu(), [w, h])
        scores, predicted_boxes = filter_boxes(probas, predicted_boxes)
        labels = scores.argmax(axis=1)
        scores = scores.max(-1).values
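
For reference, w and h here should be the width and height of the image being plotted (for example, taken from samples.tensors.shape[-1] and samples.tensors.shape[-2]). In case rescale_bboxes is not already available in your copy of the repo, a minimal DETR-style sketch of it (converting normalized cx, cy, w, h boxes to absolute x1, y1, x2, y2 coordinates) could look like this:

import torch

def box_cxcywh_to_xyxy(x):
    # Convert boxes from (center_x, center_y, width, height) to (x1, y1, x2, y2).
    x_c, y_c, w, h = x.unbind(-1)
    return torch.stack([x_c - 0.5 * w, y_c - 0.5 * h,
                        x_c + 0.5 * w, y_c + 0.5 * h], dim=-1)

def rescale_bboxes(out_bbox, size):
    # Scale normalized xyxy boxes up to absolute pixel coordinates.
    img_w, img_h = size
    b = box_cxcywh_to_xyxy(out_bbox)
    return b * torch.tensor([img_w, img_h, img_w, img_h], dtype=torch.float32)
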
ae208gpu commented 1 year ago

Thank you so much @luomingshuang, really appreciate your quick response!

luomingshuang commented 1 year ago

Hmm, but I think there are still some errors in my visualization code above.

ae208gpu commented 1 year ago

I am not able to generate the images that you have shown.

ae208gpu commented 1 year ago

How were you able to get those images? Do I need to add viz to plot_utils.py?

luomingshuang commented 1 year ago

You can refer to https://github.com/akshitac8/OW-DETR/blob/main/engine.py.

ae208gpu commented 1 year ago

Thanks @luomingshuang, will check it out.

orrzohar commented 1 year ago

Hi @luomingshuang,

I believe I rejected proposals that had a high overlap with 'known' object predictions, since my observation was that known objects were predicted with relatively high accuracy. Please note that you may lose some unknown object predictions this way. Also note that the visualizations in the figures were not generated this way; they relied on the GT objects (see Issue #11).
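
Roughly, the idea is something like the following sketch (just an illustration, not the exact code used for the paper figures; the unknown_label index and the 0.5 IoU threshold are placeholders you would adapt to your setup): drop every box predicted as unknown whose IoU with any known-class box exceeds the threshold.

import torch
from torchvision.ops import box_iou

def suppress_unknowns_overlapping_knowns(boxes, labels, scores, unknown_label, iou_thresh=0.5):
    # boxes: [N, 4] in (x1, y1, x2, y2) image coordinates; labels, scores: [N]
    known_mask = labels != unknown_label
    unknown_mask = ~known_mask
    if known_mask.sum() == 0 or unknown_mask.sum() == 0:
        return boxes, labels, scores
    # IoU between every unknown box and every known box.
    ious = box_iou(boxes[unknown_mask], boxes[known_mask])
    # Keep an unknown box only if it does not heavily overlap any known box.
    keep_unknown = ious.max(dim=1).values < iou_thresh
    keep = known_mask.clone()
    keep[unknown_mask] = keep_unknown
    return boxes[keep], labels[keep], scores[keep]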

If this is still an issue, please reopen and let me know what you need help with,

Best, Orr

YH-2023 commented 11 months ago

@ae208gpu @luomingshuang Did you solve it? Can you share the visualization code?

simranbajaj06 commented 1 month ago

Do we need to change the content of EVAL_M_OWOD_BENCHMARK.sh?

I am getting this:

Initialized from the pre-training model

0%| | 0/5123 [00:00