Open zoelevin opened 1 year ago
Maybe not a perfect solution, but you can run the evaluation script and get the results from the variables targets and results. At the bottom of evaluate() in engine_multi.py you can append the res variable to a list and return it.
def evaluate(model, criterion, postprocessors, data_loader, base_ds, device, output_dir):
    ...
    extracted_results = list()
    for samples, targets in metric_logger.log_every(data_loader, 10, header):
        res = {target['image_id'].item(): output for target, output in zip(targets, results)}
        ...
        extracted_results.append(res)
    return stats, coco_evaluator, extracted_results
Then you would unpack an extra tuple item in main.py, and you could use pickle to save the results.
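As a rough sketch of the pickle step, something like the following could work. The helper names save_results, load_results and to_plain are made up for illustration and are not part of the repository. It assumes extracted_results is the list of per-batch dictionaries returned above, and it converts tensor-like values to plain lists first so the file can be loaded on a machine without a GPU.

```python
import pickle


def to_plain(value):
    """Convert tensor-like values to plain Python lists; leave others alone."""
    # Assumption: anything with .detach() behaves like a torch.Tensor.
    if hasattr(value, "detach"):
        return value.detach().cpu().tolist()
    return value


def save_results(extracted_results, path):
    """Pickle a list of {image_id: {'scores', 'labels', 'boxes'}} dictionaries."""
    plain = [
        {
            image_id: {key: to_plain(val) for key, val in output.items()}
            for image_id, output in batch.items()
        }
        for batch in extracted_results
    ]
    with open(path, "wb") as f:
        pickle.dump(plain, f)


def load_results(path):
    """Load results previously written by save_results."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

In main.py you would call save_results on the third item returned by evaluate(), with whatever output path suits you.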
To actually view the video, you could use OpenCV to build a video from the images, using its rectangle function to draw the bounding boxes.
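A minimal sketch of that idea, under a few assumptions that may not hold for your setup: the boxes are already in [x1, y1, x2, y2] pixel coordinates, the detections have been converted to plain lists, and frame_paths is an ordered list of image files that lines up with the detections. The function names and the 0.3 score threshold are illustrative only.

```python
def boxes_to_draw(output, score_threshold=0.3):
    """Return (x1, y1, x2, y2, label, score) for detections above the threshold."""
    kept = []
    for box, label, score in zip(output["boxes"], output["labels"], output["scores"]):
        if score >= score_threshold:
            x1, y1, x2, y2 = (int(round(v)) for v in box)
            kept.append((x1, y1, x2, y2, int(label), float(score)))
    return kept


def write_video(frame_paths, detections, out_path, fps=25):
    """Draw the kept boxes on each frame and write the frames to a video file."""
    import cv2  # opencv-python; imported here so boxes_to_draw works without it

    writer = None
    for path, output in zip(frame_paths, detections):
        frame = cv2.imread(path)
        if frame is None:  # unreadable or missing image, skip it
            continue
        if writer is None:
            height, width = frame.shape[:2]
            fourcc = cv2.VideoWriter_fourcc(*"mp4v")
            writer = cv2.VideoWriter(out_path, fourcc, fps, (width, height))
        for x1, y1, x2, y2, label, score in boxes_to_draw(output):
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, f"{label}: {score:.2f}", (x1, max(y1 - 5, 0)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
        writer.write(frame)
    if writer is not None:
        writer.release()
```

The labels drawn here are the raw class indices; mapping them to class names would need the category list from your annotation file.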
Thanks @itbergl. I did as you suggested. For example, printing the bounding boxes, I got:
{654: {'scores': tensor( [0.2183, 0.1489, 0.1324, 0.1290, 0.0712, 0.0674, 0.0617, 0.0446, 0.0436, 0.0353, 0.0341, 0.0338, 0.0338, 0.0326, 0.0317, 0.0314], device='cuda:0'), 'labels': tensor([ 7, 17, 23, 19, 27, 27, 17, 7, 17, 7, 19, 23, 5, 2, 22, 17], device='cuda:0'), 'boxes': tensor([[ 1.7216e+00, 4.1063e+01, 2.6564e+01, 1.0398e+02], [ 4.5522e+02, 1.1981e+02, 5.5804e+02, 1.5068e+02], [ 4.5522e+02, 1.1981e+02, 5.5804e+02, 1.5068e+02], [ 4.3279e+01, 1.1582e+02, 8.4566e+01, 1.7838e+02]]], device='cuda:0')}}
However, that dictionary does not include any information about the frame associated with image id 654. How can I tell which frame the bounding boxes for image_id 654 belong to?
@adilsonmedronha maybe try accessing the .json file. You can do something like this to get the frame information:
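import json
# '*.json' is a placeholder for the path to your annotation file
with open('*.json', 'rb') as f:
    data = json.load(f)
# standard COCO-style files use "file_name" rather than "name"; check your file
image_id_to_filename = {img["id"]: img["name"] for img in data["images"]}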
That should give you a dictionary of image IDs to filenames.
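To tie this back to the saved detections, one possible approach is to walk through extracted_results and look up each image ID in that dictionary. The function name attach_filenames is hypothetical, and the sketch assumes extracted_results has the list-of-dictionaries shape shown earlier in the thread.

```python
def attach_filenames(extracted_results, image_id_to_filename):
    """Pair every detection dictionary with the filename of its frame.

    Returns a list of (image_id, filename, output) tuples; filename is None
    when the image ID is missing from the mapping.
    """
    rows = []
    for batch in extracted_results:
        for image_id, output in batch.items():
            filename = image_id_to_filename.get(image_id)
            rows.append((image_id, filename, output))
    return rows
```

Sorting the resulting rows by filename is one way to get the frames back into playback order, assuming the filenames encode the frame index.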
I was wondering how we can run our own videos for object detection so that we can get video output with the bounding boxes and labels like shown in figure 9 of the connected paper? I saw something about how mmtracking has demo scripts, but I couldn't figure out how to use TransVOD similarly to get the results I need. Thank you.