Open Jacobew opened 5 years ago
In fact, the whole evaluation takes about 4 hours to finish. Any ideas? @fmassa
Hi @Jacobew
You are using a config file that hasn't been trained for detection (and this particular config doesn't exist).
Can you try running the same thing with a model already trained for the detection task, for example one from the configs/caffe2 folder, and report back?
Thanks for the reply! @fmassa
But sorry, I don't get what you mean. The config file I used is right here in this project, and the yaml in the configs/caffe2 folder seems to have no difference from the one I used except for MODEL.WEIGHT.
If you are using configs/e2e_mask_rcnn_R_50_FPN_1x.yaml for testing without specifying a new MODEL.WEIGHT, it means you are running inference with an untrained model.
So you will get very bad detection results. It will also be very slow, because there is some post-processing after the CNN inference, for example thresholding to filter out low-confidence predictions. In your case, because you didn't train the model, the thresholding does not work properly.
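As a toy illustration (not the repo's actual post-processing; the threshold value and score lists here are made up): a trained model's scores are mostly near zero with a few confident detections, while an untrained model's scores are roughly random, so far more boxes survive the confidence filter and the per-box post-processing piles up:

```python
import random

random.seed(0)
SCORE_THRESH = 0.7  # hypothetical confidence threshold

# Scores from a trained model: mostly near 0, a few confident detections.
trained_scores = [0.01] * 990 + [0.95] * 10
# Scores from an untrained model: roughly uniform noise, so the threshold
# no longer separates real detections from junk.
untrained_scores = [random.random() for _ in range(1000)]

kept_trained = [s for s in trained_scores if s > SCORE_THRESH]
kept_untrained = [s for s in untrained_scores if s > SCORE_THRESH]
print(len(kept_trained), len(kept_untrained))
```

With the untrained model, hundreds of noise boxes slip past the filter, and each of them still has to go through NMS and mask encoding downstream.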
If you just want to test detection, you can use a config from the configs/caffe2 folder; in that case, the program will automatically download the trained detection model from the Facebook server and run the test.
Otherwise, you need to train the model first, then use your trained model for inference by adding MODEL.WEIGHT YOURMODEL to your testing command.
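For reference, such a testing command might look like this (the script name tools/test_net.py and the checkpoint path are assumptions based on the usual maskrcnn-benchmark layout):

```shell
# Hypothetical example: point MODEL.WEIGHT at your own trained checkpoint.
python tools/test_net.py \
    --config-file configs/e2e_mask_rcnn_R_50_FPN_1x.yaml \
    MODEL.WEIGHT /path/to/your/model_final.pth
```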
Hi, @chengyangfu , thanks for the comment.
In fact, I had fine-tuned the model from MODEL.ZOO before running this command, and the trained model is loaded from the last checkpoint when testing.
I see.
Can you run the inference with the model from MODEL.ZOO, or use the configs/caffe2 one first? Just to make sure the slowness is not caused by the fine-tuning.
I agree with you. I'll try it and report back here later.
@Jacobew
And I found that the gpu usage is zero, this is quite weird.
Referring to this: Check if MODEL.DEVICE is set to "cuda" and not "cpu"
@maedmaex Hi, thanks for the comment.
I've checked it and MODEL.DEVICE is "cuda".
@Jacobew any updates?
@maedmaex Sorry for the late reply. I think it's because I added another branch that slows down the evaluation; when testing with 8 GPUs, the evaluation time drops to under 10 minutes.
Hi, could you please tell me how to solve the problem of extremely slow evaluation? Thanks @Jacobew. I also used the same command.
@goodmellow Hi, try testing with multiple GPUs.
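For example, a multi-GPU test launch might look like this (the distributed launcher invocation is the standard PyTorch one; the script name is an assumption based on the usual repo layout):

```shell
# Hypothetical example: run the test script across 8 GPUs with the
# standard PyTorch distributed launcher.
python -m torch.distributed.launch --nproc_per_node=8 \
    tools/test_net.py --config-file configs/e2e_mask_rcnn_R_50_FPN_1x.yaml
```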
Yeah, but is there a way to increase speed on a single GPU? @Jacobew
@goodmellow I haven't found such a way yet. In fact, the model in the master branch evaluates fine in my experiments as long as I add no extra branches to it.
From what I understand, the bottleneck is actually the mask encoding in cocoapi, which converts each pixel-wise mask into a run-length encoded (RLE) string, a representation that saves a lot of space. If you do not intend to save your prediction results, you could dig into its API and work around the encoding by computing the IoU matrix with your own direct implementation, which should be a lot faster.
The better way to do this is to nag someone into writing a CUDA implementation of the mask encoder for cocoapi, which is what I am doing here ;p
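A minimal sketch of both ideas in pure NumPy, independent of cocoapi (the column-major flattening mirrors cocoapi's RLE convention, but this is not its actual encoder):

```python
import numpy as np

def rle_encode(mask):
    """Run-length encode a binary mask, flattened column-major as cocoapi does."""
    pixels = mask.flatten(order="F")
    # positions where the value changes, plus the two ends
    changes = np.flatnonzero(pixels[1:] != pixels[:-1]) + 1
    runs = np.diff(np.concatenate(([0], changes, [pixels.size])))
    # cocoapi-style RLE starts counting with a run of zeros
    if pixels[0] == 1:
        runs = np.concatenate(([0], runs))
    return runs.tolist()

def mask_iou(a, b):
    """IoU computed directly on binary masks, skipping RLE entirely."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

a = np.zeros((4, 4), dtype=bool); a[1:3, 1:3] = True  # 2x2 square
b = np.zeros((4, 4), dtype=bool); b[1:3, 2:4] = True  # shifted one column right
print(rle_encode(a))      # [5, 2, 2, 2, 5]
print(mask_iou(a, b))     # intersection 2, union 6 -> 0.333...
```

Computing IoU on the raw boolean masks like this avoids the encode/decode round trip, at the cost of memory for the dense masks.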
@qianyizhang Thanks for the idea, would you please share your implementation?
I simply skip the mask evaluation completely and assume it would cost a drop of about 2 points.
I have been using WEIGHT: "https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_R_50_FPN_1x.pth" as my weights, and it takes less than 20 minutes to run inference on all 5000 images. I don't have the exact time because I am running it on AzureML and I have to rebuild the libraries each time I run my tests. I am also running on a single GPU with TEST.IMS_PER_BATCH: 10. I hope this helps.
❓ Questions and Help
Hi, I found that evaluation on coco2017 with 5000 images is extremely slow.
I haven't finished the evaluation process yet, but it seems that this would take about 3 hours to complete.
This is the command I used on 1 GPU,
And I found that the gpu usage is zero, this is quite weird.
I did not change the parameter MODEL.ROI_HEADS.DETECTIONS_PER_IMG. Could you help me figure it out?
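For context, MODEL.ROI_HEADS.DETECTIONS_PER_IMG caps how many detections survive per image after thresholding, which bounds the cost of downstream steps such as mask encoding. A toy sketch of that cap (the helper name is made up; 100 matches the common default, which in turn matches COCO's 100-detections-per-image evaluation limit):

```python
DETECTIONS_PER_IMG = 100  # common config default for COCO-style evaluation

def cap_detections(scores, k=DETECTIONS_PER_IMG):
    """Keep the indices of the top-k highest-scoring detections
    (all of them if there are fewer than k)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])

scores = [0.9, 0.1, 0.8, 0.3, 0.7]
print(cap_detections(scores, k=3))  # indices of the three best: [0, 2, 4]
```

Leaving this parameter at its default is normal; the slowdown discussed above came from elsewhere (extra branches and single-GPU mask encoding).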