zhengziqiang / CoralSCOP

The official repository of "CoralSCOP: Segment any COral Image on this Planet". [CVPR Highlight 2024]

Slow inference with CoralSCOP #3

Open SaiAakash opened 2 weeks ago

SaiAakash commented 2 weeks ago

Hi,

I was able to use vanilla SAM from Ultralytics to predict on a set of images, which gave me an inference time of ~10 s per image with a segment-everything prompt (i.e., no bounding box or point prompt).

When I do the same with CoralSCOP on the same GPU, it takes close to 40 s per image to make predictions. I was wondering why CoralSCOP is so much slower. Is there a setting somewhere in the code that I have not enabled and that would make this faster?

FYI, I am running this on an NVIDIA RTX 2060 6GB GPU.

Any insights will be appreciated, thanks!

zhengziqiang commented 2 weeks ago

@SaiAakash Hello, SaiAakash, this is because we used a lower predicted IoU threshold (0.62 or 0.82) to generate the coral reef masks. It means that the model will generate more masks.

SAM uses 0.92 (the default, I assume), so it filters out those masks. If you select a lower threshold for SAM, it will also generate more masks and need more inference time.
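To make the thresholding described above concrete, here is a minimal sketch of how a predicted-IoU cutoff changes the number of masks that survive. The scores are made-up illustrative values and `filter_by_pred_iou` is a hypothetical helper, not a function from SAM or CoralSCOP.

```python
def filter_by_pred_iou(scores, threshold):
    """Return the indices of candidates whose predicted IoU exceeds the threshold."""
    return [i for i, score in enumerate(scores) if score > threshold]


# Made-up predicted IoU scores for six candidate masks (illustration only).
candidate_scores = [0.55, 0.64, 0.70, 0.83, 0.90, 0.95]

for threshold in (0.62, 0.82, 0.92):
    kept = filter_by_pred_iou(candidate_scores, threshold)
    print(f"threshold={threshold:.2f}: kept {len(kept)} of {len(candidate_scores)} masks")
```

With these toy scores, the lowest threshold keeps five masks and the highest keeps one, which is the effect the reply describes: a lower threshold leaves more masks to post-process and save.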

SaiAakash commented 2 weeks ago

I see. Thanks @zhengziqiang !

SaiAakash commented 2 weeks ago

I have another question. The inference time does not decrease when I increase the iou_threshold; it stays the same. From what you said, inference should get faster if I raise the iou_threshold, but it doesn't. Do you have any thoughts on that?

The commands I ran were (with modified iou_threshold values):

python test.py --model_type vit_b --checkpoint_path ./checkpoints/vit_b_coralscop.pth --iou_threshold 0.92 --sta_threshold 0.62 --test_img_path ./sample_imgs/ --output_path ./sample_imgs_output --gpu 0 --point_number 64
python test.py --model_type vit_b --checkpoint_path ./checkpoints/vit_b_coralscop.pth --iou_threshold 0.65 --sta_threshold 0.62 --test_img_path ./sample_imgs/ --output_path ./sample_imgs_output --gpu 0 --point_number 64

Both these commands have a runtime of ~400 s for 10 images.
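One way to repeat this experiment more systematically is to sweep the threshold from a small script. The sketch below reuses only the flags shown in the commands above; `build_command` and `run_sweep` are hypothetical helper names. Note that the wall-clock time of the whole process includes start-up and checkpoint loading. Also, in SAM's reference automatic mask generator the predicted-IoU filter is applied after the mask decoder has already run on every point prompt; if CoralSCOP's test.py follows the same structure (an assumption, not verified here), changing the threshold would only affect post-processing and saving, which may be a small share of the total.

```python
import shlex
import subprocess
import time


def build_command(iou_threshold):
    """Build the test.py command from the thread, varying only --iou_threshold."""
    return [
        "python", "test.py",
        "--model_type", "vit_b",
        "--checkpoint_path", "./checkpoints/vit_b_coralscop.pth",
        "--iou_threshold", str(iou_threshold),
        "--sta_threshold", "0.62",
        "--test_img_path", "./sample_imgs/",
        "--output_path", "./sample_imgs_output",
        "--gpu", "0",
        "--point_number", "64",
    ]


def run_sweep(thresholds):
    """Run each command once and return {threshold: wall-clock seconds}."""
    timings = {}
    for threshold in thresholds:
        cmd = build_command(threshold)
        print("running:", shlex.join(cmd))
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raises if test.py exits with an error
        timings[threshold] = time.perf_counter() - start
    return timings
```

Calling `run_sweep([0.65, 0.92])` from the repository root would reproduce the two runs above and report their total runtimes side by side.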

zhengziqiang commented 1 week ago

I did a comparison under the same setting (ViT-B backbone) on my single RTX 3090 GPU with the same 20 images.

The average inference time (including generating masks and saving the corresponding JSON files):

SAM: 3.2176s for one image

CoralSCOP: 3.20123s for one image

I suspect that either the two runs were not compared under the same settings, or vanilla SAM from Ultralytics uses some speed-up techniques. Could you please provide more information about this?
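For a like-for-like comparison, both models can be timed with the same harness on the same image list. The sketch below is one possible way to do that, not the procedure used for the numbers above. `benchmark` is a hypothetical helper; the optional `sync` argument exists because GPU calls in PyTorch are asynchronous, so one would typically pass `torch.cuda.synchronize` there to avoid reading the clock before the work has finished.

```python
import time
from statistics import mean


def benchmark(fn, inputs, warmup=1, sync=None):
    """Return the mean seconds per input for fn, excluding warm-up calls.

    sync is an optional zero-argument callable invoked before each clock
    read (for example torch.cuda.synchronize when running on a GPU).
    """
    def now():
        if sync is not None:
            sync()
        return time.perf_counter()

    # Warm-up calls absorb one-off costs such as lazy initialisation.
    for item in inputs[:warmup]:
        fn(item)

    durations = []
    for item in inputs:
        start = now()
        fn(item)
        durations.append(now() - start)
    return mean(durations)
```

Assuming two wrapper functions, say `run_sam(image)` and `run_coralscop(image)` (hypothetical names), that use the same backbone, number of point prompts, and thresholds, comparing `benchmark(run_sam, images)` with `benchmark(run_coralscop, images)` gives per-image averages measured the same way for both.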