MhLiao / MaskTextSpotter

A PyTorch implementation of Mask TextSpotter
https://github.com/MhLiao/MaskTextSpotter

Can the model run at 6.7 FPS as written in the paper? #24

Closed Laura1021678398 closed 5 years ago

Laura1021678398 commented 5 years ago

I ran test.sh and the speed was 1.47s/it...

MhLiao commented 5 years ago

The speed mainly depends on three things: the input scale of the images, the test dataset (the number of text instances), and the GPU. The default input size (MIN_SIZE_TEST) in the config file is 1000, so the speed should be about 3 FPS on the ICDAR 2015 dataset with a Titan Xp GPU. Note that inference in PyTorch is slower than in Caffe2. Still, your speed is abnormal. Can you provide more information?
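For reference, per-image speed can be measured with a small harness like the one below. The `benchmark` helper is hypothetical (not part of this repository) and simply wraps any callable, separating warm-up iterations from steady-state timing:

```python
import time

def benchmark(fn, inputs, warmup=2):
    """Run `fn` over `inputs` and return (seconds-per-item, items-per-second).

    Hypothetical helper for illustration. Note: for GPU models you would
    also need to synchronize the device (e.g. torch.cuda.synchronize())
    before reading the clock, otherwise the timing only captures kernel
    launch, not execution.
    """
    # Warm-up iterations absorb one-time costs (allocator, cuDNN autotune).
    for x in inputs[:warmup]:
        fn(x)
    start = time.perf_counter()
    for x in inputs:
        fn(x)
    elapsed = time.perf_counter() - start
    per_item = elapsed / len(inputs)
    return per_item, 1.0 / per_item
```

Wrapping the model's forward pass (plus post-processing) this way makes it easy to compare settings such as different MIN_SIZE_TEST values on the same dataset.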

Laura1021678398 commented 5 years ago
  1. The default input size (MAX_SIZE_TEST) in finetune.yaml is 3333, and the speed is 0.4 s/img on the ICDAR 2015 dataset. The speed got slower (1.1 s/img) when I altered MAX_SIZE_TEST to 1000.
  2. I tested on my own dataset (28 text instances per image; MAX_SIZE_TEST = 1000) and the speed was 1.6 s/img. Is this abnormal?
  3. My GPU is a GeForce GTX 1080 Ti. Thanks for answering.
MhLiao commented 5 years ago

@Laura1021678398 I am sorry, I made a mistake in my previous answer: "MAX_SIZE_TEST" should have been "MIN_SIZE_TEST". Usually "MAX_SIZE_TEST" does not need to be changed. You said the speed is 0.4 s/img on the ICDAR 2015 dataset but 1.47 s/img on your dataset; I guess the main reason is that your dataset contains more words, and for each word there are post-processing steps that cannot be parallelized. Removing the character segmentation branch at inference time may also help the speed.