[Closed] Laura1021678398 closed this issue 5 years ago
The speed mainly depends on three factors: the input scale of the images, the test dataset (number of text instances), and the GPU. The default input size (MIN_SIZE_TEST) in the config file is 1000, so the speed should be about 3 fps on the ICDAR 2015 dataset with a Titan Xp GPU. Note that inference in PyTorch is slower than in Caffe2. Still, your speed is abnormal. Can you provide more information?
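If you want to trade accuracy for speed, lowering the test-time input size is the usual knob. A sketch of what that override might look like in a maskrcnn-benchmark-style YAML config (the exact key names and defaults depend on this repository's config schema, so treat this as illustrative):

```yaml
# Config override sketch: a smaller test input gives faster inference.
INPUT:
  MIN_SIZE_TEST: 800   # default mentioned above is 1000
  MAX_SIZE_TEST: 1333  # usually left unchanged
```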
@Laura1021678398 I am sorry, I made a mistake in the previous answer: "MAX_SIZE_TEST" should have been "MIN_SIZE_TEST". Usually "MAX_SIZE_TEST" does not need to be changed. You said the speed is 0.4 s/img on the ICDAR 2015 dataset but 1.47 s/img on your own dataset. I suspect the main reason is that your dataset contains more words. For each word there are post-processing steps that cannot be parallelized, so the cost grows with the number of detected instances. Removing the character segmentation branch at inference time may help the speed.
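One way to check whether per-word post-processing dominates is to time the forward pass and the post-processing loop separately. A minimal sketch (the `run_model` and `postprocess` functions here are placeholders standing in for the repository's actual calls, not its real API):

```python
import time

def run_model(image):
    # Placeholder for the network forward pass.
    time.sleep(0.01)
    return ["word"] * 50  # pretend 50 text instances were detected

def postprocess(instance):
    # Placeholder for per-word steps (mask -> polygon, character
    # decoding); these run sequentially, so total cost scales with
    # the number of detected instances.
    time.sleep(0.001)
    return instance.upper()

def timed_inference(image):
    t0 = time.perf_counter()
    detections = run_model(image)
    t1 = time.perf_counter()
    results = [postprocess(d) for d in detections]
    t2 = time.perf_counter()
    print(f"forward: {t1 - t0:.3f}s  postprocess: {t2 - t1:.3f}s "
          f"({len(detections)} instances)")
    return results
```

If the post-processing time tracks the instance count while the forward pass stays roughly constant, that would explain why a word-dense dataset is several times slower per image.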
I ran test.sh and the speed was 1.47s/it...