princewang1994 / TextSnake.pytorch

A PyTorch implementation of ECCV2018 Paper: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes
https://arxiv.org/abs/1807.01544
MIT License

result of textsnake #19

Open Seanseattle opened 5 years ago

Seanseattle commented 5 years ago

I have trained this model, but I only got an F1 measure of nearly 58%. Do you have any ideas for improving the results? Thank you. By the way, could you tell me your result?

techkang commented 5 years ago

I have improved the F1 measure to 71% by changing Deteval. You can set tr=0.8 and tp=0.4 in Deteval, just as the paper "Object Count/Area Graphs for the Evaluation of Object Detection and Segmentation Algorithms" says. In short, Deteval.py has a bug.
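For readers unfamiliar with how tr and tp enter the evaluation: in DetEval-style one-to-one matching, a detection counts as correct only if it covers at least tr of the ground-truth area (area recall) and at least tp of its own area overlaps the ground truth (area precision). The sketch below illustrates the idea with axis-aligned boxes and a hypothetical `match_boxes` helper; the actual Deteval.py works on polygons and also handles one-to-many and many-to-one matches, which are omitted here.

```python
# Illustrative sketch of DetEval one-to-one matching with area-recall (tr)
# and area-precision (tp) thresholds. Boxes are (x1, y1, x2, y2) tuples;
# this is NOT the repo's Deteval.py, which operates on polygons.

def box_area(b):
    return max(0, b[2] - b[0]) * max(0, b[3] - b[1])

def box_inter(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return box_area((x1, y1, x2, y2))

def match_boxes(gts, dets, tr=0.8, tp=0.4):
    """Greedy one-to-one matching; returns (precision, recall, f1)."""
    matched, used = 0, set()
    for g in gts:
        for i, d in enumerate(dets):
            if i in used:
                continue
            ov = box_inter(g, d)
            # area recall >= tr AND area precision >= tp
            if box_area(g) and box_area(d) \
                    and ov / box_area(g) >= tr and ov / box_area(d) >= tp:
                matched += 1
                used.add(i)
                break
    recall = matched / len(gts) if gts else 0.0
    precision = matched / len(dets) if dets else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

With this rule, a detection covering only half of a ground-truth region fails the tr=0.8 recall test even though it would pass tp=0.4, which is why the threshold pair changes scores so dramatically.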

princewang1994 commented 5 years ago

@techkang Can you make a pull request? This code (Deteval.py) is borrowed from the Total-Text dataset's official GitHub repo, and I have not realized where the error is. Or you can send your code to me directly, thank you very much~

princewang1994 commented 5 years ago

@princewang1994 Do you mean just setting tr=0.8 and tp=0.4, or does it have other bugs?

techkang commented 5 years ago

@princewang1994 I tried to run my code. It seems the difference is caused by post-processing. I have code written by Long Shangbang, the first author of TextSnake. When I set tp=tr=0.6, your result is similar to his: the F1 score is around 60%. But when I set tp=0.4 and tr=0.8, his F1 score is about 70% and yours is 35%. This is strange and I cannot find the reason. His code is not available to others right now.

princewang1994 commented 5 years ago

@techkang Thanks for your work! I keep updating my code, and in my latest version, with tp=tr=0.6 it gets Precision = 0.6434, Recall = 0.7344, Fscore = 0.6859. But as you say, if I change tr to 0.8 and tp to 0.4, the score drops. This is also strange to me. Can you share the code you ran with me? Thanks!

princewang1994 commented 5 years ago

@techkang OK, I understand. Thank you anyway!

techkang commented 5 years ago

I ran some experiments and here are the results. All five experiments used the same model.

| exp_name | backbone | input_size | iter | post process | eval | tp | tr | precision | recall | f1 |
|---|---|---|---|---|---|---|---|---|---|---|
| synth_res_big | resnet | 1024 | 52000 | wang | det | 0.6 | 0.6 | 0.671 | 0.644 | 0.657 |
| synth_res_big | resnet | 1024 | 52000 | wang | det | 0.4 | 0.8 | 0.439 | 0.426 | 0.432 |
| synth_res_big | resnet | 1024 | 52000 | long | long | 0.4 | 0.8 | 0.867 | 0.744 | 0.803 |
| synth_res_big | resnet | 1024 | 52000 | long | det | 0.6 | 0.6 | 0.639 | 0.532 | 0.581 |
| synth_res_big | resnet | 1024 | 52000 | long | det | 0.4 | 0.8 | 0.865 | 0.729 | 0.792 |
princewang1994 commented 5 years ago

It seems that my post-processing method doesn't work as well as Long's. This is important information for me, thanks a lot!

princewang1994 commented 5 years ago

I notice that the eval column has both det and long, which give very different precision and recall with the same tp and tr. What's the difference between them?

techkang commented 5 years ago

Eval means the evaluation script. Long wrote his own eval script according to the DetEval paper.

princewang1994 commented 5 years ago

@techkang I found that if I expand the boundary of the text instances a little (like 0.2 or 0.3 times the radii), the score at tp=0.4 and tr=0.8 improves a lot; with VGG as the backbone, it obtains fscore=0.75 if I set expand=0.3. Maybe the reason is that tr=0.8 is a high requirement on recall. Did you check whether Long's post-processing code has a similar boundary-expanding operation?

techkang commented 5 years ago

@princewang1994 What do you mean by expanding the boundary of a text instance? Do you just multiply the radius by 1.3 at line 110 in detection.py, or do you change the boundary of the ground truth when training your model?

princewang1994 commented 5 years ago

@techkang I expand the radii only in the inference phase.
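The boundary-expanding trick discussed above can be sketched as scaling the predicted disk radii at inference time only, so the reconstructed text region grows slightly before polygons are formed. The function and tuple layout below are illustrative assumptions, not the repo's actual identifiers in detection.py.

```python
# Hypothetical sketch: TextSnake represents a text instance as a sequence of
# disks (cx, cy, radius) along the center line. Enlarging each radius by a
# factor (1 + expand) at inference time widens the reconstructed boundary,
# which helps meet a strict area-recall threshold like tr=0.8.

def expand_disks(disks, expand=0.3):
    """Return disks with radii scaled by (1 + expand); inference-time only."""
    return [(cx, cy, r * (1.0 + expand)) for cx, cy, r in disks]
```

Note that the ground truth and training procedure are untouched; only the predicted radii are enlarged before the text polygon is rebuilt.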

techkang commented 5 years ago

I'm sorry that I made a mistake about Long's eval script. The correct p/r/f is 0.867/0.744/0.803; I have modified the table above. Long said they expanded the boundary.

WeihongM commented 4 years ago

@princewang1994 Hello, I have run your code; however, the result I get is below, which shows a gap to the original paper (78.4%). Do you have any suggestions? Thanks.

Config: tr: 0.8 - tp: 0.4 Precision = 0.8268 - Recall = 0.7023 - Fscore = 0.7595
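As a sanity check on numbers like these, the reported Fscore should always be the harmonic mean of the reported Precision and Recall; a quick sketch:

```python
# Verify that an F-score is the harmonic mean of precision and recall.

def fscore(p, r):
    return 2 * p * r / (p + r)

print(round(fscore(0.8268, 0.7023), 4))  # → 0.7595, matching the report
```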

princewang1994 commented 4 years ago

@WeihongM Hi, did you run with the text-expanding trick in post-processing? It greatly affects performance. Try setting the expanding rate to 0.3 in the config; the score should be better.

Or you can use the pretrained model to confirm that everything is consistent with the setup on my machine.