Seanseattle opened this issue 5 years ago
I have improved the F1 measure to 71% by changing the DetEval settings. You can set tr=0.8 and tp=0.4 in Deteval.py, just as the paper "Object Count/Area Graphs for the Evaluation of Object Detection and Segmentation Algorithms" says. In short, Deteval.py has a bug.
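For reference, here is a minimal sketch of the one-to-one matching rule from the DetEval paper, just to show where tr and tp enter. This is not the repo's actual Deteval.py; the polygon handling via shapely and the function name are only illustrative:

```python
from shapely.geometry import Polygon  # any polygon-area library works here

def one_to_one_match(gt_polys, det_polys, tr=0.8, tp=0.4):
    """Count one-to-one matches: a GT/detection pair matches when their
    intersection covers >= tr of the GT area and >= tp of the detection area."""
    matched = 0
    for gt in gt_polys:
        for det in det_polys:
            inter = gt.intersection(det).area
            recall_ij = inter / gt.area if gt.area > 0 else 0.0
            precision_ij = inter / det.area if det.area > 0 else 0.0
            if recall_ij >= tr and precision_ij >= tp:
                matched += 1
                break  # each GT matches at most one detection in this sketch
    recall = matched / len(gt_polys) if gt_polys else 0.0
    precision = matched / len(det_polys) if det_polys else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

With the paper's thresholds (tr=0.8, tp=0.4) the match is strict on recall but lenient on precision, which is why changing them moves the score so much.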
@techkang Can you make a pull request? This code (Deteval.py) is borrowed from the Total-Text dataset's official GitHub repo, and I have not figured out where the error is. Or you can send your code to me directly, thank you very much~
@princewang1994 Do you mean just setting tr=0.8 and tp=0.4, or are there other bugs?
@princewang1994 I tried to run my code. It seems that the difference is caused by post-processing. I have the code written by Long Shangbang, the first author of TextSnake. When I set tp=tr=0.6, your result is similar to his: the F1 score is around 60%. But when I set tp=0.4 and tr=0.8, his F1 score is about 70% and yours is 35%. This is strange and I cannot find the reason. His code is not available to others for now.
@techkang Thanks for your work! I keep updating my code, and in my latest version, if I set tp=tr=0.6 it gets Precision = 0.6434 - Recall = 0.7344 - Fscore = 0.6859. But as you say, if I change to tr=0.8 and tp=0.4, the score drops. I also find this strange; can you share the code you ran with me? Thanks!
@techkang OK, I understand, thank you anyway!
I ran some experiments and here are the results. All five experiments used the same model.
| exp_name | backbone | input_size | iter | post process | eval | tp | tr | precision | recall | f1 |
|---|---|---|---|---|---|---|---|---|---|---|
| synth_res_big | resnet | 1024 | 52000 | wang | det | 0.6 | 0.6 | 0.671 | 0.644 | 0.657 |
| synth_res_big | resnet | 1024 | 52000 | wang | det | 0.4 | 0.8 | 0.439 | 0.426 | 0.432 |
| synth_res_big | resnet | 1024 | 52000 | long | long | 0.4 | 0.8 | 0.867 | 0.744 | 0.803 |
| synth_res_big | resnet | 1024 | 52000 | long | det | 0.6 | 0.6 | 0.639 | 0.532 | 0.581 |
| synth_res_big | resnet | 1024 | 52000 | long | det | 0.4 | 0.8 | 0.865 | 0.729 | 0.792 |
It seems that my post-processing method doesn't work as well as Long's. This is important to me, thanks a lot!
I notice that the `eval` column has both `det` and `long`, which give quite different precision and recall with the same tp and tr. So what's the difference between them?
`eval` means the eval script. Long wrote his own eval script according to the DetEval paper.
@techkang I found that if I expand the boundary of the text instances a little (by 0.2 or 0.3 times the radii), the score at tp=0.4 and tr=0.8 improves a lot; with VGG as the backbone it reaches fscore=0.75 if I set expand=0.3. Maybe the reason is that tr=0.8 is a demanding recall requirement. Did you check whether the post-processing in Long's code has a similar boundary-expanding operation?
@princewang1994 What do you mean by expanding the boundary of text instances? Do you just multiply the radius by 1.3 at line 110 in detection.py, or do you change the boundary of the ground truth when training your model?
@techkang I expand the radii only in the inference phase.
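Roughly, the idea is the sketch below. This is not the actual detection.py code; the function and variable names are only illustrative:

```python
import numpy as np

def expand_radii(centers, radii, expand=0.3):
    """Boundary-expanding trick, applied only at inference time:
    grow each predicted disk radius along the text center line by a
    fixed ratio before the text polygon is rebuilt from the disks."""
    radii = np.asarray(radii, dtype=np.float32)
    expanded_radii = radii * (1.0 + expand)  # e.g. expand=0.3 -> radii * 1.3
    # The text region polygon is then reconstructed from (centers, expanded_radii)
    # exactly as in the normal post-processing; the ground truth is untouched.
    return centers, expanded_radii
```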
I'm sorry, I made a mistake about Long's eval script. The correct p/r/f is 0.867/0.744/0.803; I have modified the table above. Long said they expanded the boundary.
@princewang1994 Hello, I have run your code, but the result I get is below, which still has a gap from the original paper (78.4%). Do you have any suggestions? Thanks.
Config: tr: 0.8 - tp: 0.4 Precision = 0.8268 - Recall = 0.7023 - Fscore = 0.7595
@WeihongM Hi, did you run with the text-expanding trick in post-processing? It greatly affects the performance. Try setting the expanding rate to 0.3 in the config; the score should be better.
Or you can use the pretrained model to confirm that everything is consistent with my machine.
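Something like the following fragment is what I mean by setting it in the config; the exact option names may differ in your local copy, so treat this as illustrative:

```python
# Illustrative config fragment; check the actual option names in your local config.
tr = 0.8        # DetEval recall threshold
tp = 0.4        # DetEval precision threshold
expand = 0.3    # grow predicted radii by 30% at inference time
```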
I have trained this model, but I only got an F1 measure of about 58%. Do you have any ideas to improve the results? Thank you. By the way, could you tell me your result?