HumanZhong / PAN-reimplement

This is an unofficial PyTorch re-implementation of "Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network" (ICCV 2019, PAN)

bugs details and test result #1

Open shenlei1020 opened 4 years ago

shenlei1020 commented 4 years ago

Hello, thanks for your code. As you mentioned in the README, could you please give details of the bugs you fixed from the PAN.pytorch repo? Could you also share the test metrics? Thanks very much!

HumanZhong commented 4 years ago

@shenlei1020 Hello and sorry for replying so late.

Your first question. There are some differences between PAN.pytorch and the paper, mainly in the following three aspects:
(a) how the training masks are generated (this is a relatively big difference and may lead to a worse result);
(b) how the kernel for a text instance is generated (slightly different, almost negligible);
(c) the data augmentation strategy and the postprocessing hyper-parameter settings.
Because the paper doesn't provide the detailed data augmentation strategy or postprocessing hyper-parameter settings, I'm not sure whether (c) is a bug or not.
But I'm sure that fixing (a), i.e. following the same training mask generation strategy as the PAN paper, helps (about 0.4-0.7 higher f-score; sorry, I can't remember the exact number).

Note: the training mask generation strategy differs between PSENet and PAN. PAN.pytorch applies the same strategy as PSENet, but PAN does it somewhat differently. You can check the paper for detailed info.
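
For context on (b), here is a minimal sketch of how a shrink offset for a text kernel can be computed. It assumes the PSENet-style formula d = Area * (1 - r^2) / Perimeter, where r is the shrink ratio. The function names and the example box are hypothetical, and this is not the code from either repo.

```python
import math


def polygon_area(points):
    """Absolute polygon area via the shoelace formula."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths of the closed polygon."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


def shrink_offset(points, ratio):
    """Inward offset (in pixels) for shrinking a text polygon to its kernel.

    Assumption: PSENet-style offset d = Area * (1 - ratio^2) / Perimeter.
    """
    perimeter = polygon_perimeter(points)
    if perimeter == 0:
        return 0.0  # degenerate polygon, nothing to shrink
    return polygon_area(points) * (1.0 - ratio ** 2) / perimeter


# Hypothetical 100 x 20 text box with a shrink ratio of 0.5.
box = [(0, 0), (100, 0), (100, 20), (0, 20)]
offset = shrink_offset(box, 0.5)
print(offset)  # 6.25
```

The actual inward shrinking is then typically done by passing this offset to a polygon-clipping library. Small changes in how the offset or ratio is applied are the kind of difference meant in (b).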

Your second question. I'm afraid I can't provide detailed test metrics right now because I lost my training and testing logs on the GPU server (someone deleted them, sad story). As far as I can recall, this repo is slightly worse than the paper (a gap of roughly 0.4-0.5 in f-score on ICDAR2015). I will re-run the experiments as soon as possible.

Hope this helps. Feel free to open further issues if you have any questions.