princewang1994 / TextSnake.pytorch

A PyTorch implementation of ECCV2018 Paper: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes
https://arxiv.org/abs/1807.01544
MIT License
437 stars, 92 forks

Is it possible to get good results without pretraining? #12

Open sherif7810 opened 5 years ago

sherif7810 commented 5 years ago

I overfit on a small sample of TotalText until the loss reached about 0.1 on 6 images, but running the demo file with the trained checkpoint doesn't show any contours. (Attached images: 0_img11, 1_img12)

sherif7810 commented 5 years ago

Lowering the thresholds adds red dots to the output, but the demo's runtime increases a lot.

princewang1994 commented 5 years ago

It seems the model is underfitting rather than overfitting. The loss should be smaller than 0.1. To overfit a small dataset, try a smaller learning rate and train for more epochs.
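As a rough illustration of that advice, here is a toy sketch in plain Python. It is not the TextSnake training loop: the 6-sample linear dataset, the `overfit` helper, and the learning rates are all assumptions chosen only to show the general effect. A step size that is too large makes the loss grow, while a smaller one with more epochs drives it close to zero.

```python
def overfit(lr, epochs):
    """Full-batch gradient descent on a 6-sample toy set; returns the final MSE.

    Hypothetical stand-in for an overfitting sanity check, not the repo's trainer.
    """
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.0 * x + 1.0 for x in xs]  # targets from y = 2x + 1
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2.0 * sum(errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


too_large = overfit(lr=0.2, epochs=20)       # step size too large: loss blows up
short_run = overfit(lr=0.05, epochs=20)      # smaller step, but too few epochs
long_run = overfit(lr=0.05, epochs=1000)     # smaller step and more epochs
print(too_large, short_run, long_run)
```

On a real detector the numbers will differ, but the same check applies: if the loss on a handful of images does not keep falling, the learning rate or the number of epochs is the first thing to revisit.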

sherif7810 commented 5 years ago

Do you know a configuration (number of samples, batch size, epochs, learning rate, etc.) that is guaranteed to overfit successfully? I can't increase the batch size to 4.

sherif7810 commented 5 years ago

Re-downloading the repository fixed the issue. It seems I broke something.

sherif7810 commented 5 years ago

You can close the issue. Thanks for the help.

dongzhi0312 commented 5 years ago

Why does the output result image look like it has been duplicated? (Attached image: 0_img91)

princewang1994 commented 5 years ago

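@dongzhi0312 There is a small bug in the code here. In this part of demo.py, the predict and gt visualizations use the same contour variable, whereas the lower image should actually use the coordinates from the meta variable. In the image you posted, both the top and bottom halves are the prediction. I will fix this in an upcoming version~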

dongzhi0312 commented 5 years ago

I changed that code as follows, which fixed the duplicated image: comment out lines 85 and 86 in demo.py, and change line 88 to cv2.imwrite(path, predict_vis). The duplication is caused by the np.concatenate() call on line 86.
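To see why this produces a doubled image, here is a minimal NumPy sketch. It does not reproduce demo.py: the array contents, the `stack_vis` helper, and the choice of `axis=0` (vertical stacking, inferred from the "top and bottom" description above) are assumptions for illustration. When both panels are drawn from the same contours, stacking them yields the same picture twice.

```python
import numpy as np


def stack_vis(predict_vis, gt_vis):
    """Stack two visualizations vertically (hypothetical helper, axis=0 assumed)."""
    return np.concatenate([predict_vis, gt_vis], axis=0)


# Hypothetical 4x6 RGB image standing in for the prediction visualization.
predict_vis = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

# If the "ground truth" panel is drawn from the same contours as the prediction,
# it is pixel-for-pixel identical to the prediction panel.
gt_vis = predict_vis.copy()

im_vis = stack_vis(predict_vis, gt_vis)

# The stacked image is twice as tall, and its two halves are identical.
height = predict_vis.shape[0]
duplicated = np.array_equal(im_vis[:height], im_vis[height:])
print(im_vis.shape, duplicated)
```

Writing only `predict_vis` avoids the repeated panel, at the cost of dropping the ground-truth comparison. Drawing the second panel from the annotation coordinates would keep it.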

sherif7810 commented 5 years ago

What is the best training configuration for generalization on the test set? This is with a batch size of 2. (Attached image: Screenshot from 2019-03-24 15-57-07. Blue: training. Red: testing.)