eragonruan / text-detection-ctpn

text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network
MIT License

how to get mlt_english+chinese dataset? #74

Open Xiangyaojun opened 6 years ago

Xiangyaojun commented 6 years ago

I want to modify "ANCHOR_SCALES" to train on a Chinese dataset, but I cannot find a link to the mlt_english+chinese dataset. Could you help me? Thanks. ^^

eragonruan commented 6 years ago

@Xiangyaojun Check this link. The dataset contains multilingual scene text.

Xiangyaojun commented 6 years ago

@eragonruan Hi, thanks for the link, it's very helpful. I tried changing "ANCHOR_SCALES" to [2,4,8,16], but the "generate_anchors" function in "generate_anchors.py" still generates 10 anchors, not 80. This confuses me. I hope you can help me understand. Thanks. ^^

The shapes of these 10 anchors, as (height, width), are "[(11, 16), (16, 16), (23, 16), (33, 16), (48, 16), (68, 16), (97, 16), (139, 16), (198, 16), (283, 16)]"

eragonruan commented 6 years ago

@Xiangyaojun The anchor width is fixed at 16 by the algorithm (in conv5_3, each pixel of the feature map corresponds to a 16x16 area of the original image). The anchor height, however, can be modified.
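
To illustrate the point about fixed width and adjustable height, here is a minimal sketch of how such anchors could be built. It is not the repository's code: the function name make_anchors and its parameters are hypothetical, and the height list is simply copied from the shapes reported earlier in this thread. It assumes each anchor is centered on a single 16x16 cell and is described as (xmin, ymin, xmax, ymax).

```python
def make_anchors(heights, width=16, base_size=16):
    """Build one anchor per height, all sharing the same fixed width.

    Hypothetical helper for illustration only. Each anchor is a tuple
    (xmin, ymin, xmax, ymax) centered on a base_size x base_size cell.
    """
    ctr = (base_size - 1) / 2.0  # center of the cell spanning [0, base_size - 1]
    anchors = []
    for h in heights:
        anchors.append((
            ctr - width / 2.0,  # xmin: depends only on the fixed width
            ctr - h / 2.0,      # ymin: depends on the chosen height
            ctr + width / 2.0,  # xmax
            ctr + h / 2.0,      # ymax
        ))
    return anchors


# Heights taken from the anchor shapes quoted in this thread.
HEIGHTS = [11, 16, 23, 33, 48, 68, 97, 139, 198, 283]

anchors = make_anchors(HEIGHTS)
# Recover (height, width) for each anchor to compare with the list above.
shapes = [(int(ymax - ymin), int(xmax - xmin)) for xmin, ymin, xmax, ymax in anchors]
print(len(anchors))
print(shapes)
```

In this sketch the number of anchors equals the number of heights, so a scale list plays no role. Under that assumption, changing the anchor set means editing the height list rather than "ANCHOR_SCALES".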

yingwei13mei commented 6 years ago

@Xiangyaojun Could you share your dataset via Baidu Yun, please?