githubharald / SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow.
https://towardsdatascience.com/2326a3487cd5
MIT License
1.99k stars, 894 forks

Text Segmentation #80

Closed. kevinkmldn closed this issue 5 years ago

kevinkmldn commented 5 years ago
  1. Versions

    • TensorFlow version : 1.13.1
    • Python version : 3.6
    • Operating system : Windows
  2. Issue

    • Which result / error did you get? My image didn't get augmented properly, even though I already set dataAugmentation to True in preprocess and in the infer parameters.

    def preprocess(img, imgSize, dataAugmentation=True):
        if img is None:
            img = np.zeros([imgSize[1], imgSize[0]])
        if dataAugmentation:
            stretch = (random.random() - 0.5)  # -0.5 .. +0.5
            wStretched = max(int(img.shape[1] * (1 + stretch)), 1)  # random width, but at least 1
            img = cv2.resize(img, (wStretched, img.shape[0]))  # stretch horizontally by factor 0.5 .. 1.5

[image: test0]

    • Provide all data: my input image looks like this:

[image: aja]

Sorry that my previous post didn't follow the issue template. Thank you so much!

githubharald commented 5 years ago

Hi,

SimpleHTR will NOT segment the words. Data augmentation is something different. If you want to crop out the text boxes, you could use a text detector first and then apply SimpleHTR to read each crop.
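To illustrate the difference, here is a minimal sketch of what the quoted stretch step does. It is not the repository's code: it swaps cv2.resize for plain NumPy nearest-neighbour indexing so it runs without OpenCV, and the names stretch_width and augment are made up for this example. The point is that the whole image is resampled to a random width and nothing is cropped or segmented.

```python
import random

import numpy as np


def stretch_width(img, factor):
    """Resample a 2-D grayscale image to a new width (nearest neighbour)."""
    height, width = img.shape
    new_width = max(int(width * factor), 1)  # random width, but at least 1
    # Map every target column back to the source column it is copied from.
    cols = np.minimum((np.arange(new_width) / factor).astype(int), width - 1)
    return img[:, cols]


def augment(img, rng=random):
    """Random horizontal stretch by a factor in 0.5 .. 1.5, as in the quoted snippet."""
    factor = 1 + (rng.random() - 0.5)
    return stretch_width(img, factor)


# Stand-in for a word image (assumption: 32 rows, 128 columns, grayscale).
img = (np.arange(32 * 128).reshape(32, 128) % 256).astype(np.uint8)
out = augment(img, random.Random(0))
print(img.shape, out.shape)  # the height is unchanged, only the width varies
```

The output still contains everything that was in the input, just wider or narrower, so an image with several words in it stays an image with several words in it.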

kevinkmldn commented 5 years ago

Thank you for your explanation, @githubharald! Can you tell me what data augmentation is used for in the code? And is your WordSegmentation able to do that, or do you have any other references? Thanks in advance.

githubharald commented 5 years ago

For word segmentation / text detection, you could try EAST. But you will most likely have to train it on your data. There is no out-of-the-box solution I'm aware of.
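A rough sketch of how the two stages could fit together, assuming the detector returns axis-aligned boxes as (x, y, w, h). The names crop_boxes, read_page, fake_detect and fake_recognize are hypothetical and not part of SimpleHTR or EAST. In practice detect would wrap a trained detector such as EAST and recognize would wrap SimpleHTR's inference on a single word image.

```python
import numpy as np


def crop_boxes(page, boxes):
    """Cut axis-aligned boxes (x, y, w, h) out of a 2-D image, clipped to its bounds."""
    height, width = page.shape
    crops = []
    for x, y, w, h in boxes:
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + w, width), min(y + h, height)
        if x1 > x0 and y1 > y0:  # skip boxes that fall completely outside the image
            crops.append(page[y0:y1, x0:x1])
    return crops


def read_page(page, detect, recognize):
    """Detect word boxes, then run the recognizer on each crop from left to right."""
    # Assumption: a single text line, so sorting by x gives the reading order.
    boxes = sorted(detect(page), key=lambda box: box[0])
    return [recognize(crop) for crop in crop_boxes(page, boxes)]


# Hypothetical stand-ins so the sketch runs without any trained model.
def fake_detect(page):
    return [(60, 5, 40, 20), (5, 5, 40, 20)]


def fake_recognize(crop):
    return f"word_{crop.shape[1]}x{crop.shape[0]}"


page = np.zeros((32, 128), dtype=np.uint8)
print(read_page(page, fake_detect, fake_recognize))
```

Sorting by x only makes sense for a single line of text. A page with several lines would need the boxes grouped into lines first.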