Sundrops opened this issue 5 years ago (status: Open)
Hi @Sundrops,
1. We used the SCLITE package for CER rather than WER.
2. If my memory serves me, the SCLITE output reports the "correct percentage", which is why it is subtracted from 100.0.
3. That is correct. Calculating the global CER was too computationally expensive.
4. Please provide more details. Did you use `0_handwriting_ocr.ipynb`?
@jonomon Thanks for your reply.
```python
er = match.group(2)
```
is wrong when using the latest SCTK. The correct line should be:
```python
cer = match.group().split()[-3]
```
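For context, here is a minimal sketch of why the index-based fix can work where the capture groups fail. The sample summary line below is an assumption about what an SCLITE `Sum/Avg` row looks like, not verbatim SCTK output, and real output varies between SCTK versions:

```python
# Hypothetical SCLITE "Sum/Avg" summary line (format is an assumption;
# real SCTK output may differ between versions).
line = "| Sum/Avg | 100 1000 | 78.3 15.2 6.5 3.1 24.8 40.0 |"

# Splitting on whitespace makes the extraction robust to column widths,
# which is what match.group().split() achieves on the matched portion.
tokens = line.split()
number = tokens[4]   # character count column in this sample
er = tokens[-3]      # the Err column, i.e. the error rate itself
print(number, er)
```

Indexing from the end (`[-3]`) is what makes this tolerant of extra leading columns, which is plausibly why the group-based regex broke across SCTK versions.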
Yes, I used `0_handwriting_ocr.ipynb` with your provided model `handwriting_line8.params`. But when I train a model myself with `handwriting_line_recognition.py`, I cannot reproduce the results of `handwriting_line8.params`. Maybe your comment is wrong.
Thanks for your great work again.

SCLITE was chosen because I believe it accounts for capitalisation, etc.
@ThomasDelteil could you answer question 4?
I can't reproduce your results (Mean CER: 8.4, obtained with `handwriting_line8.params`). Can you provide more details about your training? @ThomasDelteil
Same request for question 4. I used the script `handwriting_line_recognition.py` to train the model, but the resulting weights file is only 18 MB and the prediction accuracy is poor, whereas the pre-trained weights are around 90 MB. Could you please share the model you trained? It would be a great help!
@Sundrops Hi, I have been trying to understand the architecture of this implementation, but:

1. Whenever I run the code, whether `ocr.ipynb` or `handwritten_line.py`, I get stuck after downloading `largeWriterIndependentTextLineRecognitionTask.zip`.
2. Is this happening because of the high memory requirement of the entire pipeline, or are there code changes that need to be incorporated?
3. I am using mxnet-cu82 with CUDA 8; if that isn't what's suggested, I'll try version 9 for both.

Any help would be appreciated.
Thanks in advance.
@NidhiSultan
```python
import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn

ctx = mx.gpu(0)  # or mx.cpu()

body = nn.HybridSequential()
with body.name_scope():
    # conv1
    body.add(nn.Conv2D(channels=64, kernel_size=(3, 3), padding=(1, 1), strides=(1, 1), use_bias=True))
    body.add(nn.Activation('relu'))
    body.add(nn.MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
    # conv2
    body.add(nn.Conv2D(channels=128, kernel_size=(3, 3), padding=(1, 1), strides=(1, 1), use_bias=True))
    body.add(nn.Activation('relu'))
    body.add(nn.MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
    # conv3_1
    body.add(nn.Conv2D(channels=256, kernel_size=(3, 3), padding=(1, 1), strides=(1, 1), use_bias=False))
    body.add(nn.BatchNorm())
    body.add(nn.Activation('relu'))
    # conv3_2
    body.add(nn.Conv2D(channels=256, kernel_size=(3, 3), padding=(1, 1), strides=(1, 1), use_bias=True))
    body.add(nn.Activation('relu'))
    body.add(nn.MaxPool2D(pool_size=(2, 2), strides=(2, 1), padding=(0, 1)))
    # conv4_1
    body.add(nn.Conv2D(channels=512, kernel_size=(3, 3), padding=(1, 1), strides=(1, 1), use_bias=False))
    body.add(nn.BatchNorm())
    body.add(nn.Activation('relu'))
    # conv4_2
    body.add(nn.Conv2D(channels=512, kernel_size=(3, 3), padding=(1, 1), strides=(1, 1), use_bias=False))
    body.add(nn.BatchNorm())
    body.add(nn.Activation('relu'))
    body.add(nn.MaxPool2D(pool_size=(2, 2), strides=(2, 1), padding=(0, 1)))
    # conv5
    body.add(nn.Conv2D(channels=512, kernel_size=(2, 2), padding=(0, 0), strides=(1, 1), use_bias=False))
    body.add(nn.BatchNorm())
    body.add(nn.Activation('relu'))
body.initialize(mx.init.MSRAPrelu(), ctx=ctx)
```
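As a sanity check on the downsampling, the feature-map size after this body can be derived with the standard conv/pool shape formula, `floor((in + 2*pad - kernel) / stride) + 1`. The sketch below is plain Python (no MXNet needed) that traces the spatial dimensions for a hypothetical 60x800 line image; the input size the repo actually uses may differ:

```python
def out_dim(size, kernel, stride, pad):
    # Standard convolution/pooling output-size formula.
    return (size + 2 * pad - kernel) // stride + 1

def trace(h, w):
    # (kernel, stride, pad) per (H, W) axis for each layer that changes the
    # spatial size; the 3x3 convs with pad 1, stride 1 keep H and W, so only
    # the pools and the final 2x2 conv are listed.
    layers = [
        ((2, 2), (2, 2), (0, 0)),  # pool after conv1
        ((2, 2), (2, 2), (0, 0)),  # pool after conv2
        ((2, 2), (2, 1), (0, 1)),  # pool after conv3_2
        ((2, 2), (2, 1), (0, 1)),  # pool after conv4_2
        ((2, 2), (1, 1), (0, 0)),  # conv5: 2x2 kernel, no padding
    ]
    for (kh, kw), (sh, sw), (ph, pw) in layers:
        h = out_dim(h, kh, sh, ph)
        w = out_dim(w, kw, sw, pw)
    return h, w

print(trace(60, 800))
```

Note how the two later pools use stride (2, 1): they keep halving the height while roughly preserving the width, which is the usual trick for feeding a wide, short feature map into a sequence model for line recognition.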
@Sundrops, Thanks for the prompt reply. Will try to run it this way as well.
Hi @Sundrops @jonomon @NidhiSultan, I have an issue setting up this codebase. Can you please specify which version of Python and which other dependencies were used to configure it?
Hi @JayeshGridScape, the package versions depend mainly on the MXNet and CUDA versions on your system. You'll have to research a bit which other packages are compatible with those two. I used MXNet with CUDA 8, though that's outdated now.
Thanks for your great work. I am a rookie at handwriting recognition and have some questions about training and evaluation.
- This repo uses SCLITE for WER evaluation. I found that SCLITE ignores the spaces between words when it evaluates the words of a line, but other methods, such as https://github.com/githubharald/SimpleHTR/blob/master/src/main.py#L81 and https://github.com/jpuigcerver/xer/blob/master/xer#L116, do not. Which is the criterion in general?
- Why `100.0 - float(er)`? I think it should be `float(er)`:
```python
for line in output_file.readlines():
    match = re.match(match_tar, line.decode('utf-8'), re.M | re.I)
    if match:
        # I think there are matching problems
        number = match.group(1)  # --> match.group().split()[4]
        er = match.group(2)      # --> match.group().split()[-3]
assert number != None and er != None, "Error in parsing output."
return float(number), 100.0 - float(er)  # return float(number), float(er)
```
- It's the average CER over all lines, not the global CER:
```python
# https://github.com/awslabs/handwritten-text-recognition-for-apache-mxnet/blob/master/0_handwriting_ocr.ipynb
def get_qualitative_results_lines(denoise_func):
    sclite.clear()
    test_ds_line = IAMDataset("line", train=False)
    for i in tqdm(range(1, len(test_ds_line))):
        # ....
        sclite.add_text([decoded_text], [actual_text])
    cer, er = sclite.get_cer()
    print("Mean CER = {}".format(cer))
    return cer
```
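To make the distinction in this point concrete, here is a toy sketch (not the repo's code) contrasting the mean of per-line CERs with a global CER, using a plain Levenshtein edit distance:

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

pairs = [("hello", "hallo"), ("a", "b")]  # (predicted, actual) toy lines

# Mean of per-line CERs: a short line gets the same weight as a long one.
mean_cer = sum(levenshtein(p, t) / len(t) for p, t in pairs) / len(pairs)

# Global CER: total edits divided by total reference characters.
global_cer = (sum(levenshtein(p, t) for p, t in pairs)
              / sum(len(t) for p, t in pairs))

print(mean_cer, global_cer)
```

With these two toy lines the mean CER is 0.6 while the global CER is only 0.33, because the one-character line dominates the per-line average. That is the gap the question is pointing at; the repo reports the per-line mean for cost reasons, as noted in the maintainer's reply.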
- The pretrained model `handwriting_line8.params` works well, but I can't train a model that good myself.

```python
# https://github.com/awslabs/handwritten-text-recognition-for-apache-mxnet/blob/master/ocr/handwriting_line_recognition.py#L30
# Best results:
# python handwriting_line_recognition.py --epochs 251 -n handwriting_line.params -g 0 -l 0.0001 -x 0.1 -y 0.1 -j 0.15 -k 0.15 -p 0.75 -o 2 -a 128
```
Looking forward to your reply. Thanks a lot.
Thanks !!
@NidhiSultan
If anybody has executed it on Google Colab, please share the edited iam_dataset.py with me: mahinqureship1@gmail.com
I couldn't find the Genbits file on the HackerEarth server. How can the Genbits zip file be downloaded?