githubharald / CTCWordBeamSearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.
https://towardsdatascience.com/b051d28f3d2e
MIT License
557 stars 160 forks

integrating (CTCWordBeamSearch)PureNumpy with (SimpleHTR --wordbeamsearch) #52

Closed jjsr closed 3 years ago

jjsr commented 3 years ago

Dear Sir, thank you for making it compatible with Windows. I would appreciate your advice on using it with the --wordbeamsearch option of the SimpleHTR code. 1) I followed the steps in the README successfully, up to getting the same output as testPybind.py. (Since Windows does not support custom TF operations according to the README, I followed the PureNumpy instructions.)

Code snippet from SimpleHTR --> Model.py:

##################################
word_beam_search_module = tf.load_op_library('TFWordBeamSearch.so')
chars = str().join(self.charList)
wordChars = open('../model/wordCharList.txt').read().splitlines()[0]
corpus = open('../data/corpus.txt').read()

# decode using the "Words" mode of word beam search
self.decoder = word_beam_search_module.word_beam_search(tf.nn.softmax(self.ctcIn3dTBC, dim=2), 50, 'Words', 0.0, corpus.encode('utf8'), chars.encode('utf8'), wordChars.encode('utf8'))
#################################

What should I replace it with to integrate CTCWordBeamSearch with SimpleHTR?

Note: Initially I tried to replace it with WordBeamSearch(), but I was unsure what to feed it. On my second try I converted tf.nn.softmax(self.ctcIn3dTBC, dim=2) into np.array(tf.nn.softmax(self.ctcIn3dTBC, dim=2)), which results in an error. Calling testPybind() with the above parameters also results in an error. Please advise which direction to go next.

githubharald commented 3 years ago

You have to evaluate the output of the RNN with softmax applied, and then feed this into the decoder. Something like this (not tested):
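A minimal sketch of that flow, not taken from either repository. It assumes the RNN output has already been evaluated into a NumPy array of shape T x B x C (time steps, batch elements, characters including the CTC blank), and it applies the softmax in NumPy instead of in the TF graph. The `decoder` argument is assumed to be any object with a `compute(mat)` method, which is how the README of the Python package describes `WordBeamSearch`; the names `softmax_tbc`, `decode_with` and `ArgmaxStub` are hypothetical, and the stub only exists so the sketch runs without the package installed.

```python
import numpy as np


def softmax_tbc(logits):
    """Softmax over the character axis (axis 2) of a T x B x C array."""
    shifted = logits - logits.max(axis=2, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=2, keepdims=True)


def decode_with(decoder, rnn_out):
    """Apply softmax to the evaluated RNN output, then hand it to the decoder.

    Assumption: `decoder` exposes compute(mat) and returns one list of
    labels per batch element.
    """
    mat = softmax_tbc(np.asarray(rnn_out, dtype=np.float64))
    return decoder.compute(mat)


class ArgmaxStub:
    """Hypothetical stand-in decoder, NOT word beam search.

    It returns the most likely label per time step for each batch element,
    just to show the data flow.
    """

    def compute(self, mat):
        return [mat[:, b, :].argmax(axis=1).tolist() for b in range(mat.shape[1])]


# Stand-in for the evaluated RNN output: T=4 time steps, B=2 batch elements, C=3 classes.
rnn_out = np.random.default_rng(0).normal(size=(4, 2, 3))
labels = decode_with(ArgmaxStub(), rnn_out)
```

In a real setup the stub would be replaced by the decoder instance created once at model construction, and `rnn_out` by the array obtained from evaluating the network on a batch.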

jjsr commented 3 years ago

Step 1: Initialize the above in init of Model.py:

chars = str().join(self.charList)
wordChars = open('../model/wordCharList.txt').read().splitlines()[0]
corpus = open('../data/corpus.txt').read()
self.wbs = WordBeamSearch(25, 'Words', 0.0, corpus.encode('utf8'), chars.encode('utf8'), wordChars.encode('utf8'))

Step 2: In Model.py --> setupCTC(), defined point 1 from above:

self.rnn_out_sm = tf.nn.softmax(self.ctcIn3dTBC, axis=2)

Side note: I am running the code as (python main.py --wordbeamsearch), using a pre-saved model and testing on a single image. Kindly let me know what I am doing wrong, if possible. I know I am messing things up, but a word of help would still be highly appreciated. After making the changes in Model.py, simply running the code gives the error "'int' object is not iterable" on line 196.

I have uploaded the modified Model.py to my GitHub for the time being and will remove it afterwards. I have also credited you in the upload.

jjsr commented 3 years ago

Thanks again for your earlier response, sir.

githubharald commented 3 years ago

Hi,

At first sight the code looks OK. I guess this line is wrong and can be removed (you want to pass all batch elements to the function decoderOutputToText, not just the one with index 0): https://github.com/jjsr/SimpleHtr/blob/main/Model.py#L273

If this does not fix the problem, can you upload all of your code into your repository, so that I just have to download it and run it to reproduce the error?
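One way such an error can arise, sketched under assumptions: a function that expects one list of labels per batch element will fail with "'int' object is not iterable" if it is handed only the element with index 0, because it then iterates over single integers. The function `labels_to_text` and the sample data below are hypothetical and are not the actual decoderOutputToText from SimpleHTR.

```python
def labels_to_text(batch_labels, char_list):
    """Map one label list per batch element to one string per batch element."""
    return ["".join(char_list[label] for label in labels) for labels in batch_labels]


char_list = list("abc ")            # hypothetical character list
batch_labels = [[0, 1], [2, 3, 0]]  # decoder output: one label list per batch element

# Passing the whole batch works: one string per batch element.
texts = labels_to_text(batch_labels, char_list)

# Passing only element 0 hands the function a flat list of ints, so the
# inner loop tries to iterate over an int and raises TypeError.
try:
    labels_to_text(batch_labels[0], char_list)
    error = None
except TypeError as exc:
    error = str(exc)
```

Here `texts` holds one decoded string per batch element, while `error` holds the message of the TypeError raised by the second call.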

jjsr commented 3 years ago

Sir, after your input I was able to somehow run it in "Words" mode. I am not confident that the patches I made along the way are right, or that it really works. Besides that, could you please suggest an approach for line segmentation (breaking a page into lines)? I am looking for the approach you think would work best on IAM (not a very complex scenario), preferably with a link to a code implementation. Additionally, I will upload the code of the patches I tried to work out. Kindly go through it once it is uploaded. Thanks again for your timely and valuable feedback.

githubharald commented 3 years ago

you can try WordDetectorNN to detect handwritten words on a scanned page.

githubharald commented 3 years ago

FYI: SimpleHTR is now using the Python package (numpy operation) instead of the TF operation.

jjsr commented 3 years ago

Thank you so much, sir. I was in dire need of exactly this. Thanks again. Regards, Vaibhav

On Fri, Feb 19, 2021, 01:59 Harald Scheidl notifications@github.com wrote:

FYI: SimpleHTR is now using the Python package (numpy operation) instead of the TF operation.
