Closed jjsr closed 3 years ago
The model from the thesis was trained on line images. But it should also work with word images, so I guess there is something wrong in the implementation.
Thanks for writing back. I am always amazed by the helpful guidance you provide, sir. I will try my best to figure out the issue with my model. May I ask your permission to keep this issue open for a couple of days? Thanks again.
Hi, sorry for getting back to this thread so late. I seriously wanted to solve the issue by myself first, which is why I was a little hesitant to ask. Sorry for troubling you, sir.
Main objective: train the model at line level.
Steps followed:
1) An image size of 128*32 is not sufficient, because the model can then predict at most 32 characters. Following your suggestion and the table, I converted the code of setupCNN into an iterative version and got successfully as far as the (100,8,512) step in the last row of the table. This output now has to be fed into the LSTM (I don't know how to apply MDLSTM), so I must reduce (100,8,512) to (100,1,512). I am stuck at this step and have tried the following approaches:
Approach 1) Looking at the looped code, (128,32) has height 32 = 2^5, hence 5 CNN layers; so for (800,64), height 64 = 2^6 gives 6 layers, each consisting of (conv2d, convolutional_normal, relu, pool). I ran this model but did not get any word accuracy, even after a day of training.
Approach 2) I followed the table up to the (100,8,512) step, then added a pair of conv and pool layers to reduce it to (100,1,512), but my model still does not produce anything. I am stuck at this point, going from (100,8,512) to (100,1,512). May I send the code to your email instead of posting it here? Kindly let me know the direction. Thanks in advance.
Note: in createLMDB I had to increase the size from 2 GB to 4 GB, and I modified the data loader code a little to work on line images.
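The shape bookkeeping described above can be checked without running the network. The following is only a minimal sketch of the arithmetic: it assumes non-overlapping pooling where the stride equals the pool size, and the pooling schedules in it are illustrative assumptions, not the table from the thesis. The helper name `feature_map_shape` is hypothetical.

```python
def feature_map_shape(width, height, pools):
    """Return (width, height) after applying each (pool_w, pool_h) in order.

    Assumes non-overlapping pooling (stride == pool size) and 'same'
    convolutions, so only the pooling layers change the spatial size.
    """
    for pool_w, pool_h in pools:
        if width % pool_w or height % pool_h:
            raise ValueError(f"({width},{height}) not divisible by ({pool_w},{pool_h})")
        width //= pool_w
        height //= pool_h
    return width, height


# Assumed schedule for 128x32 word images: the width shrinks by 4, the height by 32.
WORD_POOLS = [(2, 2), (2, 2), (1, 2), (1, 2), (1, 2)]

# Hypothetical schedule for 800x64 line images that reaches a 100x8 feature map.
LINE_POOLS_TO_100x8 = [(2, 2), (2, 2), (2, 2)]

# Hypothetical extra layers that pool only along the height: 8 -> 4 -> 2 -> 1.
COLLAPSE_HEIGHT = [(1, 2), (1, 2), (1, 2)]

print(feature_map_shape(128, 32, WORD_POOLS))                             # (32, 1)
print(feature_map_shape(800, 64, LINE_POOLS_TO_100x8))                    # (100, 8)
print(feature_map_shape(800, 64, LINE_POOLS_TO_100x8 + COLLAPSE_HEIGHT))  # (100, 1)
```

The point of the sketch is that the remaining layers should pool along the height only: the width stays at 100 time steps for the LSTM, while the height goes from 8 to 1. Pooling with (2,2) in those extra layers would also shrink the number of time steps, which limits the text length that can be decoded.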
Thank you so much, sir.
Hello sir, I have implemented the line-level code following your guidance. Could you please let me know how you computed the CER and word accuracy at line level in your thesis? A pseudo-algorithm would be a great help.
```python
import editdistance


def validate(model, loader):
    # function header added so that the final `return` is valid
    numCharErr = 0
    numCharTotal = 0
    numWordOK = 0
    numWordTotal = 0
    while loader.hasNext():
        iterInfo = loader.getIteratorInfo()
        print(f'Batch: {iterInfo[0]} / {iterInfo[1]}')
        batch = loader.getNext()
        (recognized, _) = model.inferBatch(batch)

        print('Ground truth -> Recognized')
        for i in range(len(recognized)):
            # a sample counts as OK only if the whole text matches exactly
            numWordOK += 1 if batch.gtTexts[i] == recognized[i] else 0
            numWordTotal += 1
            # character errors are accumulated via the edit distance
            dist = editdistance.eval(recognized[i], batch.gtTexts[i])
            numCharErr += dist
            numCharTotal += len(batch.gtTexts[i])
            print('[OK]' if dist == 0 else '[ERR:%d]' % dist, '"' + batch.gtTexts[i] + '"', '->',
                  '"' + recognized[i] + '"')

    # print validation result
    charErrorRate = numCharErr / numCharTotal
    wordAccuracy = numWordOK / numWordTotal
    print(f'Character error rate: {charErrorRate * 100.0}%. Word accuracy: {wordAccuracy * 100.0}%.')
    return charErrorRate, wordAccuracy
```
What needs to be changed in this part? Thanks in advance. I have tried, but I am getting an index out of range error. If that is fine, I can post how I am approaching the CER and word accuracy.
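One common way to score line-level output is sketched below. It is an assumption about a reasonable metric, not a description of how the thesis computed its numbers. CER is the character edit distance divided by the number of ground-truth characters. Word accuracy is taken here as 1 minus the word error rate, where the word error rate is the edit distance over whitespace-separated words. The helpers `levenshtein` and `line_metrics` are hypothetical names, and a pure-Python edit distance is used so the example has no dependencies.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]


def line_metrics(gt_lines, rec_lines):
    """Return (character error rate, word accuracy) over paired text lines.

    Assumption: word accuracy = 1 - word error rate, with words obtained by
    splitting each line on whitespace.
    """
    char_err = char_total = word_err = word_total = 0
    # zip pairs the lines and stops at the shorter list, so no index can run past the end
    for gt, rec in zip(gt_lines, rec_lines):
        char_err += levenshtein(rec, gt)
        char_total += len(gt)
        gt_words, rec_words = gt.split(), rec.split()
        word_err += levenshtein(rec_words, gt_words)
        word_total += len(gt_words)
    if char_total == 0 or word_total == 0:
        raise ValueError("ground truth is empty")
    return char_err / char_total, 1.0 - word_err / word_total


cer, word_acc = line_metrics(["the quick fox"], ["the quikc fox"])
print(f"CER: {cer:.3f}, word accuracy: {word_acc:.3f}")
```

With this definition a line with one wrong word still earns credit for its correct words, whereas the exact-match counter in the loop above would count the whole line as wrong.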
closing because this is not an issue with the SimpleHTR repo (sorry, no time to provide personal guidance with code).
Hello sir, thanks for sharing the repo with us. Could you please let me know whether you trained on word images or line images (of IAM) for the model with input size 800*64 (used in your thesis, see the FAQ article)? This would be a great help. I tried to train on word images, but even after 24 hours the word accuracy is 1%, so I thought I should try training on line images instead. Kindly guide.