githubharald / CTCWordBeamSearch

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.
https://towardsdatascience.com/b051d28f3d2e
MIT License
557 stars 160 forks source link

Implementing Word beam search correctly #73

Closed HaniaGhouse0407 closed 9 months ago

HaniaGhouse0407 commented 10 months ago

Hello @githubharald , I am currently working on a project focused on handwritten prescription recognition using a CRNN (Convolutional Recurrent Neural Network) model. My objective is to implement a Word Beam Search decoder for more accurate inference.

I have encountered issues with eager tensors and attempted alternative implementations without achieving the desired results. As a beginner, I am seeking guidance on refining my code and improving the overall accuracy of the model. The project aims to achieve high accuracy in recognizing handwritten doctor's prescriptions.

I would appreciate it if you could review my code, which is accessible here: https://colab.research.google.com/drive/1Ft0KVJJhulQWbRv26F2qTQfjQYFp-1LY#scrollTo=-DokrqTe5uaU , and provide insights into resolving the eager tensor issues. Additionally, any suggestions, resources, or methods to enhance the accuracy of the model would be really appreciated.

Thank you for your time

githubharald commented 10 months ago

Hi, the word beam search decoder is implemented in C++ and expects numpy arrays to get numpy arrays as input. Whatever deep learning framework you use - you should always be able to convert tensors to numpy arrays.