awslabs / handwritten-text-recognition-for-apache-mxnet

This repository lets you train neural network models for end-to-end full-page handwriting recognition on the IAM Dataset, using the Apache MXNet deep learning framework.
Apache License 2.0
481 stars, 189 forks

Doubts from the presentation and looking ahead #55

Closed sambbhavgarg closed 3 years ago

sambbhavgarg commented 3 years ago

Hi @jonomon, I had a couple of questions while going through the presentation for this project on GitHub.

  1. Would word beam search (https://repositum.tuwien.at/retrieve/1835), a modified version of vanilla beam search, work better with this? Is it worth integrating WBS into this project for a small improvement?
  2. At the end, the presenter described the algorithm as the state of the art in HTR. Is it still SOTA, or has there been any significant improvement in HTR since? The presenter also mentioned a modified CTC in which probabilities can be combined across routes during inference. Could you perhaps add a little about this at the end of the README?

Thanks, Sambbhav

jonomon commented 3 years ago

Hi Sambbhav,

Thanks for your interest.

  1. We’ve experimented with beam search before, but we didn’t think it was worth the extra computational cost. If you want something more accurate, you could explore it.
  2. This is by no means SOTA anymore. There are many newer algorithms out there now. To stay up to date, you can search for papers citing our paper: https://arxiv.org/pdf/1910.00663.pdf.
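For anyone reading along, "combining probabilities across routes" can be read as CTC prefix beam search, where every alignment that collapses to the same label sequence contributes to that sequence's score. That reading is an assumption about what the presentation meant. The sketch below is a minimal pure-Python illustration and is not this repository's decoder: the function names, the toy probabilities, and the choice of index 0 as the blank label are all hypothetical. Word beam search additionally constrains the beams with a dictionary, which is not shown here.

```python
from collections import defaultdict


def ctc_greedy(probs, blank=0):
    """Best-path decoding: argmax per time step, collapse repeats, drop blanks."""
    out, prev = [], blank
    for step in probs:
        k = max(range(len(step)), key=step.__getitem__)
        if k != blank and k != prev:
            out.append(k)
        prev = k
    return tuple(out)


def ctc_prefix_beam_search(probs, beam_width=4, blank=0):
    """Prefix beam search: sums the probability of all alignments per prefix.

    probs is a list of per-time-step probability lists (rows sum to 1).
    Each beam stores (p_blank, p_non_blank): the probability of reaching the
    prefix with the last frame being blank or non-blank respectively.
    """
    beams = {(): (1.0, 0.0)}
    for step in probs:
        nxt = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            total = p_b + p_nb
            # Emitting a blank keeps the prefix unchanged.
            nxt[prefix][0] += total * step[blank]
            for k, p in enumerate(step):
                if k == blank:
                    continue
                if prefix and prefix[-1] == k:
                    # Repeating the last label without a blank collapses
                    # into the same prefix.
                    nxt[prefix][1] += p_nb * p
                    # A genuine repeat needs a blank in between.
                    nxt[prefix + (k,)][1] += p_b * p
                else:
                    nxt[prefix + (k,)][1] += total * p
        # Keep only the most probable prefixes.
        ranked = sorted(nxt.items(), key=lambda kv: kv[1][0] + kv[1][1],
                        reverse=True)[:beam_width]
        beams = {prefix: (pb, pnb) for prefix, (pb, pnb) in ranked}
    best, (pb, pnb) = max(beams.items(), key=lambda kv: kv[1][0] + kv[1][1])
    return best, pb + pnb


# Toy example (made-up numbers): two frames, labels {0: blank, 1: "a"}.
toy = [[0.6, 0.4], [0.6, 0.4]]
greedy_result = ctc_greedy(toy)
beam_result, beam_prob = ctc_prefix_beam_search(toy)
```

In this toy case the single most likely alignment is blank-blank (0.36), so greedy decoding returns an empty sequence, while the three alignments that collapse to "a" sum to 0.64 and win under prefix beam search. The extra bookkeeping per frame and per beam is the kind of computational cost mentioned above.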

Regards, Jonathan

sambbhavgarg commented 3 years ago

Thanks for your response.

Yes, it seems I do want something more accurate. I found a couple of helpful references in your paper, and one of the citations claims to be the current SOTA as well. Thanks for pointing me in the right direction, Jonathan.

Regards, Sambbhav