smacademic / project-bdf

project-bdf created by GitHub Classroom

There is no timeout for transcriptions #74

Open afig opened 5 years ago

afig commented 5 years ago

Describe the bug In some cases, a transcription may take more time than we want to allow. Although a timeout would limit the capabilities of the bot, it would help prevent the bot from being locked up by a complex request.

To Reproduce Steps to reproduce the behavior:

  1. Have the bot transcribe an image with many characters, such as this one
  2. In some installations, the bot takes up to 2 minutes to transcribe the image. This makes it seem like the bot has stalled, and prevents it from processing other images.

Expected behavior The bot should stop attempting to transcribe after a certain amount of time, such as 15 seconds.

Screenshots/Logs N/A

Project information Noticed Start of M6 (stress testing). Existed since the bot was able to transcribe images

Additional context The bot will need to reply with an appropriate message when a transcription times out.

afig commented 5 years ago

About an hour and a half of research reveals that there is no easy, cross-platform way to do this. Most solutions online (like these) only work on Unix systems. Most modules that offer timeouts (such as timeout-decorator) use similar methods that also only work on Unix systems.
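For reference, the Unix-only approach usually looks something like the sketch below, which relies on `signal.SIGALRM` (not available on Windows). The names `TranscriptionTimeout` and `call_with_alarm` are made up for illustration and are not part of the bot or of any library. It only works in the main thread, and it interrupts the Python call without cleaning up any external process that call may have started.

```python
import signal
import time


class TranscriptionTimeout(Exception):
    """Hypothetical exception raised when the alarm fires."""


def _raise_timeout(signum, frame):
    # Signal handler: abort whatever the main thread is doing.
    raise TranscriptionTimeout()


def call_with_alarm(func, timeout_s, *args):
    """Call func(*args), raising TranscriptionTimeout after timeout_s whole seconds.

    Unix only: signal.SIGALRM and signal.alarm do not exist on Windows.
    """
    old_handler = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(timeout_s)  # schedule SIGALRM in timeout_s seconds
    try:
        return func(*args)
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)  # restore previous handler


if __name__ == "__main__":
    try:
        # time.sleep stands in for a slow transcription call.
        call_with_alarm(time.sleep, 1, 3)
    except TranscriptionTimeout:
        print("transcription timed out")
```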

One potential cross-platform solution involves splitting the function call off into a separate process. However, this is rather complicated and error-prone. If we really wish to go this route, some related discussion of the workaround can be found here.
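A rough sketch of one variant of the separate-process idea is below. It does not split a Python function call into its own process. Instead it assumes the work can be run as an external command (for example the `tesseract` command line) and uses the `timeout` argument of `subprocess.run`, which is available on both Unix and Windows. The helper name `run_with_timeout` is hypothetical, and this has not been tried against the bot.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd in a child process; return its stdout, or None if it times out."""
    try:
        completed = subprocess.run(
            cmd,
            capture_output=True,  # collect stdout/stderr instead of printing
            text=True,            # decode output to str
            timeout=timeout_s,    # the child is killed once this expires
            check=True,           # raise CalledProcessError on non-zero exit
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout


if __name__ == "__main__":
    # Stand-in for a slow transcription: a child Python process that sleeps.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, 0.5))

    # Assumed real-world usage (requires tesseract on PATH, image path is made up):
    # text = run_with_timeout(["tesseract", "image.png", "stdout"], 15)
```

When the helper returns `None`, the bot could reply with its timeout message instead of a transcription.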

Our best option might be to wait and see if pytesseract will implement timeouts natively. There is already an open issue on this in the pytesseract GitHub repo.

I propose moving this out of M6 and into the Icebox for now.

calebeda commented 5 years ago

I am okay with moving this into the Icebox for now.