Increase the number of request retries when evaluating a tool

tschaffter commented 3 years ago

Is your proposal related to a problem?

Yes. The current exponential backoff strategy retries only 3 times when a request sent to a tool fails. These 3 attempts happen after waits of 2, 4, and 8 seconds, so the retries cover only 14 seconds in total. This is not enough time when a tool gets temporarily "stuck".

Most importantly, the current implementation lets the controller start sending requests to a tool without knowing whether the tool is fully initialized. A clean way to fix this would be to add a dedicated endpoint to the tools that reports their initialization status.

Meanwhile, a longer exponential backoff can serve as a workaround for this issue.

Describe the solution you'd like

Extend the exponential backoff strategy so that it covers a period of 3 minutes, which should be sufficient for most tools to get fully initialized. Our Spark NLP tools already take about 1.5 minutes to initialize.
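To make the numbers concrete, here is a rough sketch of the arithmetic. It assumes the delay doubles on every retry starting from 2 seconds, as described above. The function names `backoff_delays` and `retries_to_cover` are made up for illustration and are not part of the client.

```python
def backoff_delays(retries, base=2.0):
    """Delay in seconds before each retry: base, 2*base, 4*base, ..."""
    return [base * 2 ** i for i in range(retries)]


def retries_to_cover(window, base=2.0):
    """Smallest number of retries whose cumulative delay reaches `window` seconds."""
    retries, total = 0, 0.0
    while total < window:
        total += base * 2 ** retries
        retries += 1
    return retries


current = backoff_delays(3)
print(current, sum(current))  # [2.0, 4.0, 8.0] 14.0

needed = retries_to_cover(180)
print(needed, sum(backoff_delays(needed)))  # 7 254.0
```

Under these assumptions, 6 retries cover 126 seconds, which is enough for the 1.5 minutes our Spark NLP tools need but short of 3 minutes. A 7th retry brings the total to 254 seconds.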

Describe alternatives you've considered

Add a new endpoint to tools, see above
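On the controller side, this alternative could look roughly like the sketch below. Everything in it is hypothetical: `wait_until_ready` is an illustrative name, and `probe` stands for whatever call would query the new status endpoint, which does not exist yet and has no agreed path or response format.

```python
import time


def wait_until_ready(probe, timeout=180.0, interval=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll `probe` until it returns True or `timeout` seconds have elapsed.

    `probe` is a hypothetical zero-argument callable that returns True once
    the tool reports that it is initialized.
    """
    deadline = clock() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused or reset: the tool is not listening yet.
            pass
        if clock() + interval > deadline:
            return False
        sleep(interval)
```

The controller would call this once per tool before sending the first evaluation request, so that the retry logic only has to deal with genuine failures.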

Additional context

Is the request retried if the tool has not responded yet? If the tool takes more than 2 seconds to respond, is a new request sent, or does the controller always wait for an error response before retrying?

tschaffter commented 3 years ago

Tagging @gkowalski

thomasyu888 commented 3 years ago

@tschaffter what errors are people getting to warrant this? The errors @gkowalski is getting have nothing to do with the service not being started.

tschaffter commented 3 years ago

@thomasyu888

The seemingly random error that requires people to resubmit.
I will soon submit a tool that takes 30-60 seconds to initialize.
This change should not affect the system in any negative way. Retries only happen when a request fails.

nlpsandbox / nlpsandbox-client