microsoft / CognitiveServicesLanguageUtilities

Utilities for the Cognitive Service Custom Text document processing tool.
MIT License

azure function timeout -> indexer fails #180

Open mshaban-msft opened 3 years ago

mshaban-msft commented 3 years ago

scenario: after running the index command, the indexer fails in the Cognitive Search web portal, citing an Azure Function timeout (i.e. the request to the custom skillset took longer than the allowed time)

proposed solution:

  1. on the indexer side
    • increase timeout in indexer schema (to what extent?!)
    • or limit the number of documents passed in a single request
  2. on the azure function side
    • set a timeout on each request, and return a valid 'failed' response
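
A minimal sketch of the function-side option, assuming the function receives the usual custom skill payload (a `values` array of records with `recordId` and `data`) and must answer in the same shape. `process_record`, `REQUEST_BUDGET_SECONDS` and the field names inside `data` are hypothetical placeholders, not code from this repository. The idea is to give the whole request a time budget and to return a well-formed per-record error for anything that does not finish in time, instead of letting the indexer hit its own timeout.

```python
import concurrent.futures
import time

# Hypothetical budget for the whole request; keep it below the skill timeout.
REQUEST_BUDGET_SECONDS = 20.0


def process_record(data):
    """Placeholder for the real document processing (hypothetical)."""
    time.sleep(data.get("delay", 0))
    return {"text": data.get("text", "").upper()}


def run_skill(request_body, timeout=REQUEST_BUDGET_SECONDS, worker=process_record):
    """Process every record, but never spend more than `timeout` seconds overall."""
    deadline = time.monotonic() + timeout
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
    futures = [
        (rec["recordId"], pool.submit(worker, rec.get("data", {})))
        for rec in request_body["values"]
    ]
    values = []
    for record_id, future in futures:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            data = future.result(timeout=remaining)
            values.append({"recordId": record_id, "data": data,
                           "errors": [], "warnings": []})
        except concurrent.futures.TimeoutError:
            future.cancel()  # only has an effect if the work has not started yet
            values.append({"recordId": record_id, "data": {},
                           "errors": [{"message": "processing timed out, "
                                                  "please retry this document"}],
                           "warnings": []})
        except Exception as exc:  # report the failure for this record only
            values.append({"recordId": record_id, "data": {},
                           "errors": [{"message": str(exc)}], "warnings": []})
    pool.shutdown(wait=False)  # do not block the response on unfinished work
    return {"values": values}
```

Python threads cannot be killed, so work that timed out keeps running in the background; a real function would also need a way to cancel or bound that work.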

nawanas commented 3 years ago

What are the limits of the Azure Function? Is the failure related to the size/number of documents, or is it a transient issue?

If it is related to the size/number of documents, you can limit the batch size and run several batches to complete a document set. If it is a transient issue, you could retry the Azure Function call up to three times and then declare failure for that set of docs.
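
Both suggestions can be sketched together, under some assumptions: `send_batch` is a hypothetical callable that posts one batch to the function, and `TransientError` is a hypothetical exception it raises for retryable failures. Neither name comes from this repository.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(docs, send_batch, batch_size=10, max_attempts=3,
                       backoff_seconds=1.0, sleep=time.sleep):
    """Send docs in small batches; retry each batch, then declare it failed."""
    succeeded, failed = [], []
    for batch in chunked(docs, batch_size):
        for attempt in range(1, max_attempts + 1):
            try:
                send_batch(batch)
                succeeded.extend(batch)
                break
            except TransientError:
                if attempt == max_attempts:
                    failed.extend(batch)  # give up on this set of docs
                else:
                    sleep(backoff_seconds * attempt)  # wait a bit longer each time
    return succeeded, failed
```

Returning the failed documents separately lets the caller report exactly which set needs to be resubmitted.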

mshaban-msft commented 3 years ago

[updated] the error message was actually "it's a transient issue, please try again later", and it did work some time later

but as i said, we need a more robust experience: customers won't go through a debug session to find out about this. as of now, we don't know how many documents the indexer sends per request to the azure function. we need to modify both the timeout limit and the number of documents per request
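
As a rough sketch of where those two knobs live: to the best of my knowledge, a custom Web API skill definition accepts a `timeout` (an ISO 8601 duration, 30 seconds by default and capped at 230 seconds) and a `batchSize` (records per call, 1000 by default). Those limits should be checked against the current service documentation. The URI and the input/output names below are hypothetical, and this is not how the tool in this repository builds its skillset.

```python
def build_web_api_skill(uri, batch_size, timeout_seconds):
    """Return a custom Web API skill definition with explicit limits."""
    # Assumed service limits; verify them against the current documentation.
    if not 1 <= timeout_seconds <= 230:
        raise ValueError("timeout_seconds must be between 1 and 230")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return {
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "uri": uri,
        "httpMethod": "POST",
        "timeout": f"PT{timeout_seconds}S",  # ISO 8601 duration, e.g. PT120S
        "batchSize": batch_size,             # records sent per call
        "context": "/document",
        # Hypothetical input and output names, for illustration only.
        "inputs": [{"name": "text", "source": "/document/content"}],
        "outputs": [{"name": "labels", "targetName": "labels"}],
    }
```

For example, `build_web_api_skill("https://example.invalid/api/skill", 5, 120)` would send at most five records per call and wait up to two minutes for each response.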

nawanas commented 3 years ago

A transient delay can always exceed whatever timeout you pick, so raising the timeout alone is not a solution. You could use a timeout that is relative to the batch size, but first you need to measure the response time relative to the number/size of the documents (you could try a few different sizes in increments to understand how it behaves).
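
One way to run that experiment, as a sketch: time the call for a few batch sizes, keep the worst observation per size, and scale the worst per-document latency by a safety factor. `call_skill` and `make_batch` are hypothetical callables, and the linear scaling is an assumption that the measurements would need to confirm.

```python
import time


def measure_latency(call_skill, make_batch, sizes, repeats=3,
                    clock=time.perf_counter):
    """Return {batch size: worst observed latency in seconds}."""
    results = {}
    for size in sizes:
        batch = make_batch(size)
        worst = 0.0
        for _ in range(repeats):
            start = clock()
            call_skill(batch)
            worst = max(worst, clock() - start)
        results[size] = worst
    return results


def suggest_timeout(latencies, batch_size, safety_factor=2.0):
    """Scale the worst per-document latency to the target batch size."""
    per_doc = max(seconds / size for size, seconds in latencies.items())
    return per_doc * batch_size * safety_factor
```

If the per-document latency grows with batch size rather than staying flat, the linear estimate is too optimistic and a smaller batch size is the safer choice.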

If you are getting this error, the user should be able to send the batch to the service again, and the error should be expressed in that light.