openaire / iis

Information Inference Service of the OpenAIRE system
Apache License 2.0

Consider increasing or disabling connection request timeout for http clients #1233

Open przemyslawjacewicz opened 3 years ago

przemyslawjacewicz commented 3 years ago

Running the patent metadata retriever and cached webcrawler jobs with an empty cache resulted in many faults caused by org.apache.http.conn.ConnectionPoolTimeoutException. This exception is thrown when the wait for a connection from the connection pool exceeds the connection request timeout. In HttpClientUtils we create a CloseableHttpClient and use the same value for the connect timeout and the connection request timeout, even though the two measure different things: the first limits how long establishing a connection to the remote host may take, the second limits how long a request may wait for a free connection in the pool. We should check whether increasing or disabling the connection request timeout brings the number of faults down.

This should be checked separately for patent metadata retrieval and for HTTP content retrieval.
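
To illustrate why the connection request timeout matters on its own, here is a minimal, standard-library-only sketch of a bounded pool. It is a toy model, not the code in HttpClientUtils and not Apache HttpClient itself: the class LeasePoolSketch, its methods, and LeaseTimeoutException are all hypothetical names. It assumes the HttpClient 4.x convention that a connection request timeout of 0 means waiting indefinitely; there the value is set through RequestConfig.Builder.setConnectionRequestTimeout, separately from setConnectTimeout.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a bounded connection pool (hypothetical, for illustration only). */
class LeasePoolSketch {

    /** Hypothetical stand-in for ConnectionPoolTimeoutException. */
    static class LeaseTimeoutException extends RuntimeException {
        LeaseTimeoutException(String message) {
            super(message);
        }
    }

    // Each permit represents one pooled connection; 'true' makes waiting fair (FIFO).
    private final Semaphore slots;

    LeasePoolSketch(int maxConnections) {
        this.slots = new Semaphore(maxConnections, true);
    }

    /**
     * Leases one connection slot.
     * A timeout of 0 waits indefinitely (assumed to mirror the HttpClient 4.x convention);
     * a positive timeout bounds the wait and fails with LeaseTimeoutException.
     */
    void lease(long leaseTimeoutMillis) {
        if (leaseTimeoutMillis < 0) {
            throw new IllegalArgumentException("timeout must be >= 0");
        }
        try {
            if (leaseTimeoutMillis == 0) {
                slots.acquire(); // "disabled" timeout: block until a slot is free
            } else if (!slots.tryAcquire(leaseTimeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new LeaseTimeoutException("Timeout waiting for connection from pool");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a connection", e);
        }
    }

    /** Returns a previously leased slot to the pool. */
    void release() {
        slots.release();
    }

    /** Number of slots that are currently free. */
    int available() {
        return slots.availablePermits();
    }
}
```

In this model, a pool of size 1 with one slot already leased makes a second lease(50) fail after about 50 ms no matter how quickly the remote host would have answered. That is the kind of fault we see when all pooled connections are busy fetching uncached content, so a longer or disabled lease timeout trades faults for longer waits.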

przemyslawjacewicz commented 3 years ago

The results of running the patent metadata retriever job are described in #1136.