Closed anjackson closed 6 years ago
There is already a clamdTimeout setting to control this, but it defaults to 0 (no timeout). Therefore this is a crawler beans configuration issue.
Running this Groovy script in an H3 console works, but the change still needs deploying to the configuration files:
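// Print the current timeout (0 means no timeout)
rawOut.println(appCtx.getBean("viralContent").getClamdTimeout())
// Set the timeout to 60*1000 (60 seconds, if the value is in milliseconds)
appCtx.getBean("viralContent").setClamdTimeout(60*1000)
// Print it again to confirm the change took effect
rawOut.println(appCtx.getBean("viralContent").getClamdTimeout())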
Okay, config updated.
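For reference, a minimal sketch of what the persistent form of that change might look like in the crawler beans XML. The bean id and the clamdTimeout property come from the script above; the class attribute is an assumption and should be whatever the existing viralContent bean already declares. The value presumably uses the same units as the console call (60*1000, which suggests milliseconds).

```xml
<!-- Sketch only: the class name below is assumed, not taken from this issue.
     Keep the class and any other properties from the existing bean definition. -->
<bean id="viralContent" class="uk.bl.wap.crawler.processor.ViralContentProcessor">
  <!-- Same value as the console script (60*1000). The default of 0 means no timeout. -->
  <property name="clamdTimeout" value="60000" />
</bean>
```

Unlike the console script, which only changes the running job, a setting in the configuration file should survive a restart.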
We've seen an issue in production with hanging socket connections interfering with crawl ops.
We should check that the ViralContentProcessor will time out after some reasonable time (a few mins).