ukwa / ukwa-heritrix

The UKWA Heritrix3 custom modules and Docker builder.

Thread contention in uk.bl.wap.modules.deciderules.CompressibilityDecideRule #4

Open anjackson opened 7 years ago

anjackson commented 7 years ago

Seeing about 1/4 of all threads blocked, waiting to get hold of a single java.util.zip.Deflater instance:

Blocked/Waiting On: java.util.zip.Deflater@11476000 which is owned by ToeThread #69: http://luterano.blogspot.co.uk/2006/09/el-salvadors-holocaust-hero.html(136)
    uk.bl.wap.modules.deciderules.CompressibilityDecideRule.evaluate(CompressibilityDecideRule.java:66)
    org.archive.modules.deciderules.PredicatedDecideRule.innerDecide(PredicatedDecideRule.java:47)
    org.archive.modules.deciderules.DecideRule.decisionFor(DecideRule.java:60)
    org.archive.modules.deciderules.DecideRuleSequence.innerDecide(DecideRuleSequence.java:113)
    org.archive.modules.deciderules.DecideRule.decisionFor(DecideRule.java:60)
    org.archive.crawler.framework.Scoper.isInScope(Scoper.java:107)
    org.archive.crawler.prefetch.CandidateScoper.innerProcessResult(CandidateScoper.java:40)
    org.archive.modules.Processor.process(Processor.java:142)
    org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
    org.archive.crawler.postprocessor.CandidatesProcessor.runCandidateChain(CandidatesProcessor.java:176)
    org.archive.crawler.postprocessor.CandidatesProcessor.innerProcess(CandidatesProcessor.java:230)
    org.archive.modules.Processor.innerProcessResult(Processor.java:175)
    org.archive.modules.Processor.process(Processor.java:142)
    org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
    org.archive.crawler.framework.ToeThread.run(ToeThread.java:152)

i.e. this compressibility rule needs to be thread-safe, and should probably use a ThreadLocal Deflater instance, since the scope bean appears to be shared across threads.
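For illustration, here is a minimal sketch of the thread-local approach. It is not the code from this repository: the class name `CompressibilitySketch` and the method `ratio` are hypothetical, and the way compressibility is measured (compressed size divided by input size) is an assumption about what the rule computes.

```java
import java.util.zip.Deflater;

// Hypothetical helper, not the actual CompressibilityDecideRule.
final class CompressibilitySketch {

    // One Deflater per thread, so worker threads never queue on a shared instance.
    private static final ThreadLocal<Deflater> DEFLATER =
            ThreadLocal.withInitial(() -> new Deflater(Deflater.BEST_SPEED));

    // Returns compressed length / input length (assumed definition of compressibility).
    static double ratio(byte[] input) {
        if (input.length == 0) {
            return 1.0; // nothing to compress; treat as incompressible
        }
        Deflater deflater = DEFLATER.get();
        deflater.reset();          // clear state left over from the previous call on this thread
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        long compressed = 0;
        while (!deflater.finished()) {
            compressed += deflater.deflate(buffer); // only the byte count matters here
        }
        return (double) compressed / input.length;
    }
}
```

One trade-off to be aware of: each thread keeps its own Deflater (and its native buffers) alive for the lifetime of the thread, which is probably acceptable for a fixed pool of ToeThreads but would need an explicit `end()` call if threads were short-lived.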

anjackson commented 7 years ago

Should be fixed by ecb12223a541e8c0928e0e6c5e49a1590257b9c4, but testing is still needed.