Nov 28, 2022 9:48:29 AM org.archive.modules.CrawlURI getPolitenessDelay
WARNING: politessDelay unset, returning default 5000 for https://www.english.op.org/robots.txt (in thread 'ToeThread #47: https://www.english.op.org/robots.txt')
Nov 28, 2022 9:48:35 AM org.archive.crawler.framework.ToeThread recoverableProblem
SEVERE: Problem java.lang.IllegalArgumentException: Comparison method violates its general contract! occurred when trying to process 'https://www.english.op.org/robots.txt' at step ABOUT_TO_BEGIN_PROCESSOR in
(in thread 'ToeThread #498: https://www.english.op.org/robots.txt')
java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.util.TimSort.mergeHi(TimSort.java:899)
at java.util.TimSort.mergeAt(TimSort.java:516)
at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
at java.util.TimSort.sort(TimSort.java:254)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1464)
at java.util.Collections.sort(Collections.java:177)
at org.apache.http.impl.cookie.RFC6265CookieSpec.formatCookies(RFC6265CookieSpec.java:217)
at org.apache.http.client.protocol.RequestAddCookies.process(RequestAddCookies.java:187)
at org.apache.http.protocol.ImmutableHttpProcessor.process(ImmutableHttpProcessor.java:133)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.archive.modules.fetcher.FetchHTTPRequest.execute(FetchHTTPRequest.java:823)
at org.archive.modules.fetcher.FetchHTTP.innerProcess(FetchHTTP.java:679)
at org.archive.modules.Processor.innerProcessResult(Processor.java:175)
at org.archive.modules.Processor.process(Processor.java:142)
at org.archive.modules.ProcessorChain.process(ProcessorChain.java:131)
at org.archive.crawler.framework.ToeThread.run(ToeThread.java:147)
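
The trace above shows `TimSort` rejecting the comparator that `Collections.sort` was given inside `RFC6265CookieSpec.formatCookies`. The log does not say why that comparator gave inconsistent answers, so the following is only a generic sketch of what "violates its general contract" means, not a diagnosis of this crawl. The class `ComparatorContractCheck`, the helper `obeysContract`, and both comparators are hypothetical names invented for illustration. The `Long` values stand in for any sort key that may be missing; whether a missing key is involved here is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ComparatorContractCheck {

    /** Returns true if cmp is antisymmetric and transitive over all items given. */
    static <T> boolean obeysContract(Comparator<T> cmp, List<T> items) {
        for (T a : items) {
            for (T b : items) {
                // sgn(compare(a, b)) must equal -sgn(compare(b, a))
                if (Integer.signum(cmp.compare(a, b)) != -Integer.signum(cmp.compare(b, a))) {
                    return false;
                }
                for (T c : items) {
                    // a > b and b > c must imply a > c
                    if (cmp.compare(a, b) > 0 && cmp.compare(b, c) > 0 && cmp.compare(a, c) <= 0) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    // Hypothetical broken comparator: a null key is always reported as "greater",
    // so compare(null, null) returns 1 in both directions.
    static final Comparator<Long> BROKEN = (a, b) -> {
        if (a == null) return 1;
        if (b == null) return -1;
        return Long.compare(a, b);
    };

    // A consistent alternative: nulls sort last, and two nulls compare as equal.
    static final Comparator<Long> FIXED = Comparator.nullsLast(Comparator.naturalOrder());

    public static void main(String[] args) {
        List<Long> keys = Arrays.asList(3L, null, 1L, null, 2L);
        System.out.println("broken obeys contract: " + obeysContract(BROKEN, keys));
        System.out.println("fixed obeys contract:  " + obeysContract(FIXED, keys));
    }
}
```

Running `main` reports `false` for the broken comparator and `true` for the fixed one. A check like this is deterministic, whereas `TimSort` only throws when it happens to observe the inconsistency during a merge, which is why the error can appear intermittently.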
...the content (as seen in my web browser) appears to be:
From DC
...the content (as seen in my web browser) appears to be: