-
-
Following [this report](https://groups.yahoo.com/neo/groups/archive-crawler/conversations/topics/8952;_ylc=X3oDMTM0b2wwNG0wBF9TAzk3MzU5NzE0BGdycElkAzg3NTk4NjcEZ3Jwc3BJZAMxNzA1MDA0OTI0BG1zZ0lkAzg5NTQEc…
-
I've been trying to crawl an HTTPS site through a Squid proxy and keep seeing errors like these:
```
java.io.IOException: RIS already open for ToeThread #12: https://XXX/robots.txt
at org.archive.io.Rec…
```
-
Presumably due to some misconfiguration of Heritrix at the Royal Danish Library, we have a non-trivial number of truncated records in some of our WARCs. Some of these are silently truncated, i.e. no ex…
-
These are some notes I made on issues raised during the workshop at the IIPC conference:
- The title field is too short - 50 chars is not enough.
- "URLs prefix" scope is wrong/misleading, because th…
-
By Laura Waldoch, CUL:
We’ve noticed a data error under the Theme “Cambridge Network”, in the following record:
Accelrys
Archived date: 2012-11-21 http://www.accelrys.com/
Clickin…
-
-
This should help with eventually getting the application code signed and available at the official Mac outlet.
https://developer.apple.com/library/mac/technotes/tn2206/_index.html#//apple_ref/doc/uid…
-
-