Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or a filesystem and storing it in various data repositories such as search engines.
With the following setup in the configuration file:
.
.
.
I get the following errors while the URLs are being crawled:
.
.
.
.
ERROR [HttpCrawler] Could not process document: http://some/url (null)
INFO [CrawlStatus] ERROR > (1) http://some/url
ERROR [HttpCrawler] Could not process document: http://some/url (null)
INFO [CrawlStatus] ERROR > (1) http://some/url
ERROR [HttpCrawler] Could not process document: http://some/url (null)
INFO [CrawlStatus] ERROR > (1) http://some/url
ERROR [HttpCrawler] Could not process document: http://some/url (null)
INFO [CrawlStatus] ERROR > (1) http://some/url
ERROR [HttpCrawler] Could not process document: http://some/url (null)
INFO [CrawlStatus] ERROR > (1) http://some/url
.
.
.
.
I can provide my config file if needed.
Thanks, Khalid