Norconex / crawlers

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.
https://opensource.norconex.com/crawlers
Apache License 2.0

DeleteTagger creating crawl errors #11

Closed: kalhomoud closed this issue 11 years ago

kalhomoud commented 11 years ago

Hello,

With the following setup in the configuration file:

. . .
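(The full configuration is omitted above. For illustration only, a DeleteTagger entry in the importer section of a Norconex HTTP Collector config looks roughly like the sketch below; the field names are placeholders, and the exact class path and attribute names depend on the Importer version in use.)

    <!-- Illustrative sketch only: placeholder field names; class path may
         differ between Importer versions. -->
    <importer>
      <tagger class="com.norconex.importer.tagger.impl.DeleteTagger"
              fields="X-Parsed-By, Content-Encoding"/>
    </importer>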

I get the following errors while the URLs are being crawled:

    . . . .
    ERROR [HttpCrawler] Could not process document: http://some/url (null)
    INFO  [CrawlStatus] ERROR > (1) http://some/url
    ERROR [HttpCrawler] Could not process document: http://some/url (null)
    INFO  [CrawlStatus] ERROR > (1) http://some/url
    ERROR [HttpCrawler] Could not process document: http://some/url (null)
    INFO  [CrawlStatus] ERROR > (1) http://some/url
    ERROR [HttpCrawler] Could not process document: http://some/url (null)
    INFO  [CrawlStatus] ERROR > (1) http://some/url
    ERROR [HttpCrawler] Could not process document: http://some/url (null)
    INFO  [CrawlStatus] ERROR > (1) http://some/url
    . . . .

I can provide you with my config file if needed.

Thanks, Khalid

essiembre commented 11 years ago

Fixed in the latest snapshot release. Closing. Please re-open if the issue is still there for you.