Norconex / collector-core

Collector-related code shared between different collector implementations
http://www.norconex.com/collectors/collector-core/
Apache License 2.0

Remember Bad Status and Error to use Grace Once #31

Closed kemcon closed 3 years ago

kemcon commented 3 years ago

Without this change, GRACE_ONCE does not work, especially for errors. Example: website A has a link to B. The website crashes and the SpoiledReferenceStrategy is GRACE_ONCE.

- 200 OK crawl: A and B are committed.
- First error crawl: the cache is filled with A and B. A has an error and B has an error.
- Second error crawl: without the change, the cache is empty, but:

With this fix, line 694 of AbstractCrawler is no longer necessary, but it is also not wrong.
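
To illustrate why the bad status has to be remembered between crawls, here is a minimal, self-contained sketch of a grace-once decision. It is not the collector-core implementation: the class `GraceOnceSketch`, the enums `State` and `Action`, and the `rememberBad` flag are all hypothetical names invented for this example, and the sketch assumes that a reference is only deleted when the previous crawl also recorded it as bad.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in collector-core.
class GraceOnceSketch {

    /** Simplified state of a reference after one crawl (assumption). */
    public enum State { GOOD, BAD }

    /** What a grace-once policy does with a reference (assumption). */
    public enum Action { KEEP, GRACE, DELETE }

    /** Delete a bad reference only if the previous crawl also saw it as bad. */
    public static Action decide(String ref, State current, Map<String, State> previousCache) {
        if (current == State.GOOD) {
            return Action.KEEP;
        }
        State previous = previousCache.get(ref);
        return previous == State.BAD ? Action.DELETE : Action.GRACE;
    }

    /** Build the cache handed to the next crawl; bad states are stored only if rememberBad is true. */
    public static Map<String, State> nextCache(String ref, State state, boolean rememberBad) {
        Map<String, State> cache = new HashMap<>();
        if (state == State.GOOD || rememberBad) {
            cache.put(ref, state);
        }
        return cache;
    }

    /** Runs: 200 OK crawl, first error crawl, second error crawl. Returns the last action for "A". */
    public static Action secondErrorCrawlAction(boolean rememberBad) {
        // 200 OK crawl: A is committed and cached as GOOD.
        Map<String, State> cache = nextCache("A", State.GOOD, rememberBad);
        // First error crawl: the cache still holds A, so A gets its one grace.
        decide("A", State.BAD, cache);
        cache = nextCache("A", State.BAD, rememberBad);
        // Second error crawl: the outcome depends on whether the bad state was remembered.
        return decide("A", State.BAD, cache);
    }

    public static void main(String[] args) {
        System.out.println(secondErrorCrawlAction(true));  // prints DELETE
        System.out.println(secondErrorCrawlAction(false)); // prints GRACE
    }
}
```

In this sketch, when the bad state is not stored, the second error crawl finds an empty cache and grants grace again, so the grace is never used up. Storing the bad state lets the second failure be recognised as a repeat.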