Depending on the source of a URL (RSS feed or sitemap), publishers sometimes append a parameter or fragment to the URL that indicates its origin. For example, Kicker appends #rssom. If max_articles is set to a large enough value, the same article may be crawled twice, once from each source, because the response_cache treats the two URLs as distinct.
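
One possible way to avoid the duplicate crawl is to normalize URLs before they are used as cache keys. The following is a minimal sketch using only the standard library, not the library's actual implementation: `normalize_url`, `deduplicate`, and the `ORIGIN_PARAMS` set are hypothetical names introduced for illustration, and the example article path is made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameters that only mark the origin of a link.
# This list is an assumption; extend it per publisher as needed.
ORIGIN_PARAMS = frozenset({"utm_source", "utm_medium", "utm_campaign"})


def normalize_url(url: str) -> str:
    """Return the URL without its fragment and without known origin parameters."""
    parts = urlsplit(url)
    # Keep every query parameter that is not a known origin marker.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ORIGIN_PARAMS
    ]
    # Passing "" as the last element drops the fragment (e.g. "#rssom").
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


def deduplicate(urls):
    """Yield each URL whose normalized form has not been seen before."""
    seen = set()
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            yield url


if __name__ == "__main__":
    # Made-up article path, used only to illustrate the behaviour.
    sitemap_url = "https://www.kicker.de/some-article"
    rss_url = "https://www.kicker.de/some-article#rssom"
    print(list(deduplicate([sitemap_url, rss_url])))
```

With this kind of normalization, the sitemap URL and the RSS URL map to the same key, so only the first one encountered would be crawled. Whether stripping fragments is safe for every publisher is an open question, since some sites use the fragment for client-side routing.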