codelibs / elasticsearch-river-web

Web Crawler for Elasticsearch
Apache License 2.0

Ignoring already stored URLs #102

Open Choumy opened 9 years ago

Choumy commented 9 years ago

Rather than inserting and then deleting, this just ignores a URL if it is already indexed. Any comments are welcome.

marevol commented 9 years ago

Crawled data is not unique by URL.
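
To make the two points above concrete, here is a minimal sketch of a "skip if already indexed" check that does not assume the URL is a unique key. It uses a hypothetical in-memory stand-in for the index: `InMemoryIndex`, `count_by_url`, and `store_if_new` are illustrative names, not part of elasticsearch-river-web or the Elasticsearch client API.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryIndex:
    """Hypothetical stand-in for the search index (assumption, not the real API)."""
    docs: list = field(default_factory=list)

    def count_by_url(self, url: str) -> int:
        # Several stored documents may share one URL, so count matches
        # instead of treating the URL as a unique document ID.
        return sum(1 for doc in self.docs if doc["url"] == url)

    def add(self, doc: dict) -> None:
        self.docs.append(doc)


def store_if_new(index: InMemoryIndex, doc: dict) -> bool:
    """Store doc unless at least one document with the same URL exists."""
    if index.count_by_url(doc["url"]) > 0:
        return False  # already indexed: skip, with no insert-then-delete
    index.add(doc)
    return True
```

With a real index the count would presumably come from a query on the URL field rather than a lookup by document ID, since the URL alone does not identify a single document.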