sntran / scrapex

An open source, collaborative Elixir framework for extracting the data you need from websites in a fast, simple, yet extensible way. Or just an experiment in writing a scraper in Elixir to scratch my own itch. Use at your own risk.
https://sntran.github.io/scrapex/doc/
MIT License

Avoid re-scraping the same URL in one run #8

Open sntran opened 8 years ago

sntran commented 8 years ago

A single run can encounter the same URL more than once (for example, a product page can belong to multiple categories). The scraper needs to skip any URL it has already scraped in that run.
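
One possible approach, as a minimal sketch rather than anything Scrapex currently does: keep a `MapSet` of URLs seen during the run and consult it before each request. The `SeenUrls` module and its `new/0`, `check/2`, and `filter/2` functions are hypothetical names for illustration. The sketch also assumes that URLs differing only in their fragment (`#reviews`) point to the same page; that may not hold for every site.

```elixir
defmodule SeenUrls do
  # Hypothetical helper for illustration; not part of Scrapex.

  @doc "Returns an empty set of seen URLs for a new run."
  def new, do: MapSet.new()

  @doc """
  Checks one URL against the set. Returns `{:scrape, updated_set}` the first
  time a URL is seen and `{:skip, set}` on every later occurrence.
  """
  def check(seen, url) do
    key = normalize(url)

    if MapSet.member?(seen, key) do
      {:skip, seen}
    else
      {:scrape, MapSet.put(seen, key)}
    end
  end

  @doc """
  Filters a list of URLs, keeping only those not seen before.
  Returns `{fresh_urls, updated_set}` with `fresh_urls` in original order.
  """
  def filter(seen, urls) do
    {fresh, seen} =
      Enum.reduce(urls, {[], seen}, fn url, {fresh, seen} ->
        case check(seen, url) do
          {:scrape, seen} -> {[url | fresh], seen}
          {:skip, seen} -> {fresh, seen}
        end
      end)

    {Enum.reverse(fresh), seen}
  end

  # Assumption: the fragment does not change which page is fetched,
  # so it is dropped before comparing URLs.
  defp normalize(url) do
    uri = URI.parse(url)
    URI.to_string(%{uri | fragment: nil})
  end
end

# Usage: a product page reached from two categories is scraped only once.
urls = [
  "http://example.com/p/1",
  "http://example.com/p/2",
  "http://example.com/p/1#reviews"
]

{fresh, _seen} = SeenUrls.filter(SeenUrls.new(), urls)
IO.inspect(fresh)
```

Because the set is an immutable value threaded through the calls, this only works as written when one process drives the run. If pages are fetched concurrently, the set would likely need to live in a single owner such as an `Agent` or an ETS table.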