xdebbie / forkkit

Web crawler to mine album review scores and metadata from pitchfork.com
MIT License
5 stars, 3 forks

Avoid duplicates when scraping to add new reviews #2

Closed: xdebbie closed this issue 4 years ago

xdebbie commented 4 years ago

Fix the script so that it checks each newly parsed review against all existing entries before saving it. If the review is a duplicate, skip it and do not register it as a new db entry.

xdebbie commented 4 years ago

SQL code successfully added to the Python scraper to eliminate duplicates
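
For readers wondering what such a fix can look like, here is a minimal sketch. It is not the code that was added to the repository: it assumes an SQLite database accessed through Python's standard `sqlite3` module, and the `reviews` table, its columns, and the `init_db` / `save_review` helpers are hypothetical names chosen for illustration. The idea is to let the database enforce uniqueness on the review URL, so that re-scraping an existing review is silently skipped.

```python
import sqlite3


def init_db(conn):
    """Create the (hypothetical) reviews table if it does not exist yet."""
    # Assumption: the review URL identifies a review uniquely, so it is
    # used as the primary key and the database rejects repeated URLs.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS reviews (
            url    TEXT PRIMARY KEY,
            artist TEXT,
            album  TEXT,
            score  REAL
        )
        """
    )


def save_review(conn, review):
    """Insert a parsed review; return True if it was new, False if skipped."""
    # INSERT OR IGNORE drops the row when the primary key already exists,
    # so a duplicate never becomes a new db entry.
    cur = conn.execute(
        "INSERT OR IGNORE INTO reviews (url, artist, album, score) "
        "VALUES (:url, :artist, :album, :score)",
        review,
    )
    conn.commit()
    # rowcount is 1 when a row was inserted and 0 when it was ignored.
    return cur.rowcount == 1
```

With this sketch, calling `save_review` twice with the same `url` stores the review once: the first call returns `True` and the second returns `False`. Putting the uniqueness rule in the schema avoids loading every existing entry into Python just to compare it with the newly parsed data.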