DistrictDataLabs / baleen

An automated ingestion service for blogs to construct a corpus for NLP research.
MIT License
86 stars, 38 forks

NotUniqueError caused by downloading non-changed feed content #52 #66

Closed: olgert closed this 8 years ago

olgert commented 8 years ago

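The idea is to store a hash of the feed XML and compare the hash of each newly downloaded feed against the previously stored one, so that unchanged feed content is not ingested again.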
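
A minimal sketch of that idea follows. It is not the code from this pull request. The names `feed_signature`, `has_changed`, and the `seen` dictionary are hypothetical, and the in-memory dictionary stands in for whatever persistent storage the service actually uses.

```python
import hashlib


def feed_signature(xml):
    """Return the SHA-256 hex digest of the raw feed XML (hypothetical helper)."""
    if isinstance(xml, str):
        xml = xml.encode("utf-8")
    return hashlib.sha256(xml).hexdigest()


def has_changed(feed_url, xml, seen):
    """Return True if the feed differs from the last fetch, recording the new digest.

    `seen` maps feed URL -> last stored digest; a plain dict is assumed here
    purely for illustration.
    """
    digest = feed_signature(xml)
    if seen.get(feed_url) == digest:
        return False  # same bytes as last time, so skip ingestion
    seen[feed_url] = digest
    return True
```

With this shape, the caller would only parse and save posts when `has_changed` returns True. One caveat of hashing the raw XML: a feed that rewrites a timestamp on every request will produce a new hash each time even if its posts are unchanged, so a duplicate check on individual posts may still be needed.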

coveralls commented 8 years ago

Coverage decreased (-4.4%) to 60.042% when pulling 41e31b2b890c3a357f6a9bee878fac0f3b714377 on olgert:develop into 6d3649569e691e072c20f427ec80457cbb7bfbe2 on bbengfort:develop.

coveralls commented 8 years ago

Coverage decreased (-4.4%) to 60.042% when pulling a259e35bfad6669c7b274bd3f5bbbef6e7c75b13 on olgert:develop into 6d3649569e691e072c20f427ec80457cbb7bfbe2 on bbengfort:develop.