alan-turing-institute / misinformation-crawler

Web crawler to collect snapshots of articles to web archive
MIT License

Check sites with low article count #347

Closed by edwardchalstrey1 5 years ago

edwardchalstrey1 commented 5 years ago

Check that these sites don't have additional articles we could/should be crawling:

Site                      Articles  Checked
realnewsrightnow.com      84        #348 👍
scrappleface.com          106       👍
youngcons.com             107       Site down temporarily
davidwolfe.com            165       👍
satirewire.com            260       👍
islamicanews.com          272       👍
henrymakow.com            294       👍
clickhole.com             298       #349 👍
veteransnewsreport.com    344       👍
usatoday.com              356       👍
weeklyworldnews.com       370       #350 👍
washingtontimes.com       494       👍
eyeopening.info           497       #337 👎
npr.org                   513       👍
conservativepapers.com    524       #351 👍
empirenews.net            535       👍
thespoof.com              536       👍
apnews.com                544       #352 👍
valleytimes-news.com      568       👍
theblaze.com              611       👍
abqjournal.com            739       👍
gellerreport.com          805       👍
notallowedto.com          865       👍
vanityfair.com            868       👍
frontpagemag.com          901       👍
threepercenternation.com  968       👍
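
For anyone repeating this check later, a list like the one above could be generated from per-site article counts instead of by hand. The snippet below is only a minimal sketch, not code from this repository. The function name `low_count_sites`, the cutoff of 1000 (chosen only because every site listed here falls below it), and the `example.org` entry are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical cutoff: the issue does not state one. 1000 is used here only
# because every site in the table above has fewer than 1000 articles.
LOW_COUNT_THRESHOLD = 1000


def low_count_sites(
    article_counts: Dict[str, int],
    threshold: int = LOW_COUNT_THRESHOLD,
) -> List[Tuple[str, int]]:
    """Return (site, count) pairs with count below threshold, lowest first."""
    flagged = [(site, n) for site, n in article_counts.items() if n < threshold]
    return sorted(flagged, key=lambda pair: pair[1])


# Three counts copied from the table above, plus one made-up high-count site
# to show that it is filtered out.
counts = {
    "scrappleface.com": 106,
    "realnewsrightnow.com": 84,
    "npr.org": 513,
    "example.org": 5000,  # hypothetical, not part of this issue
}

flagged = low_count_sites(counts)
for site, n in flagged:
    print(f"{site}\t{n}")
```

Sorting by ascending count keeps the sites most likely to be under-crawled at the top, which matches the ordering used in the table.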