alan-turing-institute / misinformation-crawler

Web crawler to collect snapshots of articles to web archive
MIT License

Update byline XPath for nationalreview #340

Closed: edwardchalstrey1 closed this 5 years ago

edwardchalstrey1 commented 5 years ago

Closes #339

2019-07-30 11:17:12     INFO: Processed 2479 pages in 0:15:34.717161 => 2.65 Hz
2019-07-30 11:17:12     INFO: Found articles in 2479/2479 pages => 100.00%
2019-07-30 11:17:12     INFO: ... of these 0/2479 had no date => 0.00%
2019-07-30 11:17:12     INFO: ... of these 0/2479 had no byline => 0.00%
2019-07-30 11:17:12     INFO: ... of these 0/2479 had no title => 0.00%
2019-07-30 11:17:12     INFO: Including skipped pages, there are articles in 2479/2479 pages => 100.00%
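The figures in the log above follow from simple arithmetic: the rate is the page count divided by the elapsed time in seconds, and each percentage is a count over the total pages. A minimal sketch of that calculation is shown here; the helper names `rate_hz` and `percentage` are hypothetical and are not taken from the crawler's code, which may compute these values differently.

```python
from datetime import timedelta


def rate_hz(n_pages: int, elapsed: timedelta) -> float:
    """Pages processed per second (hypothetical helper, not the crawler's own)."""
    return n_pages / elapsed.total_seconds()


def percentage(count: int, total: int) -> float:
    """Share of `count` in `total` as a percentage; 0.0 when total is zero."""
    return 100.0 * count / total if total else 0.0


# Values copied from the log in this pull request.
n_pages = 2479
elapsed = timedelta(minutes=15, seconds=34, microseconds=717161)

print(f"Processed {n_pages} pages in {elapsed} => {rate_hz(n_pages, elapsed):.2f} Hz")
print(f"Found articles in {n_pages}/{n_pages} pages => {percentage(n_pages, n_pages):.2f}%")
print(f"... of these 0/{n_pages} had no byline => {percentage(0, n_pages):.2f}%")
```

With the values from the log, 2479 pages over roughly 934.7 seconds gives the reported 2.65 Hz, and a missing-byline count of 0 gives 0.00%.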