mysociety / za-hansard

A parser for South African Hansards, as published at http://www.parliament.gov.za/live/content.php?Category_ID=119

PMG scraper potentially misses unprocessed reports #39

Open geoffkilpin opened 10 years ago

geoffkilpin commented 10 years ago

The PMG scraper currently stops processing a committee's reports as soon as it reaches one that has already been seen and processed, so any older unprocessed reports are never processed.

(An unprocessed report is one which has no appearances associated with it. This is relevant because PMG will often post meeting documentation before the actual report, e.g. http://www.pmg.org.za/report/20140206-eastern-cape-provincial-department-human-settlements-progress-made-oversight-visit-findings-and-recommendations at the time of writing.)

After checking for new reports, the scraper should run through all known unprocessed reports and attempt to process them.
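
A rough sketch of the two-pass behaviour proposed above, not the scraper's actual code. `Report`, `scrape_committee`, `known` and `fetch_appearances` are hypothetical names made up for illustration, and the listing is assumed to be ordered newest first:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set


@dataclass
class Report:
    """Hypothetical stand-in for a stored PMG report record."""
    url: str
    appearances: List[str] = field(default_factory=list)

    @property
    def processed(self) -> bool:
        # Per the issue: a report with no appearances counts as unprocessed.
        return bool(self.appearances)


def scrape_committee(
    listing: Iterable[str],
    known: Dict[str, Report],
    fetch_appearances: Callable[[str], List[str]],
) -> None:
    """Check for new reports, then retry all known unprocessed ones."""
    attempted: Set[str] = set()

    # Pass 1: walk the listing (assumed newest first) and stop at the
    # first report that has already been processed, as the scraper does now.
    for url in listing:
        report = known.get(url)
        if report is not None and report.processed:
            break
        if report is None:
            report = known[url] = Report(url)
        report.appearances = fetch_appearances(url)
        attempted.add(url)

    # Pass 2: retry every known report that still has no appearances,
    # skipping the ones already attempted in pass 1 during this run.
    for report in known.values():
        if not report.processed and report.url not in attempted:
            report.appearances = fetch_appearances(report.url)
```

With this shape, an older report that sits behind an already processed one in the listing is still retried on every run until it gains appearances.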