CivicActions / edscrapers

US Department of Education Data Scraping Kit; see https://us-ed-scraping.ckan.io/dataset
GNU Affero General Public License v3.0
15 stars 9 forks

[ocr-parser][l]: parser for ocr #29

Closed: osahon-okungbowa closed this 4 years ago

osahon-okungbowa commented 4 years ago

ocr-parser complete. Completed the following:

nightsh commented 4 years ago

I have removed the old usage of the dataset model, since as of yesterday we started using the Scrapy pipeline, so there are no more dump() calls.

Here's a summary of my changes:

I am going to run the code from this branch now (i.e. without merging yet), so you can have a look at the diff and suggest changes if you have any.
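
For readers unfamiliar with the change described above (replacing explicit dump() calls with a Scrapy pipeline), here is a minimal sketch of what a Scrapy-style item pipeline can look like. It is not the project's actual code: the class name `JsonDumpPipeline`, the `output_dir` parameter, and the `name` field on the item are all assumptions made for illustration. Only the `open_spider` and `process_item` hook names follow Scrapy's item pipeline interface.

```python
import json
import os


class JsonDumpPipeline:
    """Hypothetical pipeline: writes every scraped item to its own JSON file.

    With a pipeline like this, parsers only yield items; the framework
    calls process_item() for each one, so no parser calls dump() itself.
    """

    def __init__(self, output_dir="output"):
        # output_dir is an assumed setting, not taken from the project.
        self.output_dir = output_dir

    def open_spider(self, spider):
        # Called once when the spider starts; make sure the target exists.
        os.makedirs(self.output_dir, exist_ok=True)

    def process_item(self, item, spider):
        # Assumes each item carries a unique 'name' usable as a file name.
        path = os.path.join(self.output_dir, f"{item['name']}.json")
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(dict(item), fh, indent=2)
        # Return the item so any later pipeline stages still receive it.
        return item
```

In a Scrapy project, a pipeline of this kind would be enabled through the `ITEM_PIPELINES` setting; the module path used there depends on the project layout and is not shown here.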