CivicActions / edscrapers

US Department of Education Data Scraping Kit; see https://us-ed-scraping.ckan.io/dataset
GNU Affero General Public License v3.0

Scrape and harvest collections / sources - PROTOTYPE [P1(OCR)] #115

Closed (nightsh closed 4 years ago)

nightsh commented 4 years ago

Depends on #114

Harvesting collections and sources requires that the schema validation accept groups of different types, as well as package relationships, in the data.json source files.

The proposed implementation spec is "Specs For Implementing data.json Validation Schema for Dept of Ed".

Format

Scraping rules:

CKAN extensions updates:

The datajson extension needs Collection / Source processing capabilities based on the data it finds in the data.json file.
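As a rough illustration of what Collection / Source processing might involve, here is a minimal Python sketch that groups the datasets in a parsed data.json catalog by collection and by source. It is not the datajson extension's code. It assumes each dataset entry may carry an `isPartOf` key naming its collection (as in the Project Open Data v1.1 metadata schema) and a `source` key naming its source. The `source` key, the `group_datasets` helper, and the sample identifiers are all hypothetical; the real field names would come from the proposed spec.

```python
from collections import defaultdict


def group_datasets(catalog):
    """Group dataset identifiers by collection and by source.

    Assumptions (not confirmed by the issue):
    - `isPartOf` holds the identifier of the dataset's collection.
    - `source` is a hypothetical key holding the dataset's source identifier.
    """
    collections = defaultdict(list)
    sources = defaultdict(list)

    for dataset in catalog.get("dataset", []):
        identifier = dataset.get("identifier")
        if identifier is None:
            # Skip entries that cannot be referenced from a group.
            continue

        collection_id = dataset.get("isPartOf")
        if collection_id:
            collections[collection_id].append(identifier)

        source_id = dataset.get("source")  # hypothetical field name
        if source_id:
            sources[source_id].append(identifier)

    return dict(collections), dict(sources)


# Made-up sample catalog, for illustration only.
catalog = {
    "dataset": [
        {"identifier": "ds-1", "isPartOf": "collection-a", "source": "source-x"},
        {"identifier": "ds-2", "isPartOf": "collection-a"},
        {"identifier": "ds-3", "source": "source-x"},
    ]
}

collections, sources = group_datasets(catalog)
```

A harvester could then create one CKAN group per key in each mapping and attach the listed packages to it, which is why the validation schema would need to allow more than one group type.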

Tasks:

Acceptance criteria: