Changes in this branch:
The edgov pipeline now also covers all the offices present on ed.gov.
oese, oela, ope, opepd, octae and osers are now submodules of edgov and can still be called individually, e.g. `eds scrape edgov.oese`.
The command `eds scrape edgov` covers the entire ed.gov website, including all the sub-pipelines mentioned above.
The offices remain individual pipelines only so that they can be turned on/off individually once we start stopping Airflow pipelines (we stop edgov and leave only the individual offices to be toggled as needed).
All the harvesters except edgov will need to be stopped once the edgov pipeline starts producing data.json files.
New Publisher model supporting suborganizations according to the DCAT specs.
Backwards compatibility with the existing publisher implementation.
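
To illustrate how a dotted pipeline name such as edgov.oese relates to the parent edgov pipeline, here is a minimal sketch. It is not the actual eds implementation: the `OFFICES` tuple and the `offices_covered` helper are hypothetical names used only for this example, and the real CLI may resolve submodules differently.

```python
from __future__ import annotations

# Office submodules listed in this PR (the tuple itself is illustrative).
OFFICES = ("oese", "oela", "ope", "opepd", "octae", "osers")


def offices_covered(pipeline: str) -> tuple[str, ...]:
    """Return the office sub-pipelines covered by a pipeline name.

    Hypothetical helper: 'edgov' covers every office, while
    'edgov.<office>' covers only that office.
    """
    parent, _, office = pipeline.partition(".")
    if parent != "edgov":
        raise ValueError(f"unknown pipeline: {pipeline!r}")
    if not office:
        # Bare 'edgov' includes all office sub-pipelines.
        return OFFICES
    if office not in OFFICES:
        raise ValueError(f"unknown edgov office: {office!r}")
    return (office,)
```

Under these assumptions, `offices_covered("edgov")` returns all six offices and `offices_covered("edgov.oese")` returns only `("oese",)`, which mirrors the difference between `eds scrape edgov` and `eds scrape edgov.oese`.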
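
The Publisher change can be sketched roughly as follows. This is an illustration, not the code in this branch: the class layout and the `coerce` helper are hypothetical, and it assumes the existing publisher implementation passed the publisher around as a plain string. The output keys (`@type`, `name`, `subOrganizationOf`) follow the publisher object of the Project Open Data metadata schema used for data.json.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Publisher:
    """Sketch of a publisher that may belong to a parent organization."""

    name: str
    sub_organization_of: Optional[Publisher] = None

    @classmethod
    def coerce(cls, value: Union[str, dict, Publisher]) -> Publisher:
        """Accept the assumed legacy string form as well as dicts or instances."""
        if isinstance(value, cls):
            return value
        if isinstance(value, str):
            # Assumed legacy form: the publisher is just a name.
            return cls(name=value)
        parent = value.get("subOrganizationOf")
        return cls(
            name=value["name"],
            sub_organization_of=cls.coerce(parent) if parent else None,
        )

    def to_dict(self) -> dict:
        """Serialize to a data.json-style publisher object."""
        data = {"@type": "org:Organization", "name": self.name}
        if self.sub_organization_of is not None:
            data["subOrganizationOf"] = self.sub_organization_of.to_dict()
        return data
```

With this shape, a scraper that still supplies a bare string keeps working through `Publisher.coerce("...")`, while an office pipeline can nest its parent department under `subOrganizationOf`.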