surfedushare / pol-harvester

A repository that harvests different sources for content

Implement OAI-PMH #128

Closed by jelmerderonde 4 years ago

jelmerderonde commented 4 years ago

Implement OAI-PMH to harvest EduRep. EduRep has requested that we make sure that we:

jelmerderonde commented 4 years ago

EduRep has added our test and prod machines to their whitelist.

fako commented 4 years ago

The first requirement, to harvest only once and let different environments use the result, can be achieved with two datagrowth commands: dump_resource and load_resource.

If we pass the EdurepOAIPMH resources to these commands, they'll dump the resources to disk or load them from disk in a way that can write and read gigabytes of data without performance issues. So it is completely feasible to share only the Edurep OAIPMH data and let each environment handle that data in its own way (mainly for testing purposes).

Both commands could write directly to block storage, or we can set up some kind of rsync flow to share the dumps across servers.
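
To illustrate why a dump/load flow can handle gigabytes without performance issues, here is a minimal sketch of the general idea: write one record per line and read the file back lazily. This is not how datagrowth implements dump_resource or load_resource; the function names `dump_records` and `load_records`, the gzipped JSON Lines format, and the record fields are all assumptions made up for this example.

```python
import gzip
import json
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def dump_records(records: Iterable[dict], path: Path) -> int:
    """Stream records to a gzipped JSON Lines file, one record per line.

    Hypothetical helper: only one record is held in memory at a time.
    """
    count = 0
    with gzip.open(path, "wt", encoding="utf-8") as dump_file:
        for record in records:
            dump_file.write(json.dumps(record) + "\n")
            count += 1
    return count


def load_records(path: Path) -> Iterator[dict]:
    """Lazily yield records from a dump without reading the whole file at once."""
    with gzip.open(path, "rt", encoding="utf-8") as dump_file:
        for line in dump_file:
            if line.strip():
                yield json.loads(line)


if __name__ == "__main__":
    # Made-up records standing in for harvested OAI-PMH responses.
    harvested = ({"uri": f"example/{ix}", "status": 200} for ix in range(3))
    with tempfile.TemporaryDirectory() as directory:
        dump_path = Path(directory) / "edurep-oaipmh.jsonl.gz"
        written = dump_records(harvested, dump_path)
        loaded = list(load_records(dump_path))
        print(written, len(loaded))
```

A file like this could be placed on block storage or copied with rsync, after which each environment loads the same harvested data and processes it independently.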

fako commented 4 years ago

These things are still open todos for this ticket. We may want to split them into their own tickets at some point, but I wanted to have a list to begin with.