swissbib / contentCollector

used to collect the whole content provided by swissbib

swissbib uses contentCollector to collect all the data it processes and provides.

The collection procedure is not fixed: content can be collected through various channels.

Currently supported are:

1) OAI harvesting
2) Push mechanism via a file interface
3) Pull mechanism via WebDAV

The channel for OAI is based on the Python library for OAI clients (https://pypi.python.org/pypi/pyoai)
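To illustrate what an OAI harvesting channel has to do, here is a minimal sketch of one OAI-PMH `ListRecords` round trip. It deliberately uses only the Python standard library instead of pyoai, so that it stays self-contained; it is not the contentCollector implementation. The function names, the base URL and the sample response are assumptions made for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    sent, metadataPrefix must be left out.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None) for one response page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.iterfind(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Made-up response page, used here instead of a real network call.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    ids, token = parse_list_records(SAMPLE_PAGE)
    print(ids, token)
    print(build_list_records_url("https://example.org/oai", resumption_token=token))
```

A real harvester would fetch each URL, parse the page, and repeat with the returned token until no token is left. pyoai wraps this loop behind its client class.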

The main use case for swissbib:

This main use case can be extended via plugins.

swissbib developed a plugin that stores the pre-processed raw content in a data store (we use a Mongo database for this purpose). This gives us the flexibility to access the raw content of the repositories at any time for our own purposes. Another possibility would be to hand the raw content over to a message queue (such as RabbitMQ or Apache ActiveMQ) to push the content reliably into further channels besides the swissbib CBS data hub.
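The plugin idea could look roughly like the sketch below. It is not the actual contentCollector plugin interface: the class names, the `handle` method and the `collect` function are hypothetical. An in-memory dict stands in for the Mongo database and a standard-library `queue.Queue` stands in for a message broker, so the example runs without external services.

```python
import queue


class StorePlugin:
    """Hypothetical plugin: keeps raw records in a dict (stand-in for a Mongo collection)."""

    def __init__(self):
        self.records = {}

    def handle(self, record_id, raw_content):
        # Later deliveries of the same record id overwrite earlier ones.
        self.records[record_id] = raw_content


class QueuePlugin:
    """Hypothetical plugin: forwards raw records to a queue (stand-in for RabbitMQ or ActiveMQ)."""

    def __init__(self):
        self.outbox = queue.Queue()

    def handle(self, record_id, raw_content):
        self.outbox.put((record_id, raw_content))


def collect(records, plugins):
    """Pass every (record_id, raw_content) pair to each plugin; return the number of records seen."""
    count = 0
    for record_id, raw_content in records:
        for plugin in plugins:
            plugin.handle(record_id, raw_content)
        count += 1
    return count


if __name__ == "__main__":
    store = StorePlugin()
    forwarder = QueuePlugin()
    harvested = [("a", "<record>1</record>"), ("b", "<record>2</record>")]
    print(collect(harvested, [store, forwarder]))
    print(sorted(store.records))
```

Keeping the plugins behind one small method means a channel does not need to know whether the content ends up in a database, a queue, or both.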

Todo: