Closed netsensei closed 5 years ago
This statement still holds true:
We should be wary, though, of "reinventing the wheel" here, as Catmandu already does a lot of the heavy lifting out of the box.
It's extremely hard to build a generic tool. Let's ruthlessly guard the scope here. The tool should work within the context of the Flemish Art Context. So, only add things that can't be done with vanilla Catmandu.
The main goal of the tool is to quickly set up and maintain robust ETL pipelines within an existing infrastructure. Only add new importer and exporter modules when new applications are added to the infrastructure.
Use open formats and protocols as much as you can. Treat OAI-PMH as a first-class citizen, and avoid implementing the custom APIs of individual systems in separate modules as much as possible.
With this pull request in the works, we can now flesh out / isolate different functions in separate command classes.
Question: what should the tool do anyway? Answer: it's a "glue tool", which means it brings different Catmandu modules together so you don't have to write boilerplate Bash or Perl to set up a conveyor belt.
However, there are still discrete business requirements to be met. I can identify these variations of input / output, and I can put them into 2 categories:
Push data to a datahub instance.
Export data to a local flat file.
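
To make the two categories a bit more concrete, here is a rough sketch of the "conveyor belt" idea: one importer, one fix step, and an exporter that is swapped depending on the category. It is written in Python purely for illustration; it is not Catmandu and not the tool's actual API. All names (`import_records`, `fix`, `DatahubExporter`, `FlatFileExporter`, `run_pipeline`) are hypothetical, the sample records are made up, and the "push" is an injected callable standing in for whatever HTTP call a datahub instance would need.

```python
import json
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, str]


def import_records() -> Iterator[Record]:
    """Hypothetical importer; stands in for e.g. an OAI-PMH harvest."""
    yield {"id": "1", "title": " Landscape "}
    yield {"id": "2", "title": "Portrait"}


def fix(record: Record) -> Record:
    """Hypothetical fix step: trim whitespace from every value."""
    return {key: value.strip() for key, value in record.items()}


class DatahubExporter:
    """Category 1: push each record to a datahub instance.

    `send` is an assumption: it is injected so the sketch runs without
    a network. A real exporter would perform an HTTP request here.
    """

    def __init__(self, send: Callable[[Record], None]) -> None:
        self.send = send

    def export(self, records: Iterable[Record]) -> int:
        count = 0
        for record in records:
            self.send(record)
            count += 1
        return count


class FlatFileExporter:
    """Category 2: write records to a local file, one JSON object per line."""

    def __init__(self, path: str) -> None:
        self.path = path

    def export(self, records: Iterable[Record]) -> int:
        count = 0
        with open(self.path, "w", encoding="utf-8") as handle:
            for record in records:
                handle.write(json.dumps(record, sort_keys=True) + "\n")
                count += 1
        return count


def run_pipeline(exporter) -> int:
    """Glue: import, fix, export. Returns the number of records exported."""
    return exporter.export(fix(record) for record in import_records())
```

The point of the sketch is only that the import and fix steps stay the same while the exporter varies, which is roughly what separate command classes per function would give us.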