Open motey opened 4 years ago
Some ideas off the top of my head:
* Use Neo4j to keep track of dependencies and order the source systems. Currently, motherlode determines this itself, but since we require a running Neo4j instance anyway, we can just as well use its graph algorithms to track this information.
Had the same idea, but this would make bootstrapping motherlode harder. Also, wiping the database and refilling it via motherlode would no longer be possible.
On the other hand, knowing which data sources are loaded (and even being able to connect data to its data source) is pretty compelling. Maybe a hybrid approach would be a good solution. This could be achieved by extending the :LoadingLog functionality ( https://github.com/covidgraph/motherlode/blob/a2560ef1ffde48efba6bffce106e146fb0ec0e86/motherlode/main.py#L38 )
* Parallelize loading. Currently, all loaders run strictly sequentially; loaders that don't depend on each other could run in parallel (if the Neo4j instance and Docker host can handle the load).
YES! I will create an issue for that.
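To make the two ideas above concrete, here is a minimal sketch of dependency-ordered, parallel loading. It keeps the dependency graph in memory using Python's standard `graphlib` (Python 3.9+) instead of Neo4j, so it only illustrates the scheduling idea, not how motherlode works today. The loader names, the `DEPENDENCIES` mapping, and the `run_loader` placeholder are all hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

# Hypothetical loaders: each name maps to the set of loaders it depends on.
DEPENDENCIES = {
    "loader_a": set(),
    "loader_b": set(),
    "loader_c": {"loader_a", "loader_b"},
    "loader_d": {"loader_c"},
}


def run_loader(name):
    # Placeholder (assumption): a real implementation would start the
    # loader's Docker container here and block until it exits.
    return name


def run_all(dependencies, run=run_loader, max_workers=4):
    """Run every loader once, starting each as soon as its dependencies finish."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    pending = {}  # future -> loader name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every loader whose dependencies are all done.
            for name in sorter.get_ready():
                pending[pool.submit(run, name)] = name
            # Block until at least one running loader completes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                name = pending.pop(future)
                future.result()  # re-raise any exception from the loader
                sorter.done(name)  # unlocks loaders that depend on this one
                finished.append(name)
    return finished


if __name__ == "__main__":
    print(run_all(DEPENDENCIES))
```

In this sketch `loader_a` and `loader_b` can run at the same time, while `loader_c` and `loader_d` wait for their dependencies. `max_workers` is the knob for limiting the load on the Neo4j instance and the Docker host.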
Current Status
Motherlode is a proof-of-concept script at the moment. It works, but its structure is not suited to large-scale extension in the future.
Desired Status
Motherlode should be broken down into separate classes, be easy to extend, and offer a more pleasant onboarding for new devs.
Tasks
[ ] Discuss possible structure/technologies with focus on future features
[ ] Declare/Define and document structure
[ ] Implement changes
issues to take into account: https://github.com/covidgraph/motherlode/issues/8 https://github.com/covidgraph/motherlode/issues/7
Hint: no holds barred. A change of platform/language is possible if it serves the goal. Discussion is open.