Closed dmitrii-ubskii closed 2 years ago
The changes section of the PR can be improved a bit to include slightly more detail of what was implemented. Bullet points are fine!
And finally, the title might do better as "Introduce Catalogue of Life example using TypeDB Loader"
Left one comment. The one thing we're still missing is adding the tests to CI (see here: https://github.com/vaticle/typedb-examples/blob/b9f182237ec40d7ad7b6eae4ce89e4694ccd7d55/.grabl/automation.yml#L42)
This is our own CI system (called Grabl/Factory), which runs the jobs defined in this YAML file :)
What is the goal of this PR?
We add an example using TypeDB-Loader. The example uses the Catalogue of Life dataset to represent the taxonomic structure and some horizontal relations (e.g. geographical distribution). We aim to preserve as much data as possible during migration, within reason.
What are the changes implemented in this PR?
Loader fetches the dataset off the web (~500 MB compressed), inflates it, and prepares it for typedb-loader to process and upload into the catalogue-of-life database. The schema includes rules that should make for interesting inference examples.
Order of operations:
prepareData()
), so it can be loaded in two passes for two different kinds of entity (marine region by mrgid or description);
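For readers who want a feel for the fetch-and-inflate step described above, here is a minimal sketch in Python. It is not the PR's Loader: the function names `fetch` and `inflate` are hypothetical, and it assumes the dataset is distributed as a ZIP archive, which the description does not confirm.

```python
import urllib.request
import zipfile
from pathlib import Path


def fetch(url: str, archive_path: Path) -> Path:
    """Download the archive unless a local copy already exists.

    Skipping the download when the file is present avoids re-fetching
    a large (~500 MB) archive on every run.
    """
    if not archive_path.exists():
        archive_path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def inflate(archive_path: Path, out_dir: Path) -> list:
    """Unpack a ZIP archive (assumed format) into out_dir.

    Returns the names of the archive members so that a later
    preparation step knows which files are available.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()
```

A caller would chain the two, for example `inflate(fetch(dataset_url, Path("data/col.zip")), Path("data/raw"))`, where `dataset_url` and both paths are placeholders rather than values taken from the PR.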