This PR includes the following:
Aurum CLI
An Aurum CLI module, aurum_cli.py, that aims to make the Aurum workflow easy and straightforward, especially for newcomers.
Through the CLI module one can:
Manage data sources through the command line (or interactively), without having to edit .yml files.
Run profile jobs by selecting specific data sources identified by a name.
Manage models (identified by name).
Export models to other formats & databases (currently Neo4J).
Initiate discovery sessions to run Aurum algebra queries interactively.
CLI Documentation
Detailed documentation for the CLI can be found in aurum-cli.md.
Neo4J Export
A refactored Neo4J export module that is cleaner, faster, simpler and more complete: it exports all Relation types, not only CONTENT_SIM. The export process is monitored through a tqdm-powered progress bar.
This module could also serve as a template for other backends.
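As a rough illustration of how the export loop could be reused for another backend, here is a minimal sketch. It is not the module added in this PR: the Relation members listed, the export_all helper, and the edges_for / write_edge callbacks are hypothetical names chosen for the example. The only external call assumed is that tqdm.tqdm can wrap an iterable.

```python
from enum import Enum


class Relation(Enum):
    # Hypothetical subset for illustration; Aurum defines its own Relation enum.
    CONTENT_SIM = 1
    SCHEMA_SIM = 2
    PKFK = 3


def export_all(edges_for, write_edge, relations=Relation, progress=None):
    """Export every relation type through a backend-specific write_edge.

    edges_for(relation)  -> iterable of (source, target) pairs
    write_edge(s, t, r)  -> writes one edge to the backend (Neo4J, CSV, ...)
    progress             -> optional iterable wrapper such as tqdm.tqdm
    """
    # Without a progress wrapper, iterate over the edges unchanged.
    wrap = progress if progress is not None else (lambda iterable: iterable)
    written = 0
    for relation in relations:
        for source, target in wrap(edges_for(relation)):
            write_edge(source, target, relation)
            written += 1
    return written


# Usage with an in-memory "backend" instead of a real database.
graph = {
    Relation.CONTENT_SIM: [("a", "b")],
    Relation.PKFK: [("a", "c"), ("b", "c")],
}
sink = []
count = export_all(
    lambda relation: graph.get(relation, []),
    lambda s, t, relation: sink.append((s, t, relation.name)),
)
```

Swapping the write_edge callback is the only backend-specific part in this sketch; passing progress=tqdm.tqdm would add a progress bar per relation type.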
API Refactors
Adds a keyword argument, as_str, to knowledgerepr/fieldnetwork.py:enumerate_relation(self, relation, as_str=True). Passing as_str=False lets callers get tuples of Hit pairs. Previously they could only get concatenated str values and had to rely on regexes to extract the tuples, which made the prior Neo4J module too slow.
This change is backwards compatible, since as_str defaults to True.
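The difference between the two return modes can be sketched as follows. This is a toy stand-in, not the code in knowledgerepr/fieldnetwork.py: the Hit fields, the layout of the concatenated string, and the in-memory edge list are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for Aurum's Hit; the real field names may differ.
Hit = namedtuple("Hit", ["db_name", "source_name", "field_name"])


class ToyFieldNetwork:
    """Toy model of the as_str switch; not Aurum's FieldNetwork."""

    def __init__(self, edges):
        # edges: iterable of (Hit, Hit, relation_name) triples
        self._edges = list(edges)

    def enumerate_relation(self, relation, as_str=True):
        for src, dst, rel in self._edges:
            if rel != relation:
                continue
            if as_str:
                # Default mode: one concatenated string per pair (format assumed).
                yield "{} - {}".format(src, dst)
            else:
                # as_str=False: the pair itself, so callers need no regex parsing.
                yield (src, dst)


a = Hit("db", "orders.csv", "customer_id")
b = Hit("db", "customers.csv", "id")
network = ToyFieldNetwork([(a, b, "CONTENT_SIM")])

as_strings = list(network.enumerate_relation("CONTENT_SIM"))
as_pairs = list(network.enumerate_relation("CONTENT_SIM", as_str=False))
```

Existing callers that omit as_str keep receiving strings, which is why the default of True preserves backwards compatibility.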
Bug Fixes
Fixes a minor bug introduced by https://github.com/mitdbg/aurum-datadiscovery/pull/126 that was triggered when calling init_system(<path_to_serialized_model>, create_reporting=False). Also updates the relevant documentation in quickstart.md.