Closed jsstevenson closed 1 month ago
@jsstevenson did you see this old issue #210?
one possible additional todo: add -o to things that produce file outputs to specify output location
@korikuzma having some second thoughts about what commands/combinations of commands should call graph.clear()
@jsstevenson I haven't looked at the changes, but what if we just separate this out and place it on the user to decide and make a note in the documentation?
close #210
Miscellaneous quality-of-life improvements and feature additions to the CLI:
- `metakb` as a console command.
- `metakb update` to run a complete harvest/transform/load step.
- `metakb check-normalizers` and `metakb update-normalizers` to check and force refreshing of normalizer data. This supports a simple workflow like `metakb check-normalizers || metakb load-normalizers` to load if unavailable, rather than requiring the user to force normalizer reload while loading the MetaKB graph.
- `metakb harvest` to just perform harvest of source(s).
- `metakb transform` to just perform transform of source(s), or `metakb transform-file` to transform a specific harvested file.
- `metakb load-cdm` to skip harvest/transform and directly load a CDM file, either from local (default location), a specific file, or from S3.
- `metakb clear-graph` to wipe the graph. No other CLI command will wipe the graph. I thought about calling it when `update` is used without any source qualifiers, but it seemed a little odd to include additional behavior such that `metakb update <source> && metakb update <other source>` is different from `metakb update`. Also thought about including it as an option flag in some other commands, but at that point, you can just do `metakb clear-graph && <other command>`.
- Output directory option (`--output_directory`, `-o`) where it makes sense. Unfortunately, most of these commands all produce `n` output files, so I don't think there's a simple way to specify the name of the output file.
- `username:password` option, since you need to provide both at once. (Not sure why Neo4j requires a password.)
- ... `cli.py` act as gateways/interfaces to them. There's probably a little bit more of this that we could do, but nothing else stuck out to me.
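
For anyone following the thread, here is a rough sketch of three of the ideas above: a check command whose exit status makes the `check || load` chaining work, an `-o`/`--output_directory` option, and a combined `username:password` value split in one place. This is not the code in this PR. It uses the standard-library `argparse` purely as a stand-in, and the `--credentials` flag name, the `normalizers_ready` hook, and the current-directory default are all hypothetical placeholders.

```python
import argparse
from pathlib import Path
from typing import Callable, Optional, Sequence, Tuple


def parse_credentials(value: str) -> Tuple[str, str]:
    """Split a combined 'username:password' string on the first colon."""
    username, sep, password = value.partition(":")
    if not sep or not username or not password:
        raise argparse.ArgumentTypeError("expected credentials as username:password")
    return username, password


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="metakb")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("check-normalizers", help="exit 0 only if normalizer data is available")

    harvest = sub.add_parser("harvest", help="harvest source(s) only")
    harvest.add_argument("sources", nargs="*")
    harvest.add_argument("-o", "--output_directory", type=Path, default=None)

    clear = sub.add_parser("clear-graph", help="wipe the graph")
    # Hypothetical flag name; both halves are supplied together as one value.
    clear.add_argument("--credentials", type=parse_credentials, default=None)
    return parser


def main(
    argv: Optional[Sequence[str]] = None,
    normalizers_ready: Callable[[], bool] = lambda: False,  # hypothetical hook
) -> int:
    """Return a process exit status instead of calling sys.exit directly."""
    args = build_parser().parse_args(argv)
    if args.command == "check-normalizers":
        # A non-zero status is what lets `check || load` fall through to the load step.
        return 0 if normalizers_ready() else 1
    if args.command == "harvest":
        target = args.output_directory or Path.cwd()  # placeholder default location
        print(f"would harvest {args.sources or 'all sources'} into {target}")
        return 0
    if args.command == "clear-graph":
        # Wiping stays an explicit, separate command; nothing else triggers it.
        print("would wipe the graph")
        return 0
    return 2


status = main(["check-normalizers"])
print(f"check-normalizers exit status: {status}")
```

Wired to a console entry point with `sys.exit(main())`, the returned status is what the shell's `||` and `&&` operators act on, so keeping `clear-graph` separate costs the user only one extra `&&`.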