sierra-moxon closed this 5 months ago
Thanks @sierra-moxon !
Still throwing an error from prepare_output_args() in cli_utils.py - I think I have a fix
False alarm - I was running the incorrect kgx version again.
But I'll take this chance to add material on parquet sink to the docs.
I see parquet as an available output option now, as expected:
$ poetry run kgx transform --help
Usage: kgx transform [OPTIONS] [INPUTS]...

  Transform a Knowledge Graph from one serialization form to another.

Options:
  -i, --input-format TEXT       The input format. Can be one of ('tsv',
                                'csv', 'graph', 'json', 'jsonl', 'obojson',
                                'obo-json', 'trapi-json', 'neo4j', 'nt',
                                'owl', 'sssom', 'parquet')
  -c, --input-compression TEXT  The input compression type
  -o, --output PATH             Output
  -f, --output-format TEXT      The output format. Can be one of ('tsv',
                                'csv', 'graph', 'json', 'jsonl', 'obojson',
                                'obo-json', 'trapi-json', 'neo4j', 'nt',
                                'owl', 'sssom', 'parquet')
  [snip]
Should this be working now or no?
$ poetry run kgx transform -f parquet -o tempout tests/resources/rdf/test1.nt
[KGX][__init__.py][ transform_wrapper] ERROR: kgx.transform error: Type None not yet supported
It appears to be working for me:
~/kgx$ poetry run kgx transform -f parquet -o tempout tests/resources/rdf/test1.nt -i nt
[KGX][rdf_source.py][ parse] INFO: Done parsing tests/resources/rdf/test1.nt
The difference is that you still have to specify the input format too (here, -i nt).
Excellent! Thanks @caufieldjh, works for me too:
$ poetry run kgx transform -f parquet -o tempout tests/resources/rdf/test1.nt -i nt
[KGX][rdf_source.py][ parse] INFO: Done parsing tests/resources/rdf/test1.nt
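For anyone scripting this, here is a minimal sketch of the working invocation from this thread. It only assembles the argument list and does not call KGX. The helper name build_transform_cmd is hypothetical and not part of KGX. The flags (-i, -f, -o) are the ones shown in the help output above. The sketch assumes, based on the two runs in this thread, that leaving out -i is what triggers the "Type None not yet supported" error.

```python
from typing import List, Optional


def build_transform_cmd(
    input_path: str,
    output: str,
    output_format: str,
    input_format: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `kgx transform`.

    Hypothetical helper for illustration only; it is not part of KGX.
    """
    if not input_format:
        # In this thread, the run without -i failed with
        # "Type None not yet supported", so this sketch refuses
        # to build a command that omits the input format.
        raise ValueError("input_format is required, e.g. 'nt'")
    return [
        "poetry", "run", "kgx", "transform",
        "-i", input_format,
        "-f", output_format,
        "-o", output,
        input_path,
    ]


if __name__ == "__main__":
    cmd = build_transform_cmd(
        "tests/resources/rdf/test1.nt",
        "tempout",
        "parquet",
        input_format="nt",
    )
    # Print the command instead of running it, so the sketch works
    # even where KGX is not installed.
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the kgx checkout if you want to actually execute it.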