Problem: I would like to control the name of the output file for more consistency when loading results into downstream systems. Currently, the tool names each output file with a pseudorandom GUID, e.g. 7ceb804f84d64c81a46b388b51400572-0.parquet
Proposed solution: Control the output name through the basename_template keyword argument already accepted by the pyarrow.parquet.write_to_dataset method, which the Data Exporter __writer_process__ calls. Supply a value for this argument via a new optional string command-line argument named --basename_template on the focus_converter convert command.
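A minimal sketch of how the proposed option could be threaded through to the writer. Everything here except pyarrow.parquet.write_to_dataset and its basename_template keyword is an assumption for illustration: build_parser, writer_kwargs and write_table are hypothetical helpers, and argparse stands in for whatever CLI framework the tool actually uses. pyarrow replaces the "{i}" token in the template with an automatically incremented integer, so the sketch rejects templates that lack it.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the `focus_converter convert` command.
    parser = argparse.ArgumentParser(prog="focus_converter convert")
    parser.add_argument(
        "--basename_template",
        type=str,
        default=None,
        help="Template for output file names; must contain '{i}'.",
    )
    return parser


def writer_kwargs(basename_template=None):
    # Only pass the keyword when the user supplied it, so the
    # default GUID-based naming stays unchanged otherwise.
    kwargs = {}
    if basename_template is not None:
        if "{i}" not in basename_template:
            raise ValueError("basename_template must contain '{i}'")
        kwargs["basename_template"] = basename_template
    return kwargs


def write_table(table, root_path, basename_template=None):
    # Hypothetical wrapper around the call made by the writer process.
    # Imported here so the CLI helpers above work without pyarrow installed.
    import pyarrow.parquet as pq

    pq.write_to_dataset(table, root_path=root_path, **writer_kwargs(basename_template))
```

With this shape, an invocation such as `focus_converter convert --basename_template "focus-{i}.parquet"` (remaining arguments omitted) would be expected to produce files named focus-0.parquet, focus-1.parquet, and so on, while omitting the option keeps the current behaviour.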
Alternatives: The value could instead be supplied through a configuration file or environment variable and parsed if available. This seems unintuitive given the current usage style of the tool.