add mapping between table-names and csv-files

zargot commented 1 month ago

new spec

Read the table-to-file mapping from tables.csv, located in the same directory as the first input file. Inferring the table name from the filename remains valid as a fallback.

filename example:

table,path
measures,a.csv
samples,b.csv

path example:

table,path
measures,./a.csv
samples,../other-tables/b.csv

Also, make sure that an appropriate error is raised if multiple files are specified for the same table.
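
A minimal sketch of how the new spec could be read, not the actual odm-share implementation. The function name `load_table_mapping` is hypothetical, and it assumes that relative paths in tables.csv are resolved against the directory containing tables.csv (the spec above doesn't say this explicitly). A duplicate table raises a `ValueError`.

```python
import csv
from pathlib import Path


def load_table_mapping(first_input: Path) -> dict[str, Path]:
    """Read tables.csv next to the first input file; return table -> file path."""
    # tables.csv lives in the same directory as the first input file.
    base_dir = first_input.resolve().parent
    mapping_file = base_dir / "tables.csv"

    mapping: dict[str, Path] = {}
    with mapping_file.open(newline="") as f:
        # Expects the header row "table,path" shown in the examples above.
        for row in csv.DictReader(f):
            table = row["table"].strip()
            # Assumption: relative paths are relative to the tables.csv directory.
            path = (base_dir / row["path"].strip()).resolve()
            if table in mapping:
                raise ValueError(
                    f"table {table!r} is mapped to multiple files: "
                    f"{mapping[table]} and {path}"
                )
            mapping[table] = path
    return mapping
```

Tables that are missing from tables.csv could then fall back to the old filename-based inference.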

old spec

requirements:

table names and filenames should be grouped in pairs for ease of association.
a single --input parameter for one pair is better than a pair of two separate parameters like --table & --file.
--input can have the short form -i.
the delimiter between table and file should not be = or : since they are commonly used between parameter and argument. The most intuitive alternative is probably ,.
the old way of inferring table name from filename should still be used as a fallback.

ex: odm-share schema.csv -i measures,a.csv -i samples,b.csv
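
For reference, the old spec's requirements could be sketched with argparse roughly as follows. This is only an illustration: `parse_input` and `build_parser` are hypothetical names, and the real odm-share CLI may be built differently. An argument without a comma falls back to inferring the table name from the filename.

```python
import argparse
from pathlib import Path


def parse_input(arg: str) -> tuple[str, Path]:
    """Split 'TABLE,FILE' into a pair; infer the table from the filename if no comma."""
    table, sep, filename = arg.partition(",")
    if not sep:
        # Fallback: no delimiter, so use the filename without its extension.
        path = Path(arg)
        return path.stem, path
    if not table or not filename:
        raise argparse.ArgumentTypeError(f"expected TABLE,FILE but got {arg!r}")
    return table, Path(filename)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="odm-share")
    parser.add_argument("schema")
    # One repeatable --input/-i parameter per table/file pair.
    parser.add_argument(
        "-i", "--input",
        action="append",
        type=parse_input,
        default=[],
        metavar="TABLE,FILE",
    )
    return parser


args = build_parser().parse_args(
    ["schema.csv", "-i", "measures,a.csv", "-i", "samples,b.csv"]
)
# args.input == [("measures", Path("a.csv")), ("samples", Path("b.csv"))]
```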

zargot commented 1 month ago

instead of the above spec, we'll be adding a --tables=a,b,c parameter

zargot commented 1 month ago

the --tables param doesn't work well when specifying a directory as input (with a wildcard, as in mydir/*.csv). Imagine having 10 CSV files, having to guess their order (it's alphabetical, but still), and then writing out all the table names. It's not very practical. One solution is a mapping file like tables.csv that specifies the relationships between tables and their filenames.

zargot commented 1 month ago

--tables should probably be scrapped since it's not practical to specify 10 tables every time you run the command (especially since we can't count on users knowing about terminal history). tables.csv should be enough, in addition to inferring table from filename.

Big-Life-Lab / PHES-ODM-sharing

add mapping between table-names and csv-files #56
