DeepBlueCLtd / tac_store

data store
Apache License 2.0

Default values from config files #22

Open IanMayo opened 5 years ago

IanMayo commented 5 years ago

As a data scientist I want to provide config files of default values in order to compensate for missing fields in data-files

I discussed with a scientist how we can maintain referential integrity in the database if we have missing fields.

The customer's favoured solution is for a file of default values to be provided.

When the import process runs, if there is missing data, it will be reported back to the user. The user will add rows to the config file, and then re-run the import process.

Ideally the import won't terminate at the first problem, but will collect every error it finds and report them together, so the user can fix the config file in one pass.
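To illustrate building up a series of errors rather than stopping at the first one, here is a minimal sketch. The function name `find_missing_fields`, the field names and the sample rows are all hypothetical, and the real importer will likely work on database rows rather than plain dicts.

```python
def find_missing_fields(rows, required_fields):
    """Return (row_number, field) for every missing value, without stopping early."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for field in required_fields:
            # Treat an absent key and an empty string as "missing".
            if not row.get(field):
                problems.append((row_number, field))
    return problems


# Hypothetical sample data, for illustration only.
rows = [
    {"platform": "FRIGATE_A", "nationality": "UK"},
    {"platform": "FRIGATE_B", "nationality": ""},
    {"platform": "", "nationality": ""},
]

problems = find_missing_fields(rows, ["platform", "nationality"])
for row_number, field in problems:
    print(f"row {row_number}: missing {field}")
```

The user would then get the full list of gaps in one report, and could add a default for each before re-running the import.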

As time goes by, less and less data will be missing from the database, and less effort will be required for the config file.

The customer and I did discuss the benefits of .csv as config file format, since Excel can then act as a structured editor.

However, Python supports more structured data (including default values) through its configparser module, which reads files similar to Microsoft Windows .ini files: https://docs.python.org/3/library/configparser.html#module-configparser
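As a rough sketch of what the .ini approach could look like: configparser treats a `[DEFAULT]` section as a fallback for every other section, which maps quite naturally onto "default values for missing fields". The section and field names below are assumptions made up for illustration, not an agreed schema.

```python
import configparser

# Hypothetical config content; section and field names are illustrative only.
SAMPLE_INI = """
[DEFAULT]
nationality = UNKNOWN

[platform]
platform_type = UNKNOWN

[sensor]
sensor_type = UNKNOWN
nationality = UK
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_INI)

# Each section inherits keys from [DEFAULT] unless it overrides them.
platform_defaults = dict(parser["platform"])
sensor_defaults = dict(parser["sensor"])

print(platform_defaults)
print(sensor_defaults)
```

Here the `platform` section picks up `nationality` from `[DEFAULT]`, while `sensor` overrides it with its own value.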

So, I propose that our importer is modified to take an import parameter that is the name of the config file to parse. If a name isn't provided, it will look for an import_config.ini file and process that.
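A possible shape for that parameter, sketched with argparse. The `--config` flag name and the helper functions are assumptions; only the `import_config.ini` fallback name comes from the proposal above. Returning an empty dict when the file is absent is also just one option, since the importer might prefer to warn or fail instead.

```python
import argparse
import configparser

DEFAULT_CONFIG_NAME = "import_config.ini"


def build_arg_parser():
    """Build the command-line parser (the --config flag name is hypothetical)."""
    parser = argparse.ArgumentParser(description="Import data files")
    parser.add_argument(
        "--config",
        default=DEFAULT_CONFIG_NAME,
        help="config file of default values (default: %(default)s)",
    )
    return parser


def load_defaults(config_path):
    """Return {section: {field: value}}, or {} if the file cannot be read."""
    config = configparser.ConfigParser()
    # ConfigParser.read() silently skips missing files and returns
    # the list of files it managed to parse.
    if not config.read(config_path):
        return {}
    return {section: dict(config[section]) for section in config.sections()}


# Parse an explicit (empty) argument list so the fallback name is used.
args = build_arg_parser().parse_args([])
defaults = load_defaults(args.config)
print(args.config)
```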

We could simulate the presence of a config file by allowing default values to be provided either in the command line, or hard-coded into the script.
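For the interim option, something like the following might do: hard-coded defaults in the script, overridable from the command line. The repeatable `--default FIELD=VALUE` flag and the contents of `HARD_CODED_DEFAULTS` are assumptions for illustration, not part of the current importer.

```python
import argparse

# Hypothetical hard-coded fallbacks, used when nothing else is supplied.
HARD_CODED_DEFAULTS = {"nationality": "UNKNOWN"}


def parse_defaults(argv):
    """Merge hard-coded defaults with any --default FIELD=VALUE arguments."""
    parser = argparse.ArgumentParser(description="Import data files")
    parser.add_argument(
        "--default",
        action="append",
        default=[],
        metavar="FIELD=VALUE",
        help="default value for a missing field (may be repeated)",
    )
    args = parser.parse_args(argv)

    defaults = dict(HARD_CODED_DEFAULTS)
    for item in args.default:
        field, sep, value = item.partition("=")
        if not sep:
            parser.error(f"expected FIELD=VALUE, got {item!r}")
        # Command-line values take precedence over hard-coded ones.
        defaults[field.strip()] = value.strip()
    return defaults


print(parse_defaults(["--default", "nationality=UK"]))
```

Once the config file exists, the same merged dictionary could be populated from it instead, so the rest of the importer would not need to change.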