ki-tools / kitools-py

Tools for working with data in Ki analyses
Apache License 2.0

data_load_csv and data_save_csv #4

Open hafen opened 5 years ago

hafen commented 5 years ago

It would be great to implement loading and saving for what I expect is the most commonly used file type, csv. I envision these functions working like this:

data_load_csv([arguments to data_pull], [arguments to read csv])
    data_pull([arguments to data_pull])
    read_csv([arguments to read csv])

data_save_csv([arguments to data_push], [arguments to write csv except file path])
    check local_path to make sure it has .csv extension
    write_csv([arguments to write csv + local_path])
    data_push([arguments to data_push])
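
As a rough illustration of the two wrappers outlined above, here is a minimal Python sketch. It does not use the real kitools API: `pull` and `push` are hypothetical callables standing in for `data_pull` and `data_push`, `pull` is assumed to return the local path of the downloaded file, and the standard-library `csv` module stands in for whatever csv reader and writer the package ends up using. The save wrapper finishes by calling the push function once the file is written.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def data_load_csv(pull: Callable[..., str], *pull_args, **csv_kwargs) -> List[Row]:
    """Pull a file, then parse it as csv.

    `pull` is a hypothetical stand-in for data_pull and is assumed to
    return the local path of the file it fetched.
    """
    local_path = pull(*pull_args)
    with open(local_path, newline="") as handle:
        return list(csv.DictReader(handle, **csv_kwargs))


def data_save_csv(rows: Iterable[Row], local_path: str,
                  push: Callable[..., object], *push_args, **csv_kwargs):
    """Write rows to a local csv file, then push that file.

    `push` is a hypothetical stand-in for data_push and is assumed to
    take the local path as its first argument.
    """
    path = Path(local_path)
    # Check the extension before anything is written or pushed.
    if path.suffix.lower() != ".csv":
        raise ValueError(f"local_path must end in .csv, got {local_path!r}")

    rows = list(rows)
    if not rows:
        raise ValueError("no rows to write")

    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()), **csv_kwargs)
        writer.writeheader()
        writer.writerows(rows)

    # Only push once the file has been written successfully.
    return push(str(path), *push_args)
```

Passing the pull and push functions in as arguments keeps the sketch independent of any particular provider; in the package itself they would presumably be called directly.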

There are some good reasons to add these:

  1. Convenience for the user: they don't have to save the file and push it as separate steps, or worry about whether the destination path is correct.
  2. The convenience will help push users toward using csv whenever possible, a good standard to encourage.
  3. We can add logic into the save function that automatically computes metadata about the data, since we will know the data in this case is in tabular form. For example, we can compute number of records, summary stats for each of the variables, etc., and push this metadata to the provider, which is an important goal of this package.
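
To make the third point concrete, here is a small sketch of the kind of metadata that could be computed from tabular data before pushing. The function name `summarize_table` and the shape of the returned dictionary are assumptions for illustration only; nothing here reflects an existing kitools function or a provider's metadata format. Rows are assumed to be dictionaries of strings, as produced by a csv reader.

```python
from statistics import mean
from typing import Dict, List


def summarize_table(rows: List[Dict[str, str]]) -> dict:
    """Compute record count and simple per-variable summary statistics.

    Hypothetical helper: the output layout is an assumption, not a
    format defined by kitools or any provider.
    """
    columns = list(rows[0].keys()) if rows else []
    summary = {"n_records": len(rows), "variables": {}}

    for name in columns:
        # Treat empty strings and None as missing values.
        values = [row[name] for row in rows if row.get(name) not in (None, "")]
        stats = {"n_missing": len(rows) - len(values)}

        # A variable counts as numeric only if every present value parses as a float.
        try:
            numbers = [float(v) for v in values]
        except (ValueError, TypeError):
            numbers = None

        if numbers:
            stats.update(type="numeric", min=min(numbers),
                         max=max(numbers), mean=mean(numbers))
        else:
            stats.update(type="text", n_unique=len(set(values)))

        summary["variables"][name] = stats

    return summary
```

A save function could call something like this on the data just before writing the file and send the result to the provider alongside it.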

It seems that these would be relatively straightforward to implement.