ki-tools / kitools-py

Tools for working with data in Ki analyses
Apache License 2.0

Add data_path convenience function #8

Closed. hafen closed this issue 5 years ago

hafen commented 5 years ago

Since we are not implementing a generic data_save(), I think it would be good to have a utility function, something like project.data_path(name, data_type = "derived"), that helps users save files in the correct place so that data_push() is happy. It would take an input, say name = "mydata.csv", and return the string _path_to_project_/data/derived/mydata.csv. This way users can write their own code to save data however they like, for example:

path = project.data_path("optional/subdirs/mydata.csv", data_type = "discovered")
my_data_save_func(mydata, path = path)
project.data_push(path)
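
For illustration only, here is a minimal sketch of what such a helper could look like as a standalone function. It is not the kitools implementation: the function name, the `project_root` parameter, and the validation are assumptions; only the `data/<data_type>/<name>` layout and the three data types come from this issue.

```python
import os

# Data types mentioned in this issue.
DATA_TYPES = ("core", "discovered", "derived")


def data_path(project_root, name, data_type="derived"):
    """Hypothetical helper: return <project_root>/data/<data_type>/<name>.

    `name` may include subdirectories, e.g. "optional/subdirs/mydata.csv".
    Nothing is created on disk; the function only builds the path string.
    """
    if data_type not in DATA_TYPES:
        raise ValueError(
            "data_type must be one of {}, got {!r}".format(DATA_TYPES, data_type)
        )
    if os.path.isabs(name):
        # An absolute name would make os.path.join discard the project root.
        raise ValueError("name must be relative to the data directory")
    return os.path.join(project_root, "data", data_type, name)


path = data_path("/tmp/testproj", "optional/subdirs/mydata.csv",
                 data_type="discovered")
```

The default of "derived" follows the reasoning in this comment; a caller would still need to create any missing parent directories before saving to the returned path.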

Note that the default should be "derived" since this is the most common use case. It's extremely rare (probably only in the case of the data curator) that someone would push a core dataset.

Also, note that the data_type should always be implied by the path itself. If data_push ensures that the path is under project/data/core, project/data/discovered, or project/data/derived, then we don't need to ask the user for data_type.
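
As a rough sketch of that check, the data type could be read off the first path component under the project's data directory. The function name and the `project_root` argument below are hypothetical, and this is not how data_push is actually implemented; the comparison is purely lexical and does not touch the filesystem.

```python
from pathlib import Path

# Data types mentioned in this issue.
DATA_TYPES = ("core", "discovered", "derived")


def infer_data_type(project_root, path):
    """Hypothetical check: return the data type implied by `path`.

    Raises ValueError if `path` is not a file path under
    <project_root>/data/<core|discovered|derived>/.
    """
    data_root = Path(project_root, "data")
    try:
        relative = Path(path).relative_to(data_root)
    except ValueError:
        raise ValueError("{} is not under {}".format(path, data_root)) from None
    parts = relative.parts
    # Require a known data type directory plus at least one more component.
    if len(parts) < 2 or parts[0] not in DATA_TYPES:
        raise ValueError(
            "{} is not inside one of the {} directories".format(path, DATA_TYPES)
        )
    return parts[0]


inferred = infer_data_type(
    "/tmp/testproj", "/tmp/testproj/data/derived/subdir/mydata.csv"
)
```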

pcstout commented 5 years ago

@hafen Is this issue still valid? Do we need this new method?

hafen commented 5 years ago

Right now it looks like there is a class member data_path that holds the data path as a string. I think that is convenient enough for users to construct paths with. What do you think?

pcstout commented 5 years ago

There is another method (although its name is a bit long) that will give you the full path for a data type:

>>> p.data_type_to_project_path('core')
'/tmp/testproj/data/core'

We can add this new method if you think it will be used. Or we can close this issue and if users start asking for something similar we can add it.

pcstout commented 5 years ago

Dupe: https://github.com/ki-tools/kitools-py/issues/23