Closed by hafen 5 years ago
@hafen Is this issue still valid? Do we need this new method?
Right now it looks like there is a class member, data_path, that holds the data path as a string. I think that is convenient enough for users to construct paths with. What do you think?
There is another method (although a bit long) that will give you the full path for a data type:
>>> p.data_type_to_project_path('core')
'/tmp/testproj/data/core'
We can add this new method if you think it will be used. Alternatively, we can close this issue and add the method later if users start asking for something similar.
To help users save files in the correct place, since we are not implementing a generic data_save(), I'm thinking it would be good to have a utility function, something like project.data_path(name, data_type = "derived"), as a helper so that data_push() is happy. It would take an input, say name = "mydata.csv", and return a string _path_to_project_/data/derived/mydata.csv. This way users can write their own code to save data however they like, with

Note that the default should be "derived" since this is the most common use case. It's extremely rare (probably only in the case of the data curator) that someone would push a core dataset.
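For discussion, here is a minimal sketch of what such a helper might look like. It is not the project's actual implementation: the standalone function, the project_root argument, and the DATA_TYPES tuple are all hypothetical, and the set of valid data types is assumed from the directories mentioned in this thread.

```python
import os

# Data types mentioned in this thread; treating them as the complete set is an assumption.
DATA_TYPES = ("core", "discovered", "derived")


def data_path(project_root, name, data_type="derived"):
    """Return the path under <project_root>/data/<data_type>/ for a file name.

    Hypothetical helper: `project_root` stands in for however the project
    object actually stores its root directory.
    """
    if data_type not in DATA_TYPES:
        raise ValueError(
            "data_type must be one of %s, got %r" % (", ".join(DATA_TYPES), data_type)
        )
    # Only builds the path string; it does not create directories or write files.
    return os.path.join(project_root, "data", data_type, name)
```

A user could then pass the returned string to whatever writer they prefer, for example opening data_path(root, "mydata.csv") for writing, and the file would land where data_push() expects it.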
Also, note that the data_type should always be inherent in the path. If data_push ensures that the path is in project/data/core, project/data/discovered, or project/data/derived, then we don't need to ask the user for data_type.
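As a rough illustration of that idea, the check could look something like the sketch below. The function name infer_data_type and its arguments are hypothetical, and this is only one way data_push might recover the data type from a path, not a description of how it currently works.

```python
from pathlib import Path

# Data types mentioned in this thread; treating them as the complete set is an assumption.
DATA_TYPES = ("core", "discovered", "derived")


def infer_data_type(project_root, path):
    """Return the data type implied by a file path inside <project_root>/data/.

    Hypothetical helper: raises ValueError when the path is not a file
    location under one of the known data type directories.
    """
    data_dir = Path(project_root).resolve() / "data"
    target = Path(path).resolve()
    try:
        relative = target.relative_to(data_dir)
    except ValueError:
        raise ValueError("%s is not inside %s" % (target, data_dir)) from None
    # Expect at least <data_type>/<file name> below the data directory.
    if len(relative.parts) < 2 or relative.parts[0] not in DATA_TYPES:
        raise ValueError("%s is not under a known data type directory" % target)
    return relative.parts[0]
```

With a check like this, a path produced by a path helper already carries its data type, so the user never has to state it twice.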