Open eeholmes opened 2 years ago
- Move DataDownload into contributors?
Since contributors is the sandbox and DataDownload was/is a group sandbox, maybe we move it to contributors for now? I have some things in DataDownload to clean up still but not going to do that immediately.
I think so, too.
- Where should we have shared data?
As suggested in #6, I think a ./data directory would be a fine place for that purpose.
The following are types of data that would be good to have in a shared space. None of these are huge.
- Shared geojson files for different regions of interest.
The example given in #6 was exactly on ROIs.
- ATL03 data (in some not too huge but easy to read-in format) for good test cases. Those working on photon classification algorithms just need a set of ATL03 data to work on, and it would be good to have a consistent set of test cases. I don't know what a good format is. I think it'll be a geopandas dataframe that we want to save? Or at least a pandas dataframe.
I think it makes sense to have a common place for things that are likely to be reused multiple times, and subsets for test cases are such an example. Other pieces of data specific to some example could reside, for example, in ./examples/data.
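To make the shared ROI idea a bit more concrete, here is a minimal sketch of writing and reading a GeoJSON region of interest under a ./data directory using only the Python standard library. The file name roi_example.geojson, the helper names save_roi and load_roi_bounds, and the rectangle coordinates are all made up for illustration; nothing here is an agreed layout for the repo.

```python
import json
import tempfile
from pathlib import Path


def save_roi(path, name, ring):
    """Write one polygon ROI as a GeoJSON FeatureCollection (hypothetical helper)."""
    feature = {
        "type": "Feature",
        "properties": {"name": name},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
    collection = {"type": "FeatureCollection", "features": [feature]}
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(collection, indent=2))
    return path


def load_roi_bounds(path):
    """Return (min_lon, min_lat, max_lon, max_lat) of the first feature's outer ring."""
    collection = json.loads(Path(path).read_text())
    ring = collection["features"][0]["geometry"]["coordinates"][0]
    lons = [pt[0] for pt in ring]
    lats = [pt[1] for pt in ring]
    return (min(lons), min(lats), max(lons), max(lats))


# Made-up rectangle in (lon, lat) order; the ring is closed (first point == last point).
example_ring = [
    [-50.0, 68.0],
    [-48.0, 68.0],
    [-48.0, 69.0],
    [-50.0, 69.0],
    [-50.0, 68.0],
]

# A temporary directory stands in for the repository root in this sketch.
with tempfile.TemporaryDirectory() as repo_root:
    roi_path = save_roi(
        Path(repo_root) / "data" / "roi_example.geojson", "example", example_ring
    )
    bounds = load_roi_bounds(roi_path)
```

Since GeoJSON is plain JSON, any notebook can read such a file without extra dependencies, which is one reason it may suit small shared ROIs.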
I am also not sure about the best format. Perhaps something based on HDF? I don't know what's more typical in the ICESat world. Note that (geo)pandas dataframes are not file formats; they are data structures that only exist while code is running. I guess they could be dumped in binary form (e.g., to a pickle), but I think that would not be very portable.
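To illustrate the dataframe-versus-file distinction, here is a small sketch that writes a pandas dataframe to disk and reads it back. CSV is used only because it needs nothing beyond pandas; it is not a proposal for the final format. An HDF-based file would follow the same pattern but needs an extra library (pandas' `to_hdf`, for instance, requires PyTables). The column names and values are invented and are not ATL03 variable names.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up photon records; column names are illustrative only.
photons = pd.DataFrame(
    {
        "lat": [68.10, 68.11, 68.12],
        "lon": [-49.50, -49.51, -49.52],
        "height_m": [1012.4, 1012.9, 1011.7],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    # The dataframe only lives in memory; the file on disk is what gets shared.
    csv_path = Path(tmp) / "atl03_subset.csv"
    photons.to_csv(csv_path, index=False)

    # Anyone with the file can rebuild an equivalent dataframe.
    restored = pd.read_csv(csv_path)
```

Unlike a pickle, a file like this can be opened from other languages and tools, at the cost of losing richer metadata such as geometry columns or a coordinate reference system.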
I can do the refactoring of DataDownload and linked notebooks later this week or over the weekend.