add GDP 6-hourly dataset to the list
@milancurcic let's continue to do this using the list above and add the GDP 6-hourly dataset.
I don't think the GLAD and LASER experiments by themselves are really interesting/useful. I would instead include this dataset from J. Lilly, which groups together all the experiments in the GoM.
https://zenodo.org/record/4421585
GulfDrifters is already a ragged-array dataset, so I think it's a good candidate for a dataset accessor function (i.e. `clouddrift.datasets.gulfdrifters()`). I don't think it needs an adapter.
However, GulfDrifters can't replace GLAD, LASER, SPLASH, etc. Those specific datasets are available at 15-minute (quality-controlled) or 5-minute (raw) resolution, and for some processes hourly is too coarse.
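For illustration, here is a minimal sketch of what such a `gulfdrifters()` accessor could look like, assuming the file is fetched directly from the Zenodo record and cached locally; the file name, cache path, and function signature are placeholder assumptions, not an existing clouddrift API.

```python
# Hypothetical accessor sketch; the Zenodo file name and cache location below
# are assumptions, not the actual clouddrift.datasets API.
import urllib.request
from pathlib import Path

import xarray as xr

# Assumed file name on the Zenodo record (check the record for the real one).
GULFDRIFTERS_URL = "https://zenodo.org/record/4421585/files/GulfDriftersOpen.nc"


def gulfdrifters(cache_dir: str = "~/.clouddrift") -> xr.Dataset:
    """Download the GulfDrifters ragged-array file once and open it."""
    path = Path(cache_dir).expanduser() / "gulfdrifters.nc"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(GULFDRIFTERS_URL, path)
    return xr.open_dataset(path)
```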
I understand, but there are like 50 of those experiments. 😆 If you want to do it, go for it; I'm just thinking that having one code example could be enough.
I don't see adapters as examples of how to do it. Instead, they're functions that create cloud-ready ragged-array versions of these datasets, which could then be made accessible via `clouddrift.datasets`. Providing easy access to ragged-array datasets via a single function call would be one of the ways to attract users. To illustrate: a future hypothetical user may search for "how to load GLAD data in Python", and the top result would point to a CloudDrift function.
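To sketch the distinction (hypothetical names and file layout, not an existing clouddrift function): an adapter would gather the per-drifter tables, concatenate them into a single contiguous ragged-array `xarray.Dataset` with a `rowsize` variable, and write it to a cloud-ready store that a matching `clouddrift.datasets` function could then open with one call.

```python
# Hedged adapter sketch; the column names (lon, lat, time) and the output
# store are illustrative assumptions.
import numpy as np
import pandas as pd
import xarray as xr


def to_raggedarray(per_drifter_frames: list[pd.DataFrame]) -> xr.Dataset:
    """Combine per-drifter tables into one contiguous ragged-array Dataset."""
    rowsize = np.array([len(df) for df in per_drifter_frames], dtype="int64")
    obs = pd.concat(per_drifter_frames, ignore_index=True)
    return xr.Dataset(
        data_vars={
            "rowsize": ("traj", rowsize),
            "lon": ("obs", obs["lon"].to_numpy()),
            "lat": ("obs", obs["lat"].to_numpy()),
        },
        coords={"time": ("obs", pd.to_datetime(obs["time"]).to_numpy())},
    )


# An adapter for, say, GLAD would download the raw files, call
# to_raggedarray(), and write the result to a cloud-ready store, e.g.
# ds.to_zarr("glad.zarr"); a clouddrift.datasets function would then
# simply open that store.
```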
Sorry, I was thinking of clouddrift-examples.
If the data is on AWS/GCP, or already ragged, I guess it's simple to add it to `datasets`. On the other hand, if it's a collection of dozens of randomly named CSV files, I wouldn't want to hardcode anything that could eventually break the library. That's why I started the examples, so people can just use the `RaggedArray` class if they want to.
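And a short sketch of that user-side route, assuming a local folder of arbitrarily named per-drifter CSV files with lon/lat/time columns (the file pattern and column names are assumptions):

```python
# Sketch of the "build it yourself" route; in practice clouddrift's
# RaggedArray class would do the ragged-array construction, so this only
# shows the input a user would assemble from their own files.
from glob import glob

import pandas as pd

# One DataFrame per trajectory, whatever the files happen to be named.
frames = [pd.read_csv(path) for path in sorted(glob("my_experiment/*.csv"))]
# From here, the ragged array is built as in the adapter sketch above:
# a rowsize per trajectory plus the concatenated observations.
```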
Great job on the MOSAiC dataset. Let's continue down the list.
The missing datasets have each been created as a separate issue.
Some are already implemented in clouddrift-examples.