NASA-Openscapes / earthdata-cloud-cookbook

A tutorial book of workflows for research using NASA EarthData in the Cloud created by the NASA-Openscapes team
https://nasa-openscapes.github.io/earthdata-cloud-cookbook

Refine "How Do I..." section for R #318

Open ateucher opened 6 months ago

ateucher commented 6 months ago

The "find data" section needs updating, and perhaps we can extend it by adding R content to the "access data in the cloud/locally" section, as well as to the read and subset data sections. After chatting with @cboettig, I think this can largely (entirely?) be done without reticulate/earthaccess, which I think is a friction point for R users.

Related to searching: https://github.com/boettiger-lab/earthdatalogin/issues/10 and reading/subsetting: https://github.com/boettiger-lab/earthdatalogin/issues/9

ateucher commented 6 months ago

@cboettig I know you're not a huge fan of emphasizing edl_search(), since your view is that using the STAC API with rstac is a better general approach for searching for and accessing this kind of data (I think I have that right?). That said, it very nicely parallels earthaccess.search_data() in Python, so I do think it's worth covering in the "How do I find data in R" section, in addition to rstac. Does it need more work before you want to submit to CRAN with it included? How do we feel about including development versions of packages in the cookbook?

cboettig commented 6 months ago

Yup, :100:. I'm not against documenting the edl_search() route; the difficulty is that I think my edl_search() needs improvement before it's even a halfway decent user experience. I don't really understand the CMR API very well, and I think the way I've written edl_search() will fail in cases where it ought to work (e.g. DOI searches, I think?) only because I haven't understood the API properly.

One thing that is probably worth documenting, though, is that users can just use https://search.earthdata.nasa.gov/ and get the URLs directly from there. After running edl_netrc(), those URLs should "just work" like any other URL.

(This also echoes my philosophy that earthdatalogin is 'just authentication', NASA data is 'just data', and cloud access is 'just URLs'; there are no black boxes and no magic.)
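
As a rough illustration of the Earthdata Search route described above, here is a minimal sketch in R. It assumes the earthdatalogin and terra packages are installed, that Earthdata Login credentials are available to edl_netrc(), and that the URL points to a raster format GDAL can read remotely (for example a cloud-optimized GeoTIFF). The helper name open_earthdata_raster() is hypothetical and not part of either package.

```r
# Sketch only: open_earthdata_raster() is a hypothetical helper,
# not a function provided by earthdatalogin or terra.
open_earthdata_raster <- function(url) {
  # Write Earthdata Login credentials to a .netrc file so that
  # subsequent HTTPS requests to NASA hosts are authenticated.
  earthdatalogin::edl_netrc()

  # Open the remote file lazily through GDAL's virtual file system;
  # assumes the URL points to a GDAL-readable raster.
  terra::rast(url, vsi = TRUE)
}

# Usage (replace the placeholder with a URL copied from Earthdata Search):
# r <- open_earthdata_raster("<URL copied from Earthdata Search>")
```

Nothing here is specific to the search step: any URL obtained from Earthdata Search, rstac, or elsewhere would be passed in the same way, which is the "just URLs" point made above.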