lter / scicomp

LTER Scientific Computing ("SciComp") team website
https://lter.github.io/scicomp/
Creative Commons Zero v1.0 Universal

[Sci Comp] Develop R Function to Create User-Info JSON #12

Open njlyon0 opened 4 hours ago

njlyon0 commented 4 hours ago

Summary

Several groups have used local "syncs" of cloud storage systems (e.g., Box, Dropbox) to store their data. This is not necessarily a problem, but it does mean that relative file paths can't be used (or at least not for data ingestion). We've stumbled upon the idea of using a JSON file to store user-specific information (including the absolute file path to the data and/or emails for simpler googledrive authentication), which seems like it'd work great in this context. However, so far we're prompting folx to build those JSON files manually, which opens the door to user error (and is itself a less-reproducible step). If we could develop a function to flexibly create a JSON file with the desired contents, that would be a really useful tool for some groups.

Starting Tasks

Useful links

njlyon0 commented 4 hours ago

Dream

Ideally, we'd like to end up with something like the following:

contents <- c("path_to_data" = "users/me/documents/working_group", "email" = "me@gmail.com")

write_json(x = contents, file = "user.json")

njlyon0 commented 3 hours ago

Substantial Progress!

Figured out how to create a JSON file with custom contents from a named vector (uses RJSONIO::toJSON to do the transformation). Also developed a (simple) function variant and added the intuitive error checks (and unit tests). Pushed to the ltertools repo (in the dev/ directory), so I'll wait to see whether the GitHub Actions (GHAs) complete without problems and then work on integrating it into the package build.
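For readers following along, here is a minimal sketch of what such a function could look like. It is not the code that was pushed to ltertools: the function name `make_user_json`, its argument names, and the specific error checks are assumptions made for illustration. The only piece taken from the comment above is the use of RJSONIO::toJSON on a named vector.

```r
# Hypothetical sketch: `make_user_json` and its arguments are illustrative
# names, not the function that was pushed to ltertools.
make_user_json <- function(contents, file = "user.json") {

  # Error check: every element needs a name to act as its JSON key
  if (is.null(names(contents)) || anyNA(names(contents)) ||
      any(names(contents) == "")) {
    stop("`contents` must be a vector with a name for every element")
  }

  # Error check: the output must be a single path ending in '.json'
  if (!is.character(file) || length(file) != 1 ||
      tolower(tools::file_ext(file)) != "json") {
    stop("`file` must be a single file path ending in '.json'")
  }

  # RJSONIO turns a named vector into a JSON object (one key per name)
  json_text <- RJSONIO::toJSON(x = contents)

  # Write the JSON text to disk and return the path invisibly
  writeLines(text = json_text, con = file)
  invisible(file)
}

# Usage: write to a temporary folder, then read the values back in a script
json_path <- file.path(tempdir(), "user.json")
make_user_json(contents = c("path_to_data" = "users/me/documents/working_group",
                            "email" = "me@gmail.com"),
               file = json_path)
user_info <- RJSONIO::fromJSON(content = json_path)
data_folder <- user_info[["path_to_data"]]
```

Each collaborator would then keep their own copy of the JSON, and shared scripts would only ever refer to `user_info[["path_to_data"]]` rather than to a hard-coded absolute path.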

Note too that it might be nice to expand the function to optionally add the file it creates to the .gitignore (if one can be found in the directory to which the JSON is exported). That would be a simple argument, but I'm less sure how simple it would be to edit the .gitignore in place.
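One possible way to handle the .gitignore step, sketched in base R only. The helper name `add_to_gitignore` and its arguments are hypothetical, and the approach shown (read the whole file, append the entry, rewrite the file) is just one option that sidesteps true in-place editing; it assumes the .gitignore is small enough to read into memory, which is usually the case.

```r
# Hypothetical helper: name and arguments are illustrative only.
# Adds `file`'s name to a .gitignore in `dir`, but only if one already exists.
add_to_gitignore <- function(file, dir = dirname(file)) {

  ignore_path <- file.path(dir, ".gitignore")

  # Do nothing if there is no .gitignore in the export directory
  if (!file.exists(ignore_path)) {
    return(invisible(FALSE))
  }

  entry <- basename(file)
  existing <- readLines(con = ignore_path, warn = FALSE)

  # Skip if the file is already listed (ignoring stray whitespace)
  if (entry %in% trimws(existing)) {
    return(invisible(FALSE))
  }

  # Rewrite the file with the new entry as the last line; this also copes
  # with a .gitignore that lacks a trailing newline
  writeLines(text = c(existing, entry), con = ignore_path)
  invisible(TRUE)
}
```

The function returns `TRUE` (invisibly) only when it actually changed the .gitignore, so a wrapper could use that to message the user about what happened.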

mbjones commented 40 minutes ago

@njlyon0 I agree that path management is a pain. Are you familiar with the here package, which seems quite similar to the approach you are taking? https://here.r-lib.org/. It even comes with a great graphic from Allison Horst:

Also, we've written up a few thoughts on this issue of referencing data files in scripts, with EDI datasets as an example for Delta, using both the pins and contentid packages to make things much more portable and reproducible. Hope it's helpful:

Reproducible Data Access module