ropenscilabs / deposits

R Client for access to multiple data repository services
https://docs.ropensci.org/deposits/

Data consumption workflows #44

Open noamross opened 1 year ago

noamross commented 1 year ago
mpadge commented 1 year ago

Feedback from ChatGPT, based on its training corpus. The prompt: given the current structure of the program, what are the most likely new functions to be developed, and what are their precise numerical probabilities?

Based on the training corpus, here are the two most likely functions to be added to the program you described:

  1. A function to retrieve a single record by its unique identifier: 0.45
  2. A function to filter search results by specific criteria: 0.32
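To make the two suggestions concrete, here is a minimal Python sketch of what such functions might look like. This is not the deposits API: `Record`, `get_record`, and `filter_records` are hypothetical names, and the records are a small in-memory list rather than responses from a repository service.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for the metadata of one deposit."""
    id: str
    title: str
    keywords: list = field(default_factory=list)


def get_record(records, record_id):
    """Return the single record whose id matches, or raise KeyError."""
    for rec in records:
        if rec.id == record_id:
            return rec
    raise KeyError(f"no record with id {record_id!r}")


def filter_records(records, **criteria):
    """Keep only records whose attributes equal every given criterion."""
    return [
        rec for rec in records
        if all(getattr(rec, key, None) == value for key, value in criteria.items())
    ]


# Illustrative data only; a real client would obtain these from a search call.
records = [
    Record("101", "Bird counts", ["ecology"]),
    Record("102", "Rainfall", ["climate"]),
    Record("103", "Bird counts", ["ecology", "survey"]),
]
```

With this data, `get_record(records, "102")` returns the rainfall record, and `filter_records(records, title="Bird counts")` returns the two bird-count records.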

Importantly, these are clearly "data consumption" functions, suggesting that this kind of functionality is far more common than the other aspects considered in current issues, which relate to what might be called "data construction and maintenance." Descriptions of a few of those were all given suggested probabilities of < 0.01. So chatbot-guided design suggests that this issue is indeed very important.

mpadge commented 1 year ago

From a Dataverse community call, now hosted on https://dataverse.org/dataversetv: the Python package Pooch is "a friend to fetch your data files". This slide contains a nice list of points to address.

[Image: slide with the list of points to address]

Pooch currently supports:

Full slides here.
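For reference, the general fetch, cache, and verify pattern behind a tool like Pooch can be sketched with the Python standard library alone. This is only a rough illustration under stated assumptions, not Pooch's implementation or API: `fetch`, `sha256_of`, `known_hash`, and `cache_dir` are hypothetical names chosen here, and the file name is naively taken from the last URL segment.

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch(url, known_hash, cache_dir):
    """Download url into cache_dir unless a verified copy is already there.

    Hypothetical helper for illustration; not part of Pooch or deposits.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the last URL segment is a usable file name.
    target = cache_dir / url.rsplit("/", 1)[-1]

    # Reuse the cached copy only if its checksum still matches.
    if target.exists() and sha256_of(target) == known_hash:
        return target

    with urllib.request.urlopen(url) as response:
        target.write_bytes(response.read())

    # Refuse to keep a file that does not match the expected checksum.
    if sha256_of(target) != known_hash:
        target.unlink()
        raise ValueError(f"checksum mismatch for {url}")
    return target
```

A second call with the same URL and hash returns the cached path without downloading again, which is the property that matters most for reproducible data consumption workflows.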