Exploratory project to catalog and evaluate datasets. We will determine ways to evaluate data files against the indicators above and offer solutions for increasing their quality. We aim to translate best practices into workflows that help with everyday use cases.
Get initial counts of dataset usage from 3 repositories #7
This task changed to getting monthly counts of datasets from Dryad, Zenodo, and Figshare.
I created four Colab notebooks to do this: one for each of the three repositories and one for DataCite.
https://colab.research.google.com/drive/1AvP0jxZwHL9bUB1IwZ7GP-TIpyNI2edP#scrollTo=4CV9AbKz8yEX
https://colab.research.google.com/drive/1z-5_f5XfTGZmonCwnrpmNjhNtIi9Q5KW#scrollTo=0juDiHEYrA9Z
https://colab.research.google.com/drive/1oNKHjafyMCpfgj8JsDtLrZGQTOTf_N-O#scrollTo=TvBokmBCZo5E
https://colab.research.google.com/drive/1nBTEhBYA-Z8cWQ9zaj5F3MGjkt1hePvJ#scrollTo=caPfrjfoNrmE
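For readers who cannot open the notebooks, here is a minimal sketch of the monthly counting step. It is not the code from the Colabs. It assumes each dataset record has already been fetched and reduced to an ISO 8601 publication date string; the function name `monthly_counts` and the sample dates are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(publication_dates):
    """Count records per calendar month from ISO 8601 date strings.

    Assumption: every string starts with YYYY-MM-DD; anything after
    the first 10 characters (time, timezone) is ignored.
    """
    counts = Counter()
    for raw in publication_dates:
        day = datetime.strptime(raw[:10], "%Y-%m-%d")
        counts[day.strftime("%Y-%m")] += 1
    # Sort by month key so the result reads chronologically.
    return dict(sorted(counts.items()))


# Illustrative dates only, not real repository data.
sample_dates = [
    "2021-01-15",
    "2021-01-30T08:00:00+00:00",
    "2021-02-02",
    "2021-03-11",
    "2021-03-12",
    "2021-03-31",
]
sample_result = monthly_counts(sample_dates)
print(sample_result)
```

Grouping on the `YYYY-MM` prefix keeps the output comparable across repositories, whichever date field each API exposes.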
I found problems in the APIs. The DataCite data also do not line up very well with what the repositories report for similar time periods.
I created a document that explores these issues (work in progress): https://docs.google.com/document/d/1NH_iy0y4HDOstSCuCRHIDbSx3afsp75zo8-UtKvpD8A/edit
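One way to examine the mismatch between DataCite and a repository is to put the two sets of monthly counts side by side. The sketch below assumes both are dictionaries keyed by `YYYY-MM`; the function name `compare_monthly_counts` is hypothetical, and the numbers are invented for illustration, not measured results.

```python
def compare_monthly_counts(repository_counts, datacite_counts):
    """Return (month, repository, datacite, difference) rows.

    A month missing from one source is treated as a count of 0.
    """
    months = sorted(set(repository_counts) | set(datacite_counts))
    rows = []
    for month in months:
        repo = repository_counts.get(month, 0)
        datacite = datacite_counts.get(month, 0)
        rows.append((month, repo, datacite, repo - datacite))
    return rows


# Invented numbers for illustration only.
example_repository = {"2021-01": 120, "2021-02": 95}
example_datacite = {"2021-01": 110, "2021-03": 40}

comparison_rows = compare_monthly_counts(example_repository, example_datacite)
for month, repo, datacite, diff in comparison_rows:
    print(f"{month}  repository={repo}  datacite={datacite}  diff={diff:+d}")
```

A persistent nonzero difference in the same direction would suggest a systematic cause, such as the two sources using different date fields, rather than random noise.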