Open only1chunts opened 6 years ago
see also comments on issue #161
the number of clicks (google analytics) on a dataset page
number of direct downloads from GigaDB,
citations,
any social media mentions.
We should also look into ways of tracking downloads from the FTP server(s) as well as number of times the dataset is discovered by searches.
Perhaps look at DataCite's analytics to see how they compare with Google's etc.
See the DataCite stats page for their analytics: https://stats.datacite.org/
Make Data Count have released version 1:
https://github.com/CDLUC3/Make-Data-Count/blob/master/getting-started.md
there is also a blog about it: https://makedatacount.org/2018/06/05/its-time-to-make-your-data-count/
Make Data Count is now utilising Event Data, so we can pull citation metrics using the API. E.g. see https://api.datacite.org/events?doi=10.5524/100008
As you can see, very few datasets are getting correctly cited, but you might want to display both true citations and the Google Scholar/Europe PMC full-text workarounds, especially for the older citations from before the Data Citation Principles even existed. Event Data and the API let you retrospectively add the missing data citations to the references, so I don't know if updating retrospective metadata is a useful thing to try to automate or curate (student project?).
A new open source approach to tackle this is also CiteAs: http://citeas.org/
Just had a look at Event Data and it looks like we could make use of it. It's a simple API that we can call for "events" involving GigaDB DOIs. It can be called for ALL GigaDB DOIs by the prefix 10.5524:
curl "https://api.eventdata.crossref.org/v1/events?mailto=YOUR_EMAIL_HERE&rows=10000&obj-id.prefix=10.5524" > gigadb.json
The issue with that might be that it will include reviews, which we currently mint DOIs for with the format 10.5524/review.nnnnnn
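One possible way round the review problem is to keep the prefix query and drop the review events afterwards. A minimal sketch, assuming the downloaded gigadb.json wraps its events in a "message" object containing an "events" list and that each event carries the DOI in an "obj_id" field; those field names, the helper names and the sample records are assumptions for illustration and should be checked against a real response:

```python
# Marker taken from the review DOI format 10.5524/review.nnnnnn.
REVIEW_MARKER = "10.5524/review."


def is_review_event(event: dict) -> bool:
    """Return True when the event's object DOI is a review DOI."""
    # "obj_id" is an assumed field name holding the DOI (possibly as a URL).
    return REVIEW_MARKER in event.get("obj_id", "").lower()


def dataset_events(response: dict) -> list:
    """Return only the events that point at dataset DOIs."""
    # The "message" -> "events" nesting is an assumption about the JSON layout.
    events = response.get("message", {}).get("events", [])
    return [event for event in events if not is_review_event(event)]


# Made-up sample in the assumed shape, not real Event Data output.
sample = {
    "message": {
        "events": [
            {"obj_id": "https://doi.org/10.5524/100001", "source_id": "twitter"},
            {"obj_id": "https://doi.org/10.5524/review.100123", "source_id": "twitter"},
            {"obj_id": "https://doi.org/10.5524/100008", "source_id": "wikipedia"},
        ]
    }
}

kept = dataset_events(sample)
print(len(kept), "dataset events kept")
```

For the real file, the response would be loaded with json.load(open("gigadb.json")) and passed to dataset_events in the same way.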
Alternative could be to call each dataset individually:
curl "https://api.eventdata.crossref.org/v1/events?mailto=YOUR_EMAIL_HERE&rows=10000&obj-id=10.5524/100001" > gigadb-100001.json
This might get too heavy unless it's done on demand when someone tries to look at the event data for a particular DOI from the website.
To do that we would need an HTML page that can call and parse the JSON from Event Data, and display it in a useful way.
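A rough sketch of what the on-demand side could look like: build the per-DOI URL, cache the response so repeat page views do not hit the API again, and reduce the events to counts per source for display. The EventCache class, the one-day expiry, the helper names and the "message"/"events"/"source_id" field names are all assumptions for illustration, not the confirmed Event Data schema or anything in GigaDB:

```python
import json
import time
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.eventdata.crossref.org/v1/events"


def build_url(doi: str, mailto: str, rows: int = 10000) -> str:
    """Build the per-dataset query URL (same parameters as the curl call)."""
    query = urlencode({"mailto": mailto, "rows": rows, "obj-id": doi})
    return f"{BASE_URL}?{query}"


def fetch_events(url: str) -> dict:
    """Download and decode the JSON response (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


class EventCache:
    """In-memory cache so each DOI is fetched at most once per expiry period."""

    def __init__(self, fetch=fetch_events, ttl_seconds: int = 86400):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._store = {}  # doi -> (time fetched, payload)

    def get(self, doi: str, mailto: str) -> dict:
        now = time.time()
        hit = self._store.get(doi)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        payload = self._fetch(build_url(doi, mailto))
        self._store[doi] = (now, payload)
        return payload


def summarise_by_source(payload: dict) -> Counter:
    """Count events per source; the field names are assumed, see above."""
    events = payload.get("message", {}).get("events", [])
    return Counter(event.get("source_id", "unknown") for event in events)
```

A dataset page would then call something like summarise_by_source(cache.get("10.5524/100001", "YOUR_EMAIL_HERE")) and render the counts. Passing the fetch function into the cache keeps it easy to test without network access.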
You should check out what the Impact Story people are building on this data. On top of CiteAs that I mentioned (https://github.com/Impactstory/citeas-api) see also PaperBuzz (https://github.com/Impactstory/paperbuzz-api).
Thanks Scott, it might be worth keeping an eye on PaperBuzz, but at present they don't seem to do anything?! It looks like it's meant to list the "hits" but actually gives a blank page, e.g. https://paperbuzz.org/details/10.5524/100001 CiteAs is just a tool to provide the citation of something in a variety of formats, so not what we need for this ticket.
There is now a documented guide to becoming COUNTER compliant: https://www.projectcounter.org/code-practice-research-data/
GBIF have now built a citation widget, which we could use if we also have data in GBIF (one for the future if we manage to integrate). It could also potentially be adapted into a tool for our data (I assume it's DataCite - Event Data): https://www.gbif.org/article/1E6v02SFQyhupvB7JqDXPN/citation-widget
As an author /submitter I want to be able to see the usage stats of my dataset (how many views, downloads, tweets, citations etc..) So that I can show my employer/funder that the work was worthwhile
As a GigaDB admin I want to be able to demonstrate the advantages to open data sharing so that we can encourage more people to do it
As a funder I want to be able to see data that I have funded that has been published and shared so that I can verify that the funding was put to good use
Cobalt Metrics might be another option to look at for how to get hold of metrics for our datasets: https://cobaltmetrics.com/digests It's a paid-for service, but if you believe their website they are better and cheaper than Altmetrics.
Also worth throwing into the mix: PlumX Metrics: https://plumanalytics.com/learn/about-metrics/ NB - This is part of the Elsevier group of companies.
DataCite are looking for Beta testers of their datacite usage tracker tool, it has two parts:
User Story
Multiple user stories for this depending on your point of view:
Additional information
We need to think about improving the metrics for usage of our datasets. The COUNTER Code of Practice for Research Data is addressing how this should be done: https://docs.google.com/document/d/1n1LsS3suFNnnYfqltf3Qjaup0taKu-q54Kico_IHXdY/edit# We should be a part of that project as early adopters.
This is also linked to work on #161 and #14