How to compute "reach" or digital impact

How can we define and compute digital reach/impact? For example, individual datasets or downloads from an aggregator may be cited by papers working directly on that data, which gives a direct and simple metric ("how many times has this dataset been cited?"). But imagine someone does a meta-analysis of various published studies, citing those papers (which themselves cite the aggregator). At present, citing a paper that uses data signals that the paper has "value", but this value does not get transmitted further down the chain to the dataset itself.

For example, I recently discussed this on the GBIF community forum, pointing out that the policy paper A Global Deal For Nature: Guiding principles, milestones, and targets, published in Science Advances https://doi.org/10.1126/sciadv.aaw2869, does not mention GBIF(!), nor does it cite any GBIF data. As Tim Hirsch pointed out, "A Global Deal For Nature" cites at least one paper that in turn cites GBIF data (A global test of ecoregions, Nat. Ecol. Evol. 2, 1889–1896 (2018) https://doi.org/10.1038/s41559-018-0709-x).

One way to tackle this problem is to move beyond simply counting citations, and instead build the citation graph (e.g., add papers that cite data, then add papers that cite papers that cite data, and so on). Then we could use a measure such as PageRank to compute the impact of a GBIF dataset (or a dataset hosted by any other aggregator). PageRank is the basis of the original Google search engine (the "Page" in PageRank is Larry Page). For an example of using PageRank to measure impact, see The Pagerank-Index: Going beyond Citation Counts in Quantifying Scientific Impact of Researchers https://doi.org/10.1371/journal.pone.0134794.
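To make the idea concrete, here is a minimal sketch of PageRank computed by power iteration over a tiny citation graph. The node names, the second dataset, the damping factor of 0.85, and the uniform handling of nodes that cite nothing are all assumptions chosen for illustration; they are not GBIF's method or real citation data. Edges point from the citing item to the cited item, so rank flows down the chain towards the data.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over (citing, cited) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for citing, cited in edges:
        out_links[citing].append(cited)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes that cite nothing (e.g. datasets) spread their rank uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for citing, targets in out_links.items():
            for cited in targets:
                new_rank[cited] += damping * rank[citing] / len(targets)
        rank = new_rank
    return rank


# Hypothetical graph: both datasets have exactly one direct citation,
# but the paper citing dataset_a is itself cited by a policy paper.
edges = [
    ("policy_paper", "ecoregions_paper"),
    ("ecoregions_paper", "dataset_a"),
    ("other_paper", "dataset_b"),
]
ranks = pagerank(edges)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.4f}")
```

A plain citation count treats the two datasets as equal (one citation each), whereas in this sketch dataset_a ends up with the higher score because the indirect citation from the policy paper is passed along the chain.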

A related concept is "transitive credit", see

Katz, D. S. (2014, February 10). Transitive Credit as a Means to Address Social and Technological Concerns Stemming from Citation and Attribution of Digital Products. JORS. Ubiquity Press, Ltd. http://doi.org/10.5334/jors.be

The idea of transitive credit is as follows: The credit map for product A, which is used by product B, feeds into the credit map for product B. For example, product A is a software package written by three developers, and its credit map is that 50 percent of the credit should go to the lead developer, 20 percent to the second developer, and 10 percent to the third developer. In addition, 5 percent should go to each of the four libraries that are needed to run the code. When this product is created and registered, this credit map is registered along with it. Product B is a paper that obtains new science results, and it depended on product A. The person who registers the publication also registers its credit map, in this case 75 percent to her/himself, and 25 percent to the software code previously mentioned. Credit is now transitive, in that the lead developer of the code can be given credit for 12.5 percent of the paper (50 percent of 25 percent). If another paper is later written that extends the product B paper and gives 10 percent credit to that paper, the lead developer of the software package will also have 1.25 percent credit for the new paper.
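The arithmetic in this example can be checked with a short sketch that resolves credit maps recursively. The identifiers below (`software_a`, `paper_b`, `paper_c`, `author_c`, and so on) are hypothetical labels for the products and people in the example, and the 90 percent share assigned to the author of the later paper is an assumption, since the example only specifies the 10 percent given to product B. The sketch also assumes the credit maps contain no cycles.

```python
def transitive_credit(product, credit_maps):
    """Return each ultimate contributor's share of credit for `product`.

    A contributor that has its own registered credit map is expanded
    recursively; anything else is treated as a final recipient.
    Assumes the credit maps form no cycles.
    """
    totals = {}
    for contributor, weight in credit_maps[product].items():
        if contributor in credit_maps:
            nested = transitive_credit(contributor, credit_maps)
            for recipient, share in nested.items():
                totals[recipient] = totals.get(recipient, 0.0) + weight * share
        else:
            totals[contributor] = totals.get(contributor, 0.0) + weight
    return totals


credit_maps = {
    # Product A: the software package and the four libraries it needs.
    "software_a": {
        "lead_developer": 0.50,
        "second_developer": 0.20,
        "third_developer": 0.10,
        "library_1": 0.05,
        "library_2": 0.05,
        "library_3": 0.05,
        "library_4": 0.05,
    },
    # Product B: the paper that depended on product A.
    "paper_b": {"author_b": 0.75, "software_a": 0.25},
    # A later paper extending product B (the 0.90 share is an assumption).
    "paper_c": {"author_c": 0.90, "paper_b": 0.10},
}

credit_b = transitive_credit("paper_b", credit_maps)
credit_c = transitive_credit("paper_c", credit_maps)
print(f"lead developer share of paper B: {credit_b['lead_developer']:.4f}")
print(f"lead developer share of later paper: {credit_c['lead_developer']:.4f}")
```

In this sketch the libraries are treated as final recipients because no credit map is registered for them; registering one would push their 5 percent shares further along the chain in the same way.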
