Open MortenHofft opened 4 months ago
I'm not sure I follow the logic.
You would like literature index entries to have a field that is the sum of number of occurrences in all downloads cited?
So for the paper mentioned, this would be 2.6 B, but as you also point out, it's clear that they didn't actually use that many records?
Yes. That makes it a far less interesting paper to look at for publishers who want to know how their data is used. The count could perhaps provide an easier way to look at a title and the occurrence count and evaluate "is this paper really using my data in any interesting way?"
The idea is that it would be a very simple thing to add, and it would help me evaluate relevance as a data publisher.
It is related to this user question btw: https://github.com/gbif/content-crawler/issues/58
But how would showing 2.6 B help in making this distinction when the actual number of used records is clearly much lower?
I'm making this up but:
It is just an indicator that the paper probably isn't at all relevant for my collection, i.e. a lower probability of relevance. And even if they did use all 2.6 billion, then my data is less essential. I'm interested in those papers that couldn't have been written without my data. I care most about the small downloads.
Suggestion: Add the sum of occurrences in the various downloads associated with this paper to the index. This could be a useful indicator of relevance.
Reason: E.g. this paper
Gentiana kurroo Royle (Gentianaceae), a highly medicinal, critically endangered and endemic species of the Western Himalayas with restricted distribution in India and Pakistan.
that has downloaded 2.6 billion records. The paper seemingly deals with a fairly narrow subject, but the authors haven't added any filters aside from presence only.
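
For illustration only, here is a minimal sketch of how the suggested sum could be computed for one literature entry. The `Download` record, its `total_records` field and the function name are hypothetical and not taken from the actual index code. The sketch assumes each cited download's record count is already known.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Download:
    """Hypothetical stand-in for a download cited by a paper."""
    key: str            # download identifier (made-up values in the example)
    total_records: int  # number of occurrences in that download


def summed_occurrences(downloads: Iterable[Download]) -> int:
    """Sum the occurrence counts of all downloads cited by one paper.

    A download that is cited more than once is only counted once.
    """
    seen = set()
    total = 0
    for download in downloads:
        if download.key in seen:
            continue  # skip repeated citations of the same download
        seen.add(download.key)
        total += download.total_records
    return total


# Example with made-up numbers: one huge unfiltered download and one small one.
cited = [
    Download("dl-a", 2_600_000_000),
    Download("dl-b", 1_200),
    Download("dl-a", 2_600_000_000),  # same download cited twice
]
print(summed_occurrences(cited))
```

Occurrences that appear in several different downloads are counted once per download, so the result is an upper bound and a rough relevance indicator, not the number of records actually used.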