Open MortenHofft opened 2 months ago
I'm sure this would be insanely expensive, but each literature entry could have "derived dataset"-like metadata, adding counts and perhaps fractions to the metadata, e.g.
from
"gbifDatasetKey": [
"4fa7b334-ce0d-4e88-aaae-2e0c138d049e",
"38b4c89f-584c-41bb-bd8f-cd1def33e92f",
"8a863029-f435-446a-821e-275f4f641165",
etc.
to
{
"gbifDatasetKey": {
"4fa7b334-ce0d-4e88-aaae-2e0c138d049e": {
"count": 67045764,
"fraction": 0.693
},
"3b894fe4-c13c-4a04-b372-4e749ce102e1": {
"count": 5753111,
"fraction": 0.0594
},
"8a863029-f435-446a-821e-275f4f641165": {
"count": 3107077,
"fraction": 0.0321
}
}
}
This would then also have to be done per publisher... 🤯
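As a rough sketch of the shape of that computation (not how GBIF does or would do it): assuming the per-dataset record counts for a paper's downloads are already known, the counts and fractions, plus a publisher roll-up, could be derived like this. The function names, the `publisher_of` lookup, and the sample keys and numbers are all hypothetical.

```python
from collections import Counter


def with_fractions(counts):
    """Turn {key: record_count} into {key: {"count": ..., "fraction": ...}}.

    Entries are ordered by descending count; fractions are rounded to
    4 decimals (an arbitrary choice for this sketch).
    """
    total = sum(counts.values())
    if total == 0:
        return {key: {"count": 0, "fraction": 0.0} for key in counts}
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return {
        key: {"count": count, "fraction": round(count / total, 4)}
        for key, count in ordered
    }


def rollup_by_publisher(dataset_counts, publisher_of):
    """Sum dataset counts per publisher, then attach fractions.

    publisher_of is a hypothetical {dataset_key: publisher_key} lookup.
    """
    totals = Counter()
    for dataset_key, count in dataset_counts.items():
        totals[publisher_of[dataset_key]] += count
    return with_fractions(dict(totals))


# Made-up counts for one literature entry (not real GBIF data).
dataset_counts = {"dataset-a": 600, "dataset-b": 300, "dataset-c": 100}
publisher_of = {
    "dataset-a": "publisher-1",
    "dataset-b": "publisher-1",
    "dataset-c": "publisher-2",
}

entry = {
    "gbifDatasetKey": with_fractions(dataset_counts),
    "publishingOrganizationKey": rollup_by_publisher(dataset_counts, publisher_of),
}
```

The expensive part is obtaining `dataset_counts` for every download behind every paper; the roll-up itself is cheap once those counts exist.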
I'm not sure how this could be done and still perform well, but there has been a request to sort results by relevance for a given publisher or dataset, e.g. by how many records from a given publisher were downloaded for the data used by that paper.
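To illustrate what "sort by relevance" could mean once entries carry the enriched metadata proposed above, here is a minimal in-memory sketch. It assumes each literature entry has a `gbifDatasetKey` object keyed by dataset with `count` and `fraction` values; the function name, the `title` field, and the sample data are hypothetical, and a real implementation would presumably do this in the search index rather than in application code.

```python
def sort_by_relevance(entries, key, field="gbifDatasetKey", measure="count"):
    """Order literature entries by how much of `key` each one used.

    Entries that do not reference `key` at all score 0 and sort last.
    """
    def score(entry):
        return entry.get(field, {}).get(key, {}).get(measure, 0)

    return sorted(entries, key=score, reverse=True)


# Made-up literature entries (not real GBIF data).
entries = [
    {"title": "Paper A", "gbifDatasetKey": {"dataset-x": {"count": 50, "fraction": 0.5}}},
    {"title": "Paper B", "gbifDatasetKey": {"dataset-y": {"count": 900, "fraction": 1.0}}},
    {"title": "Paper C", "gbifDatasetKey": {"dataset-x": {"count": 700, "fraction": 0.07}}},
]

by_count = sort_by_relevance(entries, "dataset-x")
by_fraction = sort_by_relevance(entries, "dataset-x", measure="fraction")
```

Sorting by `count` favors papers that used many records from the dataset in absolute terms, while sorting by `fraction` favors papers for which the dataset made up a large share of the data; which one counts as "relevance" is an open design choice.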