Open myrmoteras opened 3 years ago
how can we find out new sp. with collector affiliation? here is an approach: http://tb.plazi.org/GgServer/dioStats/stats?outputFields=doc.doi+bib.year+bib.source+auth.aff+treat.id+treat.status&groupingFields=bib.year+bib.source+auth.aff+treat.id+treat.status&FP-bib.year=2020&FP-auth.aff=%25Cromwell%25&FP-treat.status=%22sp.%20nov.%22&format=HTML
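For reuse with other years or institutions, the query above can be assembled from its parts. This is only a sketch: `build_stats_url` is a hypothetical helper, the parameter names are copied from the URL above, and it assumes the server treats `+` and `%20` in the query string the same way (both as a space).

```python
from urllib.parse import urlencode

# Endpoint taken from the URL above.
BASE_URL = "http://tb.plazi.org/GgServer/dioStats/stats"

# Field names copied from the URL above.
OUTPUT_FIELDS = ["doc.doi", "bib.year", "bib.source", "auth.aff", "treat.id", "treat.status"]
GROUPING_FIELDS = ["bib.year", "bib.source", "auth.aff", "treat.id", "treat.status"]


def build_stats_url(year, affiliation_pattern, status="sp. nov.", fmt="HTML"):
    """Build a stats query URL (hypothetical helper, not part of the Plazi API)."""
    params = {
        "outputFields": " ".join(OUTPUT_FIELDS),
        "groupingFields": " ".join(GROUPING_FIELDS),
        "FP-bib.year": str(year),
        # '%' is the wildcard used in the original query: %Cromwell%
        "FP-auth.aff": f"%{affiliation_pattern}%",
        # The original query wraps the status in double quotes.
        "FP-treat.status": f'"{status}"',
        "format": fmt,
    }
    # urlencode percent-encodes the values and writes spaces as '+'.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_stats_url(2020, "Cromwell")
```

Only `format=HTML` appears in this thread; other format values are not confirmed here.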
that's a complex one because the authors of the paper might not be the authors of the new species in the paper
so I would make an API call to get all affiliations of authors, and authority names, of all treatments that have the status sp.n.
then I would remove the authors that are not in the authority name
and go from there
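A minimal sketch of the "remove the authors that are not in the authority name" step. The record shape (`author`, `affiliation`, `authority` keys) is an assumption for illustration, not what the API returns, and the matching is by surname only, so multi-word surnames and spelling variants would need extra handling.

```python
import re


def surname(author_name):
    """Return the surname from 'Surname, Given' or 'Given Surname' (assumed formats)."""
    name = author_name.strip()
    if not name:
        return ""
    if "," in name:
        return name.split(",", 1)[0].strip()
    return name.split()[-1]


def word_tokens(text):
    """Lower-cased alphabetic tokens, so 'Li' does not match inside 'Linnaeus'."""
    return {token.casefold() for token in re.findall(r"[^\W\d_]+", text)}


def authors_in_authority(rows):
    """Keep only rows whose author surname appears as a word in the authority name."""
    kept = []
    for row in rows:
        name_tokens = word_tokens(surname(row["author"]))
        if name_tokens and name_tokens <= word_tokens(row["authority"]):
            kept.append(row)
    return kept


# Made-up sample rows, for illustration only.
rows = [
    {"author": "Smith, J.", "affiliation": "Museum A", "authority": "Smith & Lee, 2020"},
    {"author": "Doe, A.", "affiliation": "University B", "authority": "Smith & Lee, 2020"},
    {"author": "Li, Q.", "affiliation": "Institute C", "authority": "Linnaeus, 1758"},
]
describing_authors = authors_in_authority(rows)
```

Here only the first row survives: Doe is a paper author who is not in the authority, and "Li" is not matched against "Linnaeus" because whole words are compared.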
problem is, we don't mark parts of the affiliation, but the affiliation as a whole string
so we don't have, say, an attribute named 'institution' and another one named 'address'
this means that we can't yet filter by institution using the API; we need external logic to accomplish that
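The external logic could be as simple as a lookup of known substrings per institution, applied to the whole affiliation string. This is a sketch under assumptions: `INSTITUTION_ALIASES` is a hypothetical, hand-maintained table, and the single entry just mirrors the `%Cromwell%` pattern used in the query at the top of this issue.

```python
# Hypothetical alias table: institution label -> substrings that identify it.
# A street name is used because "Natural History Museum" alone matches many museums.
INSTITUTION_ALIASES = {
    "NHM": ("cromwell road",),
}


def match_institution(affiliation, aliases=INSTITUTION_ALIASES):
    """Return the first institution whose alias occurs in the affiliation, else None."""
    # Normalise case and whitespace so line breaks or double spaces do not matter.
    text = " ".join(affiliation.casefold().split())
    for institution, needles in aliases.items():
        if any(needle in text for needle in needles):
            return institution
    return None


# Made-up affiliation strings, for illustration only.
hit = match_institution("Natural History Museum,  Cromwell Road, London, UK")
miss = match_institution("Some University, Bern, Switzerland")
```

Substring matching is crude, so the alias table would have to be reviewed by hand for each museum before the counts are trusted.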
(98/100) papers scheduled again
so, I see three steps in this service
sorry, four
for the extraction part, we don't cover 100% of the literature, so we might miss some n. spp. from a particular museum. we have to keep that in mind.
for the gathering part, this is done - the api is available and we can retrieve what I described.
the data manipulation is neither complicated nor costly at all. there is no learning curve on my side (it's something that I've, to some extent, done before) and it could be accomplished with any programming language.
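As a rough idea of what that manipulation looks like, here is a sketch that counts new species per institution. It assumes the earlier steps already produced `(institution, treatment_id)` pairs; that input shape and the function name are hypothetical.

```python
from collections import defaultdict


def count_new_species(pairs):
    """Count distinct treatments per institution, largest count first."""
    treatments = defaultdict(set)
    for institution, treatment_id in pairs:
        # Skip rows where no institution could be identified.
        if institution is not None:
            # A set avoids counting one treatment twice when several of its
            # authors share the same institution.
            treatments[institution].add(treatment_id)
    counts = {inst: len(ids) for inst, ids in treatments.items()}
    # Sort by count (descending), then by name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample pairs, for illustration only.
pairs = [
    ("NHM", "t1"),
    ("NHM", "t1"),  # second author of the same treatment
    ("NHM", "t2"),
    ("Museum B", "t3"),
    (None, "t4"),  # affiliation not recognised
]
ranking = count_new_species(pairs)
```

The resulting list of `(institution, count)` pairs is the kind of table a dashboard tool could read directly.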
the data visualization is the trickiest part.
We live in the world of dashboards now, and we should start producing them for our 'clients' (publishers, museums)
but this is not something that any of us work with directly.
I've started with Google Data Studio
but there are other and better players in the market, like Qlik Sense and Tableau
these are tools that can pull in live data and then display interactive charts, like the one I did for EJT
the good part is, once we have done it for one, we can adapt it quickly for any other
now, side note
and what if we use 2021 to rebuild the Plazi website and start bringing these stats to life? like, which museum has the most n.spp. in 2021, which museum published the most in closed access journals, and so on
that would not hamper the service (which breaks down the stats)
but would bring to life this competition
based on data
just like we go to that Clarivate website to get a sense of the most important journals for taxonomy
people would have to go to Plazi to understand which institutions are the most relevant and what their publishing behavior is
This is a very interesting use case for BLR: https://www.theguardian.com/environment/2020/dec/30/moths-to-monkeys-503-new-species-identified-by-uk-scientists
The report is very simple: The NHM published descriptions of 503 new species.
the use case too: Plazi provides the links to and the treatments of all the 503 new species.
eg. https://biolitrepo.org/?facets=true&journalYear=2020&page=0&q=NHM&resource=treatments&stats=true&type=all
The issue of course is a bit more complex, because