Open arokem opened 7 years ago
In the future, when we are doing analyses of the kind that went into the first-year summary, it might be useful for us to have some metadata that is currently hard to find: not only the language of implementation of the software, but also things like the research domain to which it contributes, etc.
Yeah, that would be really interesting. Detecting programming language should be trivial with something like https://github.com/github/linguist
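As a rough sketch of how that could look on our side: GitHub's REST API exposes Linguist's per-language byte counts at `GET /repos/{owner}/{repo}/languages`, so picking a primary language is mostly a matter of taking the largest entry. The helper name `primary_language` and the sample counts below are hypothetical, for illustration only.

```python
from typing import Optional


def primary_language(byte_counts: dict[str, int]) -> Optional[str]:
    """Return the language with the most bytes, or None for an empty repo.

    `byte_counts` is assumed to have the shape returned by GitHub's
    /repos/{owner}/{repo}/languages endpoint, e.g. {"Python": 9000, "C": 1200}.
    """
    if not byte_counts:
        return None
    # max() over the keys, ranked by their byte count
    return max(byte_counts, key=byte_counts.get)


# Hypothetical sample data, not taken from a real submission
print(primary_language({"Python": 9000, "C": 1200}))
```

This only gives a single label per repository; mixed-language packages would probably need the full breakdown stored instead.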
Is there an ontology of research software we could use and ask people to select from?
I'm not aware of a research software ontology, but there are subject taxonomies available, e.g. this one: https://github.com/PLOS/plos-thesaurus
It would be useful to even add a single "Keywords" field in the submission form, with a text prompt saying something like "Please enter keywords for the research domain to which this software contributes, using the PLOS subject taxonomy [link]."
related issue #677
Relating to this, here are two things:
There seems to be some sort of taxonomy for reviews by research domain (group): the review tracks (see screenshot, which is missing the "Misc" track). Couldn't these be reused as a preliminary classification taxonomy by research domain? Mapping to other taxonomies could be done by downstream applications for now.
Looking at Crossref metadata for JOSS publications, I found that the "subject" recorded there is set to "General Earth and Planetary Sciences" + "General Environmental Science", even though the publications are clearly in another domain, e.g., educational research (https://joss.theoj.org/papers/10.21105/joss.05742, first screenshot) or linguistics (https://joss.theoj.org/papers/10.21105/joss.04825, second screenshot). I'm not sure whether these values are provided by JOSS or added by Crossref by default, e.g., for the journal as a whole. From a publication mining perspective it'd be great if any subject metadata for the paper could be "put there" instead :).
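For anyone who wants to check this across more papers, here is a minimal sketch. It assumes the Crossref REST API record at `https://api.crossref.org/works/{doi}` carries a `subject` list under `message`, which is where I saw the values above; the field may be absent for some records. The function names are my own, and the sample record in the usage lines is hand-built from the values quoted above, not a real API response.

```python
import json
import urllib.request
from urllib.parse import quote

CROSSREF_WORKS = "https://api.crossref.org/works/"


def fetch_work(doi: str) -> dict:
    """Fetch the Crossref record for a DOI (needs network access)."""
    with urllib.request.urlopen(CROSSREF_WORKS + quote(doi)) as resp:
        return json.load(resp)


def subjects(record: dict) -> list[str]:
    """Return the subject labels in a Crossref-style record, or [] if none.

    Assumption: subjects live at record["message"]["subject"].
    """
    return list(record.get("message", {}).get("subject", []))


# Hand-built sample mirroring the values reported above (illustrative only)
sample = {
    "message": {
        "DOI": "10.21105/joss.05742",
        "subject": [
            "General Earth and Planetary Sciences",
            "General Environmental Science",
        ],
    }
}
print(subjects(sample))
```

Calling `subjects(fetch_work("10.21105/joss.05742"))` against the live API would show whatever Crossref currently records for that paper.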