research-software-directory / RSD-as-a-service

This repo contains the new RSD-as-a-service implementation
https://research.software

Software mentions do not always refer to the software directly #1204

Open cmeessen opened 1 month ago

cmeessen commented 1 month ago

There are cases where software mentions do not relate directly to the software itself. This is often the case when a mention was scraped because it cites a reference paper. Here is an example from a software entry in the Helmholtz RSD (link):

This paper is listed in the mentions sections under Journal articles:

Monitoring underground hydrogen storage migration and distribution using time-lapse acoustic waveform inversion

It was scraped from OpenAlex because it cites this paper:

Hydrogen generation by electrolysis and storage in salt caverns: Potentials, economics and systems aspects with regard to the German energy transition

which is listed as a reference paper of the software entry. This paper mentions that the software was further developed to contribute to the paper's research goal, so it can be seen as a "Reference paper".

Since the scraped journal article refers to the research output of the reference paper, it cannot be treated the same as a mention that uses the software to create a research output. This also makes it difficult to compare mentions between individual software entries.

To mitigate this while still respecting the impact a software has on other research, the mentions could be categorised similarly to projects:

1) output where the software was described or further developed by the contributors (Reference papers)
2) citations of the software itself, e.g. by citation of the concept or version DOI
3) extended impact the software has on other research (citations of (1) or (2))

This is a first guess, maybe there are better distinctions which can be discussed in this issue.
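To make the three proposed categories concrete, here is a minimal Python sketch of how a scraped mention could be sorted into them. It is only an illustration under assumptions: the names `MentionCategory`, `Mention` and `categorise` are hypothetical and not part of the RSD code base, the DOIs in the usage example are placeholders, and it assumes the list of DOIs cited by each mention is available (e.g. from the scraper).

```python
from dataclasses import dataclass
from enum import Enum


class MentionCategory(Enum):
    REFERENCE_PAPER = 1   # (1) describes or further develops the software
    DIRECT_CITATION = 2   # (2) cites the software itself (concept or version DOI)
    EXTENDED_IMPACT = 3   # (3) cites an item from (1) or (2)


@dataclass(frozen=True)
class Mention:
    doi: str
    cited_dois: frozenset = frozenset()  # DOIs cited by this work (assumed to be known)


def categorise(mention, software_dois, reference_paper_dois, direct_citation_dois=frozenset()):
    """Return the proposed category for a mention, or None if it is unrelated.

    All three *_dois arguments are sets of DOI strings for one software entry.
    """
    if mention.doi in reference_paper_dois:
        return MentionCategory.REFERENCE_PAPER
    if mention.cited_dois & software_dois:
        return MentionCategory.DIRECT_CITATION
    if mention.cited_dois & (reference_paper_dois | direct_citation_dois):
        return MentionCategory.EXTENDED_IMPACT
    return None


# Placeholder DOIs, for illustration only.
software = {"10.0000/software.concept"}
reference_papers = {"10.0000/reference-paper"}

article = Mention("10.0000/journal-article", frozenset({"10.0000/reference-paper"}))
print(categorise(article, software, reference_papers))
```

With this ordering, a work that cites both the software DOI and a reference paper counts as a direct citation. The journal article from the example above would end up in category (3), because it only cites the reference paper.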

jmaassen commented 1 month ago

The original idea behind reference papers was to use them for papers that only describe the software itself. A nice example is the Kernel Tuner paper: https://doi.org/10.1016/j.future.2018.08.004 This paper can be used directly as a proxy for the software.

If the paper instead describes a research result that happens to have been obtained using some software, the whole thing indeed becomes a bit fuzzy. One could argue it's not really a reference paper for the software, as it will probably be cited for the research results instead.

Nevertheless, we should improve the mention categories, as they are currently unclear. The proposed approach would not work for the cases where a true 'reference paper' is used though...