force11 / force11-scwg

Force11 Software Citation Working Group
https://www.force11.org/software-citation-working-group

What should we say about "Software Papers" #68

Closed jameshowison closed 8 years ago

jameshowison commented 8 years ago

I think a key unresolved question is how to address the practice of "software papers".

If a piece of software has a "software paper", should that be:

  1. cited on its own (superseding the software citation itself),
  2. cited in addition to the software citation itself,
  3. not cited; only cite the software itself (discourage software papers).

I'm not really sure, but I think my vote is for 2, although I acknowledge that that then creates two citations, exacerbating the "too many references" issue.

kyleniemeyer commented 8 years ago

This is a challenge, because many people may prefer that the paper be cited, since that still counts more in terms of academic credit (although we are obviously working to change that).

I've seen one solution, which goes along with your option 2, offered by a few groups, that goes something like:

Note that these two are not mutually exclusive, so you could have situations where it's appropriate to do one or both.

kyleniemeyer commented 8 years ago

I should add that citing the paper may also be appropriate when you're referring to methodology described in the paper, even if it's employed in the software, so this practice would really leave citing the software for when it is used directly.

jameshowison commented 8 years ago

Good points. Perhaps we can suggest a hybrid: cite the paper but also include the permanent identifier of the specific piece of software. Ugh, I can see lots of problems with that, since it would imply a citation with two DOIs, which I imagine many styles and editors would disallow.

I think we could have a principle: Citing a "software paper" is not a substitute for citing the software directly.

With a preamble or comment along the lines of "It is often appropriate to cite a software paper" and mention the cases you highlight, Kyle.

Was this topic dealt with in the data principles?

danielskatz commented 8 years ago

While I think this is an interesting discussion and should go somewhere in the document, maybe in a section called "other issues" that we haven't yet written, I don't think it's essential in thinking about the principles themselves, which are for when someone does want to cite software. We already know how to cite papers.

migueldvb commented 8 years ago

I agree that this does not necessarily have to be in the citation principles section, but it is an important issue that could be discussed elsewhere in the document. There are many software packages (at least in astronomy) whose authors ask you to cite the peer-reviewed paper that describes the code. That paper should be acknowledged, in addition to citing the software itself if the work makes use of it directly.

danielskatz commented 8 years ago

From Tim Clark, chair of Force11 Data Citation Pilot Project:

My view is that a “software paper” is a literature artifact, and the software itself is a “methods and materials” type artifact.

It is appropriate to cite literature when the text is important in the context of an assertion or description, for example, in describing what software DOES.

It is appropriate to cite the software itself in situations completely analogous to citing a biological reagent in a methods and materials section.

jameshowison commented 8 years ago

Hmmmm, I don't know much about reagents ... are they created for academic credit? Do the people working on them seek academic careers? I suspect not (motivations are primarily commercial? If originally academic, mostly commercialized?)

Setting aside that it doesn't provide guidance for publications that don't have methods and materials sections (btw, I included any methods and materials sections in my content analysis, fwiw), I think this perspective is the source of the "Like Instrument" citations that I found.

I think we need to grapple with this perspective in depth: how would it affect the use-cases outlined?

Having said that, the idea that "describing what software DOES" is a role for a software paper does make some sense to me.

--James

timclark commented 8 years ago

James

Biomedical reagents, by which I mean things like genetically engineered mouse models (e.g. https://www.jax.org/strain/008596), cell lines, and antibodies, are typically developed by researchers so they can execute an experiment. Funder requirements often mean the developers make them available to colleagues either directly or through commercial houses or non-profits (ATCC, JAX, etc.), and most commercial houses that offer them offer lines donated by the investigators who derived them.

So, as an example, the engineered mouse model cited above was created by David Holtzman, a leading Alzheimer’s researcher at Wash U St Louis: http://curealz.org/people/david-holtzman

His motivation in creating it was likely to do the research reported in this paper:

Wahrle SE; Jiang H; Parsadanian M; Kim J; Li A; Knoten A; Jain S; Hirsch-Reinshagen V; Wellington CL; Bales KR; Paul SM; Holtzman DM. 2008. Overexpression of ABCA1 reduces amyloid deposition in the PDAPP mouse model of Alzheimer disease. J Clin Invest 118(2):671-82. PubMed: 18202749 http://www.ncbi.nlm.nih.gov/pubmed/18202749 MGI: J:131400 http://www.informatics.jax.org/reference/J:131400

From the abstract: "To address these hypotheses, we created PrP-mAbca1 Tg mice that overexpress mouse Abca1 throughout the brain under the control of the mouse prion promoter."

Note that I’ve just kind of randomly selected this mouse strain from the Jackson Labs database. It is typical of one class of biological reagent.

Both of these categories, biological reagents and experimental equipment, are part of “methods and materials” which have to be properly described to ensure replicability of an experiment and its output data.

Tim

kyleniemeyer commented 8 years ago

Well, I think the question James really raised (or at least the way I see it) is: are such things equivalent to software? I think many of us see software as more than data.

That said, I don't disagree with the idea that in a paper, you would cite the software you used in the methods section, while you would cite a software paper in a discussion either of how the software works (algorithms, etc.) or certain results presented in that paper.

kyleniemeyer commented 8 years ago

OK—reading Tim's description again, it's clear he was talking about artifacts that are more than data themselves.

danielskatz commented 8 years ago

I've now added a paragraph in section 5 to capture this discussion - comments are welcome, ideally as pull requests to the text, but also ok here. I'll leave this issue open for a bit.

ScottBGI commented 8 years ago

One comment: this subsection makes it sound like software papers are a new thing. It's nice that the examples used (F1000Research and AAS) show it becoming more mainstream in recent years, but you might want to include more established examples. Just in the life sciences, Bioinformatics (formerly Computer Applications in the Biosciences) has been publishing software articles since 1985 (see: http://bioinformatics.oxfordjournals.org/content/1/1.toc), Nucleic Acids Research has been publishing databases and webservers since at least the mid-90s, and BMC has been publishing software articles since they launched in 2000.

danielskatz commented 8 years ago

I've added a little to the text in 4 to address this.

kyleniemeyer commented 8 years ago

I added Computer Physics Communications to that new text in 1fbc8c3c06eedc92a0d2d694471ba3f988242eda, but otherwise think the discussion is good.

SRoffel commented 8 years ago

I like this discussion, but what's unclear to me is what the definition of a "software paper" is ...

Is it a regular paper about a software project (lots of text on performance, validation, etc.)? If so, there are, and have been, many of those, and they usually contain some (outdated) link to the actual software. (Btw, some journals, like Computer Physics Communications, try to solve the "dead link to software" issue by archiving the software described.)

Or is it a short scientific communication that uniquely provides the software itself with a citation interface to the rest of the academic literature? If so, this is an emerging thing, especially when it relates to post-initial-publication versioning.

Btw 1: I would agree with Tim that software is similar to methods and materials: software is a method executed by a machine. For reproducibility of science, that type of software should be an integral part of the academic record (not really interested in the word processor software used).

Btw 2: I also agree with Kyle that software is more than data: it's computation done with data by a machine.

Sweitze

danielskatz commented 8 years ago

I think "software papers" can include papers that match both definitions.