widdowquinn / pyani

Application and Python module for average nucleotide identity analyses of microbes.
http://widdowquinn.github.io/pyani/
MIT License

Add a `CITATION.cff` file to make `pyani` easier to cite #317

Closed (baileythegreen closed 3 years ago)

baileythegreen commented 3 years ago

Summary:

New thing I just saw here: https://twitter.com/natfriedman/status/1420122675813441540.

Have not looked into the details yet.

widdowquinn commented 3 years ago

I think it's already at least as easy to cite as any paper or other resource ;) My opinion is that the issue is largely cultural. A .cff file wouldn't hurt, though.

Egon Willighagen notes an automated service for generating .cff:

https://twitter.com/egonwillighagen/status/1420278201130049537

He also raises another point about software citations that isn't fixed by this technological solution:

https://twitter.com/egonwillighagen/status/1420282690704715777?s=20

GitHub .cff visibility assumes people see the repository, which probably isn't true for most users - they'll get the software from PyPI or conda. I'd expect we'll always have the most impact from a visible citation request in the output, and in the documentation.
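A visible citation request in tool output could look something like the minimal sketch below. The helper names (`citation_notice`, `print_citation_notice`) are hypothetical and are not part of pyani's API; the citation text is assembled from the paper details quoted later in this thread.

```python
import sys

# Paper details as quoted in this thread (no year is given there, so none is added).
CITATION = (
    "Pritchard, L., Glover, R. H., Humphris, S., Elphinstone, J. G., & Toth, I. K. "
    "Genomics and taxonomy in diagnostics for food security: "
    "soft-rotting enterobacterial plant pathogens. "
    "https://doi.org/10.1039/C5AY02550H"
)


def citation_notice(tool_name: str = "pyani") -> str:
    """Return a short citation request (hypothetical helper, not pyani's API)."""
    return f"If you use {tool_name} in your work, please cite:\n    {CITATION}"


def print_citation_notice(stream=sys.stderr) -> None:
    """Write the notice to stderr so it does not pollute piped results on stdout."""
    print(citation_notice(), file=stream)


if __name__ == "__main__":
    print_citation_notice()
```

Writing to stderr is a design choice here: users who install from PyPI or conda still see the request on every run, while anything they redirect from stdout stays clean.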

baileythegreen commented 3 years ago

Maybe so, but when I'm trying to figure out how to cite software, my first stop is almost always the GitHub page, not running the software.

Now, someone else looked into this a bit more last night, and told me that apparently the .cff can't point people to a paper, it's just a way to cite the repo, so it may not be particularly useful in its current form.

widdowquinn commented 3 years ago

If there's no other obvious information, I also will look at the GitHub page. I think that's natural for people familiar with GitHub. However, many of our users may not even know that GitHub exists - the software might be packaged up for them by a friendly local IT/bioinformatics person. We can't always rely on our own personal experience as a comprehensive guide. Everyone who uses the tool uses the tool. Not everyone who uses the tool uses GitHub.

I've just had a play with CITATION.cff on the pr_236 branch. I find the implementation underdocumented, and the format specification too (e.g. https://citation-file-format.github.io/ and https://github.com/citation-file-format/ruby-cff). It appears that some fields are silently compulsory (e.g. version) and others do not affect the way the information is presented in the sidebar or clipboard citation format. Also, the sidebar button to View citation file doesn't work for me.
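One way to catch silently compulsory fields before pushing is a quick pre-commit check, sketched below. Two assumptions are baked in: the `SCHEMA_REQUIRED` keys reflect my understanding of the CFF 1.2.0 schema, and `OBSERVED_REQUIRED` simply records the observation above that `version` seemed to be needed in practice. The `example` metadata is illustrative and is not the contents of the file on the pr_236 branch.

```python
# Keys required by the CFF 1.2.0 schema, to the best of my understanding (assumption).
SCHEMA_REQUIRED = ("cff-version", "message", "title", "authors")
# Keys reported in this thread as apparently needed by GitHub's rendering (assumption).
OBSERVED_REQUIRED = ("version",)


def missing_keys(metadata, required=SCHEMA_REQUIRED + OBSERVED_REQUIRED):
    """Return the required keys that are absent or empty in `metadata`."""
    return [key for key in required if not metadata.get(key)]


# Illustrative metadata, written as the dict a YAML parser would produce.
example = {
    "cff-version": "1.2.0",
    "message": "If you use this software, please cite it as below.",
    "title": (
        "Genomics and taxonomy in diagnostics for food security: "
        "soft-rotting enterobacterial plant pathogens"
    ),
    "authors": [{"family-names": "Pritchard", "given-names": "Leighton"}],
    "doi": "10.1039/C5AY02550H",
}

print(missing_keys(example))  # ['version']
```

A check like this only covers presence of keys; it does not validate values against the schema.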

You can see my attempt at ffd9ddb - it can point people to a paper (it includes the DOI), but it doesn't provide a ready-to-paste citation in the clipboard, the way people might want.

My opinion is that GitHub's current implementation of this system is less useful than a CITATIONS/HOW_TO_CITE file and putting the citation prominently in tool output.

I do think it's a worthy step towards getting people used to citing software directly as a research output, rather than citing the paper that reports the software (a roundabout, brittle system). I still think that cultural change is required in biology at least for this approach to take hold.

Right now, I'm not sure how to use this. It seems to make most sense to point the CITATION.cff to the Zenodo DOI - but asking for citations of the paper is still the cultural norm, and that's what we ask in the CITATIONS file and documentation.

baileythegreen commented 3 years ago

> Also, the sidebar button to View citation file doesn't work, for me.

This takes me to the CITATIONS file in master. I don't know if that counts as 'not working', but somehow that doesn't feel like the intended behaviour….

widdowquinn commented 3 years ago

Maybe it takes time to catch up on the commit - I get APA/BibTeX copiable info, now:

Pritchard, L., Glover, R. H., Humphris, S., Elphinstone, J. G., & Toth, I. K. Genomics and taxonomy in diagnostics for food security: soft-rotting enterobacterial plant pathogens (Version 0.3.0) [Computer software]. https://doi.org/10.1039/C5AY02550H

```bibtex
@misc{Pritchard_Genomics_and_taxonomy,
author = {Pritchard, Leighton and Glover, Rachel H. and Humphris, Sonia and Elphinstone, John G. and Toth, Ian K.},
doi = {10.1039/C5AY02550H},
title = {{Genomics and taxonomy in diagnostics for food security: soft-rotting enterobacterial plant pathogens}}
}
```

and as the "View Citations File" takes me to an actual file describing how we'd like the tool cited, I'm happy with it.
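For anyone who wants the same BibTeX entry without going through the sidebar, the sketch below rebuilds it from the metadata shown above. It only mimics the output format seen in this thread (including the key made from the first author's family name plus the first three title words); it is not GitHub's or ruby-cff's actual implementation, and the function names are hypothetical.

```python
def bibtex_key(family_name, title, n_words=3):
    """Build a key such as 'Pritchard_Genomics_and_taxonomy'."""
    return "_".join([family_name] + title.split()[:n_words])


def to_bibtex(authors, title, doi):
    """Render a @misc entry; `authors` is a list of (family, given) tuples."""
    author_field = " and ".join(f"{family}, {given}" for family, given in authors)
    key = bibtex_key(authors[0][0], title)
    return "\n".join(
        [
            f"@misc{{{key},",
            f"author = {{{author_field}}},",
            f"doi = {{{doi}}},",
            f"title = {{{{{title}}}}}",  # double braces preserve capitalisation
            "}",
        ]
    )


AUTHORS = [
    ("Pritchard", "Leighton"),
    ("Glover", "Rachel H."),
    ("Humphris", "Sonia"),
    ("Elphinstone", "John G."),
    ("Toth", "Ian K."),
]
TITLE = (
    "Genomics and taxonomy in diagnostics for food security: "
    "soft-rotting enterobacterial plant pathogens"
)
DOI = "10.1039/C5AY02550H"

entry = to_bibtex(AUTHORS, TITLE, DOI)
print(entry)
```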

baileythegreen commented 3 years ago

This is also what I see, but I wasn't sure if it was different to what you were seeing when you first tried it. I would say that resolves the issue, and it can be closed, unless you want to wait until pr_236 is merged? (I don't know when that might be.)

widdowquinn commented 3 years ago

I think we can close it now - #236 works but is going to need a health warning for large jobs that may overrun SLURM job limits on the cluster being used.