Open tobiasdiez opened 3 years ago
For conversion between citation formats it would be worth pulling @mfenner into this thread too 👋
Hi, thanks for bringing us in.
Perhaps two initial comments: there is https://www.ctan.org/tex-archive/macros/latex/contrib/biblatex-contrib/biblatex-software ("the reference biblatex implementation of a bibliography style extension that includes software-specific BibTeX entries"), but I'm not familiar enough with biblatex to say what the connection of these would be. Perhaps @rdicosmo (author of biblatex-software) can help out here?
I gather that biblatex is the base implementation upon which other packages build? If so then I agree, it'd be great to update that to include software-specific fields for better support for software/@software :+1:.
> Background: We at JabRef are currently faced with the issue of importing CFF into bib(la)tex, and are unsure how to treat the metadata information with no equivalent fields. See JabRef/jabref#7946 for work in progress.
For CFF, we have decided to support very little more than just basic and advanced citation use cases, and I think for JabRef it'd be fine to drop any extra information that is not used for citation purposes as well? But happy to talk further in https://github.com/JabRef/jabref/pull/7946.
Thanks @sdruskat for the ping, and for pointing out the biblatex-software package, which addresses the needs for citing software in biblatex.
biblatex-software adds the relevant fields for software that are missing in the stock biblatex, and provides 4 different entries, @software, @softwareversion, @softwaremodule and @codefragment, to enable citation of a software project (e.g. Scikit-Learn), a version (e.g. OCaml 4.09), a module in a modular software project (e.g. Voronoi diagrams 1.0 in CGAL 3.02), or a code fragment (e.g. the core mapreduce algorithm in Parmap 1.0).
A special property of biblatex-software is that it is a "style extension", designed to add support for the software-related entries and fields to any existing biblatex style. This allows it to be used exactly as if these entries and fields were part of the stock biblatex, without requiring changes to biblatex itself (see the documentation for the details).
biblatex-software has been on CTAN for over a year, is part of TeXLive, and has already undergone several iterations following feedback from the user community.
My kind suggestion is to use biblatex-software for handling software-related entries, and once it is fully stable, we can propose to incorporate it upstream in biblatex.
Feel free to contact me for any question you may have about it.
biblatex-software is a very nice example of how to extend biblatex for specialist areas. It would be nice perhaps to make such extensions "official" by mentioning them in the manual, which might help adoption for other formats requiring specialist interfaces. What do you think @moewew?
Just to have the link to my comment to a similar question in https://github.com/plk/biblatex/issues/1106#issuecomment-1220282384
Is there any consensus that biblatex-software covers the CFF requirements? If so, I would suggest closing this. I don't think we are going to merge styles into the core of biblatex as this opens a whole can of worms. We can simply say that for CFF field support, load biblatex-software?
> Is there any consensus that biblatex-software covers the CFF requirements? If so, I would suggest closing this. I don't think we are going to merge styles into the core of biblatex as this opens a whole can of worms. We can simply say that for CFF field support, load biblatex-software?
I would slightly prefer to see biblatex-software merged, as it would nicely complete the official @software entry, which today is only an alias for @misc.
But I also see the concern about demands creeping in, and thanks to the great architectural structure of biblatex, biblatex-software is a "style extension" that can be added to almost any existing style.
I think we can definitely live with this if there is a clear statement in the biblatex manual pointing to biblatex-software for full-fledged support of software citation: that would avoid seeing issues like this popping up over and over again, with potentially diverging implementations.
Do you think this is possible?
+1 for merging it. Software/apps are used pretty universally across fields and citing them in a proper way is important.
I think biblatex-software is pretty heavy. It comes with three new entry types and a number of new fields. I have the feeling that for the average user the current @software with its slightly more pedestrian approach suffices. I don't doubt that communities where software is cited (and discussed) heavily might have additional needs, but I'm not too sure if we really need to cater for everyone to that degree in the standard styles.
We have to balance the interests of completeness of the data model against simplicity of the standard styles, because the standard styles are supposed to be a basis for third-party styles. If we overload it with specific stuff that can make it harder for style authors to find their way round the code.
I feel that too many people get hung up on the alias thing (as in "@software is only an alias for @misc"). See also the lengthy discussions in https://github.com/plk/biblatex/issues/753 and linked posts. We have explicitly updated @software in the documentation and made the aliasing more explicit in the code. In any case, @software is valid for all type-specific options.
I agree with @moewew here - just because there is a comprehensive type- or audience-specific package, I don't think it belongs in biblatex core. The modular approach is much cleaner.
As a compromise, would it be an option to integrate the main @software from biblatex-software, and leave the more specialized types @softwareversion, @softwaremodule and @codefragment in a separate package? Ideally, there should be only one "software" type, and this should be mostly compatible with CFF. In particular, fields like version and repository are rather important if you cite software (e.g. published here on GitHub).
version is already supported. We don't have a repository field, but there is the generic url, which will probably be enough for simple use cases.
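To make the simple use case concrete, here is a minimal sketch of such a conversion, assuming the CFF metadata has already been parsed into a Python dict (for example by a YAML parser). The function names and the mapping table are hypothetical and are not taken from JabRef, ruby-cff or any other tool. The sketch targets only stock biblatex fields: `repository-code` is stored in the generic `url` field, and every CFF key without a counterpart is reported as dropped rather than silently lost.

```python
# Hypothetical sketch: map parsed CFF metadata onto a stock biblatex @software entry.
# Order matters: the first CFF key that fills a biblatex field wins, so
# "repository-code" takes precedence over the generic CFF "url" key.
FIELD_MAP = [
    ("title", "title"),
    ("version", "version"),
    ("doi", "doi"),
    ("date-released", "date"),
    ("repository-code", "url"),  # stock biblatex has no dedicated repository field
    ("url", "url"),
]


def format_authors(authors):
    """Join CFF authors into a biblatex name list (name particles/suffixes are ignored)."""
    names = []
    for author in authors:
        if "name" in author:
            # CFF entity (e.g. an organisation): brace it so it is kept as one name.
            names.append("{" + author["name"] + "}")
        else:
            parts = (author.get("family-names", ""), author.get("given-names", ""))
            names.append(", ".join(p for p in parts if p))
    return " and ".join(names)


def cff_to_software_entry(cff, key):
    """Return (bib_text, dropped_keys) for a dict holding parsed CFF metadata."""
    fields = {}
    if "authors" in cff:
        fields["author"] = format_authors(cff["authors"])
    for cff_key, bib_field in FIELD_MAP:
        if cff_key in cff:
            fields.setdefault(bib_field, str(cff[cff_key]))

    handled = {"authors"} | {cff_key for cff_key, _ in FIELD_MAP}
    dropped = sorted(k for k in cff if k not in handled)

    lines = [f"@software{{{key},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines), dropped
```

With this mapping, keys such as `license` and `commit` end up in the dropped list, which is exactly the information a tool would have to either discard or store in non-standard fields.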
I fully understand the desire not to grow the surface of biblatex, and I am a great fan of its modular structure. It would be unfortunate, though, if the decision not to integrate biblatex-software upstream were taken based on the argument that citing software is just a specific need of a small academic community. We finally have objective data showing that the use, creation and sharing of software is widespread in all research fields, thanks to a monitoring effort put in place by the French Ministry of Research that you can access here https://frenchopensciencemonitor.esr.gouv.fr/software/general?id=general.utilisation (the disciplinary breakdown is here https://frenchopensciencemonitor.esr.gouv.fr/software/fields?id=disciplines.utilisation).
I guess my argument is not that software citations are uncommon enough that we don't have to worry about them. We do have a @software entry type, after all. My argument is that for many (most) use cases the status quo probably suffices. It's the fields and entry types of biblatex-software that go beyond the biblatex standard data model that seem to me to be of more niche interest.
I couldn't find a lot of bibliography/citation styles that have proper guidance for software citation, but taking APA style as an example, I believe we can already do what it wants (weirdly I couldn't find an example on the APA webpage, so I'm linking to a third-party interpretation, which I hope is accurate: https://libraryguides.vu.edu.au/apa-referencing/7DatasetsSoftwareTests).
> I couldn't find a lot of bibliography/citation styles that have proper guidance for software citation, but taking APA style as an example, I believe we can already do what it wants (weirdly I couldn't find an example on the APA webpage, so I'm linking to a third-party interpretation, which I hope is accurate: https://libraryguides.vu.edu.au/apa-referencing/7DatasetsSoftwareTests).
There is also the Vancouver style https://www.nlm.nih.gov/bsd/uniform_requirements.html (§44).
> I guess my argument is not that software citations are uncommon enough that we don't have to worry about them. We do have a @software entry type, after all. My argument is that for many (most) use cases the status quo probably suffices. It's the fields and entry types of biblatex-software that go beyond the biblatex standard data model that seem to me to be of more niche interest.
Thanks for clarifying this :-)
> I couldn't find a lot of bibliography/citation styles that have proper guidance for software citation, but taking APA style as an example, I believe we can already do what it wants (weirdly I couldn't find an example on the APA webpage, so I'm linking to a third-party interpretation, which I hope is accurate: https://libraryguides.vu.edu.au/apa-referencing/7DatasetsSoftwareTests).
Well, the point is that for a very long time software has not been considered a research output on par with publications, so we traditionally cited the documentation or the article describing the software, and not the software itself. In some cases one could see software assimilated to a book on a shelf, as it came in a box (see the software entries for Zotero for example). The landscape has changed significantly with the growth of Open Source and the very recent rise in awareness of the importance of valuing software output for the careers of researchers and engineers in academia.
Only very recently did the need to cite software directly, and not via proxies like articles or books, emerge, so one does not yet find satisfactory guidelines for citing software in mainstream styles. There has been work to improve on the status quo, but it is either very generic, or plagued by a tendency to force software into some kind of "bed of Procrustes" to cater to the needs of publishers (that only want to see DOIs), or to mindsets that conflate software with data (which it is not).
This is why some four years ago we set up a software citation working group at Inria, bringing together a broad panel including top researchers that have developed and maintained a variety of significant research software for decades, to come up with a concrete proposal covering the needs of a large spectrum of software developments. The outcome has many facets: on one side, an article about software attribution and reference in CiSE 2020 (green open access here) that presents among other things a taxonomy of contributor roles which is important for software metadata; on the other side, the (much smaller) data model for software citation, that was eventually implemented in biblatex-software as a style extension thanks to biblatex's wonderful modular architecture.
We did try to keep the new fields to a minimum, but there are a few needed ones, and the various entries in biblatex-software are there to make sure that the various forms of software projects can be accommodated. For example, the @softwaremodule entry (that was a surprise to me) is needed to properly cite modules/plugins like the ones that are found in the Computational Geometry Algorithms Library (CGAL): they have been using the @book and @incollection entries for citing the library and its components (see their bibtex file here), and we want to make sure they can do the same using @software and @softwaremodule, using the proper crossref mechanism for field inheritance, etc.
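As a rough illustration of what field inheritance via crossref means for a module entry, here is a deliberately simplified sketch. It assumes entries are plain Python dicts and that a child inherits every same-named field it does not define itself. The real rules applied by biblatex and Biber are configurable (see `\DeclareDataInheritance` in the biblatex manual) and may differ for the biblatex-software entry types, so this is a model of the idea only. The function name and the sample data in the usage note are hypothetical.

```python
# Simplified model of crossref inheritance: same-name fields, no overriding.
# This is NOT the actual biblatex/Biber algorithm, only an illustration of the idea.
NOT_INHERITED = ("entrytype", "crossref")


def resolve_crossref(entries, key):
    """Return a copy of entries[key] with fields missing in the child filled from its parent.

    Assumes the crossref chain contains no cycles.
    """
    entry = dict(entries[key])
    parent_key = entry.get("crossref")
    if parent_key is None:
        return entry

    # Resolve the parent first so that chains (module -> library -> ...) work.
    parent = resolve_crossref(entries, parent_key)
    for field, value in parent.items():
        if field in NOT_INHERITED:
            continue
        # Fields already set on the child are kept and never overridden.
        entry.setdefault(field, value)
    return entry
```

For instance, given a parent `{"entrytype": "software", "title": "Example Library", "author": "Doe, Jane", "version": "3.0"}` stored under the key `lib`, and a child `{"entrytype": "softwaremodule", "crossref": "lib", "title": "Example Module", "version": "1.0"}`, the resolved child keeps its own title and version and picks up the author from the library.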
I believe that this may look like a niche need, but there is a tidal wave coming, for which we need to prepare.
Sorry for the long message, but I just realised that we never took the time to write down the story of how all this came up (there is a bit in the biblatex-software documentation, but not enough), and I got carried away :-)
> I believe that this may look like a niche need, but there is a tidal wave coming, for which we need to prepare.
So why not think of biblatex-software as a dyke to protect us from that wave? I suggest we stick with what we have in biblatex for now and let users rope in biblatex-software if they need it.
I once read that with software features "no is temporary, yes is forever". If we add in all of the biblatex-software data model now, we're stuck with it and have to maintain it even if it turns out only a small minority of people actually use the advanced features. If we stick with the status quo, people with simple software citation needs (including APA and Vancouver, thanks for the reference @zepinglee) can use the standard data model. Those who need more can load biblatex-software. Once it turns out that lots of people need the extended data model that biblatex-software offers and we are flooded with requests for that, we can always reconsider adding it in.
Indeed, I developed biblatex-software instead of opening an issue here precisely because it did not seem reasonable to propose a change to biblatex before doing appropriate field testing. Now, after three years on CTAN, we are rather confident that we have all that we need, and I feel it would be ok to propose a merge upstream: it is easier for users to have the functionality out of the box instead of loading an extra package, and the merge process would probably allow us to remove quite a bit of glue code.
But I am also perfectly fine with maintaining biblatex-software separately for the moment.
May I suggest that somewhere in the documentation a pointer to biblatex-software is added to help the users?
Or, if this goes against policy, that a point is made somewhere to make sure that if an evolution of the @software entry happens, it stays compatible with what biblatex-software does?
Recently, and in particular with the GitHub integration, the Citation File Format (CFF) has become popular for specifying metadata for software and/or code. According to Twitter responses, GitHub is also considering providing similar functionality that uses a bibtex file to provide the same kind of metadata, and/or exporting to bibtex if the metadata is provided in CFF. For these reasons, and to facilitate citing software in one's papers, it would be good in my opinion if biblatex's software type and CFF defined compatible standards. This would make automatic translation between the different formats straightforward. Most metadata fields in CFF have a corresponding field in biblatex, but not all. For example, 'version', 'commit', 'license' and 'repository' are missing in biblatex, see https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#index for an overview of all fields.
Background: We at JabRef are currently faced with the issue of importing CFF into bib(la)tex, and are unsure how to treat the metadata information with no equivalent fields. See https://github.com/JabRef/jabref/pull/7946 for work in progress.
Maybe the maintainers @sdruskat, @hainesr and @jspaaks of CFF have further input. Refs https://github.com/citation-file-format/ruby-cff/issues/48 and https://github.blog/2021-08-19-enhanced-support-citations-github/