Open mfenner opened 3 years ago
For a more in-depth discussion of popular citation styles that make sense here I would ping @adam3smith, @zuphilip, or @AbeJellinek.
I think the list above is generally good and captures the most important styles and types of styles, though I don't much care for "Harvard", which is just a label for any author-date style. Maybe go with "Harvard - Cite Them Right", the most commonly used such style in the UK (and one of the most downloaded styles at Zotero when they last provided some data on this). Since this already gives you a number of author-date styles, I'd use the Chicago (fullnote) version, not the author-date one.
Hi @mfenner and all. I think it would be great to add more citation styles - and the more we can do with existing gems the better.
I wonder if we should add these sorts of things in such a way that they can be optional. The reason for this is that (in my conversations with GitHub folks so far) we'd like to keep the number of extra dependencies as low as possible. GitHub is already fairly complex, I would imagine! (That said, I have just managed to reduce the current number of dependencies of ruby-cff by one, so maybe we can add one in return.) I note that adding citeproc-ruby would add four new dependencies in total.
So I think this would be a good conversation to have with @arfon when he's back from leave.
Any help with the development of this tool would be appreciated!
For me citationstyles.org and the various processors such as citeproc-ruby are the community default (e.g. used by many reference managers) and I wouldn't want to try to reproduce them with custom code. It is also stable and well-maintained code.
Happy to hear from @arfon on this, and I can write a pull request by next Monday to show more clearly how this would change the code.
Thanks @adam3smith.
For me citationstyles.org and the various processors such as citeproc-ruby are the community default (e.g. used by many reference managers) and I wouldn't want to try to reproduce them with custom code. It is also stable and well-maintained code.
Yes, I absolutely agree. I have an idea of how to reduce the current gem dependencies of ruby-cff further, so hopefully that will also help.
If we support a fixed list of citation styles, we can specifically import them instead of importing all 1000s of styles as a submodule.
Yes. I also had a thought about generalizing the supported styles within ruby-cff and making them pluggable as well. The current two (BibTeX and APA-like) were added rather at speed to show the concept and get it into GitHub rapidly, rather than in the finished way that I would usually go for.
I suggest supporting more citation styles. I think Crossref and DataCite (search.crossref.org and search.datacite.org) have a reasonable list of common citation styles plus BibTeX (and RIS) that can be displayed in the UI without too much extra effort:
I think the current design will probably support 2-3 more without having to do some kind of reworking of the UI, so I'd encourage us to keep this list shorter initially rather than introducing a new dependency on the GitHub design team.
That said, I agree ultimately we should be trying to use the existing libraries out there for CSL logic (e.g., citeproc-ruby).
If we support a fixed list of citation styles, we can specifically import them instead of importing all 1000s of styles as a submodule.
I'm assuming this would still introduce a new gem dependency here? As @hainesr alluded to, adding new dependencies to GitHub core is taken pretty seriously, and takes time for things such as security reviews.
Thanks @arfon. Two additional styles should work with the current UI (three with small adjustments to the tab width), and I would suggest adding these two:
They are both popular, cover different style classes (numeric and author-date, respectively), and are either used mainly in engineering or generic (APA comes from the psychology field).
I have made good progress with my PR, and I can add the two styles directly, so there is no need to use the csl-styles gem (which packages all styles into a Ruby gem), but this does add citeproc-ruby as a dependency. I don't think you can do formatted citations without CSL or Citeproc, as there are many years of work you take advantage of, including painful things such as how to display author names (a surprisingly complex topic) or rich text such as italic or superscript in titles. Supporting IEEE and Harvard in the same way as the current APA implementation is certainly more work than using citeproc-ruby. We can of course ask the citeproc-ruby author @inukshuk what he thinks regarding dependencies and potential security issues.
citeproc-ruby is the default implementation in Ruby, and I think extracting the core functionality into ruby-cff would create other issues, including long-term maintainability. But the ultimate decision, including the timing, is of course up to you. This page lists the open source and commercial applications using Citation Style Language (most of them not using the Ruby CSL processor), including popular reference managers Zotero, Mendeley, Papers and ReadCube.
I'd be more than happy to help land support for CSL via citeproc-ruby. Currently the implementation is spread across four gems: citeproc, citeproc-ruby, csl, and namae. The latter is used for name parsing and could be made optional if the names in CFF are already tokenized sufficiently.
Thank you @inukshuk. I am working on a pull request for a first citeproc-ruby implementation, using the "standard" approach. To show where I am going, I can post a WIP version no later than tomorrow morning. CFF has nice name tokenization.
Now that I have Citeproc/CSL working locally, I noticed a few issues with the built-in APA formatting, thanks to the nice test coverage. I opened a separate issue at https://github.com/citation-file-format/ruby-cff/issues/66.
I have a pull request that addresses what is discussed in this issue. More cleanup and testing is needed, but the basic functionality of supporting three popular citation styles via citeproc-ruby is working.
@hainesr @arfon if this goes in the right direction, I can polish this in the next few days. Let me know whether this should be an optional dependency or become the new default, e.g. to address #66.
@inukshuk almost everything interesting regarding citeproc-ruby happens at https://github.com/citation-file-format/ruby-cff/pull/67/files#diff-8f8e86f9c0d66b48d62cc552013cba786cff8ff8bca22183a4044cfed316066c
I'd be more than happy to help land support for CSL via citeproc-ruby. Currently the implementation is spread across four gems: citeproc, citeproc-ruby, csl, and namae. The latter is used for name parsing and could be made optional if the names in CFF are already tokenized sufficiently.
CFF supports person names with family-names, given-names, name-particle and name-suffix. Entities just have a name. I guess this would be sufficient for name parsing?
This sounds good, and there is more work to do on my side in mapping CFF to Citeproc. Currently my mapping in ruby-cff only supports family-names, given-names and name (for organizations/entities).
@sdruskat yes, at this granularity there will be no need for name parsing, and making namae optional would have no adverse impact.
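To make the mapping discussed above concrete, here is a minimal sketch of how the four CFF person fields and the entity name could be translated into CSL-JSON name fields (family, given, suffix, literal and the particle fields are part of CSL-JSON). The method name cff_author_to_csl is hypothetical, and mapping name-particle to non-dropping-particle rather than dropping-particle is an assumption, not something decided in this thread or taken from the pull request.

```ruby
# Sketch only: translate one CFF author entry (a Hash with string keys)
# into a CSL-JSON style name Hash. `cff_author_to_csl` is a hypothetical name.
CFF_TO_CSL_NAME_FIELDS = {
  'family-names'  => 'family',
  'given-names'   => 'given',
  # Assumption: CFF does not say whether the particle is "dropping" or not.
  'name-particle' => 'non-dropping-particle',
  'name-suffix'   => 'suffix'
}.freeze

def cff_author_to_csl(author)
  # Entities only carry a single `name`; CSL-JSON represents that as `literal`.
  return { 'literal' => author['name'] } if author.key?('name')

  CFF_TO_CSL_NAME_FIELDS.each_with_object({}) do |(cff_key, csl_key), csl_name|
    value = author[cff_key]
    # Skip fields that are absent or empty so no blank parts are rendered.
    csl_name[csl_key] = value unless value.nil? || value.empty?
  end
end
```

Because the names arrive already split into parts, nothing in this sketch needs a name parser.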
@mfenner the 'processor' interface is intended mainly for managing citations (e.g., creating cites in specific orders, tracking stuff like 'ibid' and similar details) and later generating references of all cited works; if I understand this correctly, we're going to be interested only in generating one-off reference strings for given citation data. In this case it will be best to use the 'renderer' interface directly, similar to how it's done in jekyll-scholar for example.
For performance reasons it will almost certainly be desirable to parse the styles only once, especially if there are only a handful of vetted styles which are going to be used. It also might be useful to reuse the renderer instance, though that should have less of an impact than parsing the CSL styles.
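The "parse each style only once" idea can be sketched independently of citeproc-ruby. The StyleCache class below is hypothetical and not part of ruby-cff or citeproc-ruby; the block passed to it stands in for whatever actually parses a CSL file, so this only illustrates the caching pattern under that assumption.

```ruby
# Sketch only: a small thread-safe cache that runs the (expensive) style
# parser at most once per style name. `StyleCache` is a hypothetical class.
class StyleCache
  # The block is any callable that turns a style name into a parsed style.
  def initialize(&loader)
    @loader = loader
    @styles = {}
    @lock = Mutex.new
  end

  # Returns the cached style, parsing it on first use only.
  def fetch(name)
    @lock.synchronize { @styles[name] ||= @loader.call(name) }
  end
end

# Stand-in loader: a real one would read and parse the CSL XML for `name`.
styles = StyleCache.new { |name| { name: name }.freeze }
styles.fetch('apa') # parsed here
styles.fetch('apa') # served from the cache
```

With only a handful of vetted styles, the cache stays tiny, and the same object can be shared by every request that renders a reference.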
Thanks @inukshuk. What general direction do you want to take: optimize the use of citeproc-ruby in ruby-cff, and/or make changes to the citeproc-ruby code?
I'm happy to make changes to citeproc-ruby in order to make it easier to adopt into ruby-cff. However, the issues I've raised above would have to be addressed either in ruby-cff or even further out, by apps using ruby-cff. As a library, I believe that the best solution for ruby-cff would be something like this:
citeproc-ruby and don't include any CSL styles by default
This way, ruby-cff has minimal dependencies and users have maximum flexibility. For example, I could install ruby-cff, citeproc-ruby, and csl-styles and then just format references using any of the official CSL styles by name, without worrying too much about fetching or updating individual styles.
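One common Ruby pattern for this kind of opt-in dependency is to attempt the require and only register the extra formatter when it succeeds. The sketch below is an illustration under assumptions: optional_require and FormatterRegistry are hypothetical names, not ruby-cff APIs, and the 'citeproc' require path is assumed rather than checked against the gem.

```ruby
# Sketch only: load a library if it is installed, and report the outcome.
def optional_require(library)
  require library
  true
rescue LoadError
  false
end

# Hypothetical registry of pluggable output formats.
class FormatterRegistry
  def initialize
    @formatters = {}
  end

  # Registers a format; when `requires:` names a library that cannot be
  # loaded, the format is skipped and false is returned.
  def register(name, requires: nil, &formatter)
    return false if requires && !optional_require(requires)

    @formatters[name] = formatter
    true
  end

  def available
    @formatters.keys
  end

  def render(name, data)
    formatter = @formatters.fetch(name) { raise ArgumentError, "unknown format: #{name}" }
    formatter.call(data)
  end
end

registry = FormatterRegistry.new
registry.register('plain') { |data| "#{data['title']} (#{data['version']})" }
# Assumption: 'citeproc' is the require path; the block body is a placeholder.
registry.register('csl', requires: 'citeproc') do |_data|
  raise NotImplementedError, 'CSL rendering would go here'
end
# 'csl' is listed in registry.available only if the library could be loaded.
```

The built-in formats keep working with no extra gems, and installing the optional gems is all a user has to do to unlock the CSL-based ones.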
Applications like GitHub, using ruby-cff, would likely want to make their own decisions for security and performance reasons. For example, you will probably want to use a limited set of styles and locales. You would not want to parse styles and locales for every reference you generate but just once (both styles and locales should be thread-safe; in any case you'd likely want to cache them somehow instead of parsing XML every time).
We could also attempt to make these decisions in ruby-cff, but I don't think a library is generally the right place to do this. I mean, the citeproc formatter could, by default, cache styles and locales and re-use a single renderer instance (or a thread-local one; some features of the renderer are stateful, although I believe it should be possible to make it thread-safe if you're only rendering single references and don't need to keep track of sort order, suppressed successive author names and the like). But I believe an app such as GitHub will want full control, e.g., when exactly to parse a style or locale, or even load marshaled instances instead of parsing them again, and I think a library such as ruby-cff should be able to facilitate that instead of making its own assumptions.
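A library can leave these decisions to the application by accepting the style source as a parameter instead of loading styles itself. The sketch below assumes hypothetical names throughout (ParsedStyle, ReferenceFormatter, style_source); ParsedStyle is a stand-in Struct, not a real CSL style object, and whether real parsed styles marshal cleanly is not verified here.

```ruby
# Hypothetical stand-in for a parsed style object.
ParsedStyle = Struct.new(:name, :rules)

# Sketch only: the library side takes any callable that maps a style name
# to a style and makes no assumption about parsing or caching.
class ReferenceFormatter
  def initialize(style_source:)
    @style_source = style_source
  end

  def style(name)
    @style_source.call(name)
  end
end

# Application side: parse the vetted styles once (e.g. at build time),
# keep the marshaled bytes, and load them at boot instead of re-parsing.
snapshot = Marshal.dump({ 'apa' => ParsedStyle.new('apa', ['author-date']) })
# Marshal.load must only ever be given data the application produced itself.
preloaded = Marshal.load(snapshot).freeze

formatter = ReferenceFormatter.new(style_source: ->(name) { preloaded.fetch(name) })
formatter.style('apa') # served from the preloaded set, no parsing involved
```

An application that prefers lazy parsing or a different cache can pass a different callable without any change on the library side.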
I suggest supporting more citation styles. I think Crossref and DataCite (search.crossref.org and search.datacite.org) have a reasonable list of common citation styles plus BibTeX (and RIS) that can be displayed in the UI without too much extra effort:
For this work I would use the citeproc-ruby gem and citationstyles.org citation style files. I am happy to do a pull request if that is the direction ruby-cff wants to go.