Open rowanc1 opened 2 years ago
See https://github.com/executablebooks/MyST-Parser/issues/511 😉
We still need info about `kind`, `num`, etc. (i.e. the things you crossed out) on the cite group, right?
I had something like:
type CitationGroup = {
  type: 'citationGroup';
  kind: 'narrative' | 'parenthetical'; // 'citet' vs 'citep'
  parentheses: boolean; // if false, 'citealt' and 'citealp' instead
  mode: 'year' | 'numerical';
  children: Citation[];
};
And even a single citation is a child of a citation group in the AST?
(Also, I like `citation` and `citationGroup` since these are "citations" not "cites" - but... that's more verbose and doesn't match natbib)
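To make the "single citation is still a child of a group" question concrete, here is a minimal sketch. The `Citation` shape and the `makeGroup` helper are assumptions for illustration only; the thread does not define them.

```typescript
// Hypothetical child node: the thread does not spell out its fields.
type Citation = {
  type: 'citation';
  key: string; // e.g. a BibTeX key
};

type CitationGroup = {
  type: 'citationGroup';
  kind: 'narrative' | 'parenthetical';
  parentheses: boolean;
  mode: 'year' | 'numerical';
  children: Citation[];
};

// Hypothetical helper: wraps one or more keys in a group, so a lone
// citation has exactly the same tree shape as several citations.
function makeGroup(keys: string[], kind: CitationGroup['kind']): CitationGroup {
  return {
    type: 'citationGroup',
    kind,
    parentheses: true,
    mode: 'year',
    children: keys.map((key): Citation => ({ type: 'citation', key })),
  };
}

const single = makeGroup(['jon22'], 'narrative');
const multiple = makeGroup(['key1', 'key2'], 'parenthetical');
```

With this shape a renderer only ever has to handle groups, at the cost of one extra wrapper node around every lone citation.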
For sure, I think citations should be a "first-class citizen" of MyST 👍
One thing that I do think is worth thinking about: do you actually need to restrict "citations" to just the conventional bibliography-type references?
Essentially, the abstraction is just a key (or keys) that references an external resource (bibtex, json, yaml, ...) which contains a dictionary of `key -> fields`, e.g.
key:
  field1: content
  field2: content
Glossaries (https://www.overleaf.com/learn/latex/Glossaries) are also essentially the same abstraction, as, to some extent, are substitutions (https://myst-parser.readthedocs.io/en/latest/syntax/optional.html#substitutions-with-jinja2); see also something I was playing around with: https://github.com/chrisjsewell/sphinx-glossary/blob/main/docs/index.md
Do you need different node types for all of these, or can it be "generalised"? Or at least share a parent interface?
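One way to picture the "shared parent interface" idea is the sketch below. Every name in it (`KeyedReference`, `CitationRef`, `GlossaryRef`, `ResourceTable`, `lookup`) is hypothetical and only illustrates the key -> fields abstraction; it is not an agreed mdast design.

```typescript
// The external resource: key -> fields, however it was loaded
// (bibtex, json, yaml, ...). All names here are hypothetical.
type Fields = Record<string, string>;
type ResourceTable = Record<string, Fields>;

// Parent interface: anything that points at an entry by key.
interface KeyedReference {
  type: string;
  key: string;
}

interface CitationRef extends KeyedReference {
  type: 'citation';
  locator?: string; // only meaningful for bibliography entries
}

interface GlossaryRef extends KeyedReference {
  type: 'glossaryTerm';
}

// Works for any node that shares the parent interface.
function lookup(table: ResourceTable, ref: KeyedReference): Fields | undefined {
  return table[ref.key];
}

const bibliography: ResourceTable = {
  jon22: { author: 'Jones', year: '2022' },
};
const glossary: ResourceTable = {
  ast: { name: 'AST', description: 'abstract syntax tree' },
};

const citeNode: CitationRef = { type: 'citation', key: 'jon22', locator: 'chap. 2' };
const termNode: GlossaryRef = { type: 'glossaryTerm', key: 'ast' };
```

The resolution step is shared, while citation-only fields such as `locator` stay on the citation type.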
Nice, I like those additions to the group @fwkoch -- the reason I also had `cite` is that it is an HTML element (see mdn), so it seemed like sticking close to html/latex here would be good. (Not sure about the group name though: in Curvenote we also use this group to wrap crossReferences, for example, which can collapse (Figure 1 & 2) while still having unique links to the content.)
@chrisjsewell, I think that citations are special/important enough to be their own mdast type, but maybe the syntax for creating them can be the same/extensible (which would be nice from a writing perspective). We are currently backing out our generalizations for citations in Curvenote after a few years: citations are special/weird enough to have their own dedicated type/apis/endpoints/etc. 🤷
Again, that is the mdast `cite` type only (e.g. `locator` isn't applicable to glossaries, or `mode=year` to abbreviations); I think the myst-syntax can be extensible though. 👍
Thanks for starting this discussion. I agree with everyone here that `citations` are `first class citizens` of any scientific document.
I think a lot of users will come from a LaTeX and `bibtex` background, so some basic LaTeX similarities would help, such as:
- a `{cite}` role (as we have)
- a style of references printed in the reference list to suit (e.g. Harvard etc.)
- a style of references in the text such as `[1]` and `Jones (2009)`
This combination covers pretty much most of my use of references from a LaTeX universe.
I also really like the flexibility being discussed here in adding more sophisticated references such as `pages`, `see` and `chapter` references. I agree that `natbib` is a good reference, and the concept of `metadata` for `roles` is an interesting idea. I wonder though if we added an optional extension syntax such as:
{cite}`jones1999 <<locator='chapter 2'>>`
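To show what such an extension could mean for a parser, here is a rough sketch that splits the role content into a key and its options. It assumes option values are wrapped in matching single quotes; the `<<...>>` form is only the proposal above, and `parseCiteContent` is a made-up name, not an existing MyST API.

```typescript
type ParsedCiteContent = { key: string; options: Record<string, string> };

// Hypothetical parser for role content such as:
//   jones1999 <<locator='chapter 2'>>
function parseCiteContent(content: string): ParsedCiteContent {
  // Key first (no whitespace or '<'), then an optional <<...>> block.
  const match = /^\s*([^\s<]+)\s*(?:<<(.*)>>)?\s*$/.exec(content);
  if (!match) {
    throw new Error(`Unrecognised cite content: ${content}`);
  }
  const options: Record<string, string> = {};
  const pair = /(\w+)='([^']*)'/g; // name='value' pairs
  let found: RegExpExecArray | null;
  while ((found = pair.exec(match[2] ?? '')) !== null) {
    options[found[1]] = found[2];
  }
  return { key: match[1], options };
}

const parsedContent = parseCiteContent("jones1999 <<locator='chapter 2'>>");
// parsedContent.key is 'jones1999'; parsedContent.options.locator is 'chapter 2'
```

A bare key with no `<<...>>` block simply yields an empty options object.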
My other wish list item would be support for `.bib` (bibtex) files as a source of data for the citations, as I know a lot of authors who have invested in `bib` collections; in addition, there are a lot of webpages that now provide copy-and-paste bibtex entries.
Also, this must be a `javascript` thing, but I don't fully see why we need both an object name and a type defined:
type CiteGroup = {
  type: 'citeGroup'
I guess you can't do the equivalent of `isinstance()` as done in python?
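On the `isinstance()` point: TypeScript type aliases are erased at compile time, so there is nothing to check against at runtime, and mdast trees are usually plain JSON objects rather than class instances. The `type` string is the part that survives, and TypeScript can narrow on it. A minimal sketch, with node shapes reduced to hypothetical essentials:

```typescript
// Reduced, hypothetical node shapes for illustration.
type CiteNode = { type: 'cite'; key: string };
type CiteGroupNode = { type: 'citeGroup'; children: CiteNode[] };
type InlineNode = CiteNode | CiteGroupNode;

// The `type` field acts as the discriminant: inside each case the
// compiler knows which shape `node` has.
function collectKeys(node: InlineNode): string[] {
  switch (node.type) {
    case 'citeGroup':
      return node.children.map((child) => child.key);
    case 'cite':
      return [node.key];
  }
}

// A tree that arrived as JSON carries no class information,
// only the `type` strings.
const parsedTree = JSON.parse(
  '{"type":"citeGroup","children":[{"type":"cite","key":"key1"},{"type":"cite","key":"key2"}]}'
) as InlineNode;
const collected = collectKeys(parsedTree); // ['key1', 'key2']
```

So the alias name (`CiteGroupNode`) exists only for the compiler, while the `type` value is what tools in other languages, Python included, can read from the serialized tree.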
Currently doing some investigation on citations and thought I would post it here, as it would be great to get on the same page for the data structures for citations in mdast (I think there is probably more thought on the myst-syntax: do we adopt `[@key]` pandoc-style citations, etc.). I would love to be aiming for the same place for the mdast data structures as the other syntax conversations evolve.
For a piece of technical content, the best practices for in-text citations are probably latex/natbib and pandoc citations, which are defined here:
I think the following mdast data structures might capture everything:
I think this works pretty well and can fit with the {cite:t}`jon22` syntax we already have defined, but maybe in the future there is some way to give roles more data. For example, {cite:p}[prefix="see", locator="chap. 2"]`jon22` would yield: (see Jones et al., 2022, chap. 2)
Or maybe there is a specialized way to do this with `[see @jon22, chap. 2]` (see pandoc).
For multiple citations, the `citeGroup` would never be a directive or be in the markup (i.e. `[@key1; @key2]` or {cite:p}`key1; key2`), but I think that the AST data structure is better represented by multiple nodes, with one holding the group (parenthetical) information. This also means UIs can open groups of citations in a list (e.g. see distill/elife as good examples of this UI).
Both `cite` and `citeGroup` would be flow content, so the equivalent of a "citet" in latex is just a cite node in a paragraph (`@key1` in pandoc style).
Some questions:
- `citeGroup`? (previously suggested a single cite node, splitting into group solves this).
- Should we follow `kind` or have some different flags like `parenthetical`? I suggested `kind` because that seemed easier to expand in the future if we add `num` or `alt` etc.
- The `narrative` and `parenthetical` nomenclature comes from here
Existing implementations:
Would be curious on your thoughts @chrisjsewell and @fwkoch (maybe @mmcky as well?)!
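As a rough illustration of how a group node with `prefix` and `locator` could turn into the "(see Jones et al., 2022, chap. 2)" output mentioned above, here is a sketch. The node shapes are assumptions: `prefix`, `locator` and `kind` are names from this thread, while `label` is a hypothetical stand-in for whatever a bibliography lookup would return, and only the parenthetical case is handled.

```typescript
// Hypothetical node shapes, not a settled mdast spec.
type RenderCite = {
  type: 'cite';
  key: string;
  label: { authors: string; year: string }; // assumed to be resolved already
  prefix?: string;
  locator?: string;
};

type RenderCiteGroup = {
  type: 'citeGroup';
  kind: 'parenthetical';
  children: RenderCite[];
};

// Render every child as "prefix Authors, year, locator", join the
// children with "; ", and wrap the whole group in one set of parentheses.
function renderParenthetical(group: RenderCiteGroup): string {
  const parts = group.children.map((cite) => {
    const core = `${cite.label.authors}, ${cite.label.year}`;
    const withPrefix = cite.prefix ? `${cite.prefix} ${core}` : core;
    return cite.locator ? `${withPrefix}, ${cite.locator}` : withPrefix;
  });
  return `(${parts.join('; ')})`;
}

const exampleGroup: RenderCiteGroup = {
  type: 'citeGroup',
  kind: 'parenthetical',
  children: [
    {
      type: 'cite',
      key: 'jon22',
      label: { authors: 'Jones et al.', year: '2022' },
      prefix: 'see',
      locator: 'chap. 2',
    },
  ],
};

const rendered = renderParenthetical(exampleGroup);
// rendered === '(see Jones et al., 2022, chap. 2)'
```

Because the parentheses belong to the group rather than to each `cite`, several keys share a single pair of brackets, which is the main argument for splitting group and cite into separate nodes.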