jgm / citeproc

CSL citation processing library in Haskell
BSD 2-Clause "Simplified" License

Add support for semantic markup in bibliographies #132

Open frederik-elwert opened 1 year ago

frederik-elwert commented 1 year ago

For some output formats, it is desirable to have not only a formatted bibliography but also semantic markup that identifies the parts of each entry (e.g., title, author, publisher, …). Pandoc supports this for JATS, but not for other output formats such as TEI (https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-bibl.html) or HTML (https://schema.org/CreativeWork).

The old pandoc-citeproc processor supported this via raw content elements as an extension to CSL, which allowed citation styles to specify bits of semantic markup to be added to the output. An example for TEI using the old processor can be found here: https://github.com/frederik-elwert/teicite.

It would be desirable if citeproc either re-implemented the CSL extension or (possibly together with pandoc) provided an alternative way to add semantic markup to bibliographies.

tarleb commented 1 year ago

Could it make sense to somehow preserve the names of macros by wrapping the contents in a span with the macro's name? That would make it easy to post-process the output with a filter.
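To make the post-processing idea concrete, here is a minimal sketch of such a filter working on pandoc's JSON AST. It assumes a convention that does not exist today: that citeproc would emit Span elements with a class of the form `csl-<name>`. The `csl-` prefix, the `TEI_ELEMENTS` mapping, and the function names are all hypothetical, and the use of `"tei"` as the raw format name is an assumption about the TEI writer.

```python
# Hypothetical convention: labeled output arrives as Span elements whose
# class is "csl-" plus a macro or variable name. citeproc does not emit this.
PREFIX = "csl-"

# Illustrative mapping from the hypothetical labels to TEI element names.
TEI_ELEMENTS = {"title": "title", "author": "author", "publisher": "publisher"}


def raw_tei(text):
    """Build a pandoc RawInline node (format name "tei" is an assumption)."""
    return {"t": "RawInline", "c": ["tei", text]}


def tei_name(node):
    """Return the TEI element for a labeled Span, or None for anything else."""
    if isinstance(node, dict) and node.get("t") == "Span":
        classes = node["c"][0][1]  # Attr is [identifier, classes, key-values]
        for cls in classes:
            if cls.startswith(PREFIX):
                return TEI_ELEMENTS.get(cls[len(PREFIX):])
    return None


def rewrite(node):
    """Walk the AST and replace each labeled Span by <tag>contents</tag>."""
    if isinstance(node, list):
        out = []
        for item in node:
            name = tei_name(item)
            if name is None:
                out.append(rewrite(item))
            else:
                # Splice the Span's inlines into the parent list between
                # an opening and a closing raw tag.
                out.append(raw_tei(f"<{name}>"))
                out.extend(rewrite(item["c"][1]))
                out.append(raw_tei(f"</{name}>"))
        return out
    if isinstance(node, dict):
        return {key: rewrite(value) for key, value in node.items()}
    return node
```

Used as a JSON filter, the script would read the document with `json.load(sys.stdin)`, pass it through `rewrite`, and write the result with `json.dump(..., sys.stdout)`. Spans without a recognized label are left untouched.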

jgm commented 1 year ago

Can't do this with the current API. At the least, we'd need new methods for the CiteprocOutput class that would create a labeled Span in pandoc output.
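The shape of such an API change can be sketched roughly as follows. This is a toy model in Python, not citeproc's actual Haskell class: `Output`, `labeled`, `PlainOutput`, `SpanOutput`, and `render_entry` are hypothetical names chosen for illustration. The point is that the renderer passes a label through an extra hook, and each output type decides whether to keep it.

```python
from abc import ABC, abstractmethod


class Output(ABC):
    """Toy stand-in for an output abstraction; not citeproc's real class."""

    @abstractmethod
    def text(self, s):
        """Turn a plain string into this output type."""

    @abstractmethod
    def concat(self, parts):
        """Join several rendered parts into one."""

    def labeled(self, label, content):
        """Hypothetical new hook; by default the label is dropped."""
        return content


class PlainOutput(Output):
    """Plain strings: inherits the default hook and ignores labels."""

    def text(self, s):
        return s

    def concat(self, parts):
        return "".join(parts)


class SpanOutput(Output):
    """Pandoc-style inline nodes: keeps each label as a Span class."""

    def text(self, s):
        return [{"t": "Str", "c": s}]

    def concat(self, parts):
        return [inline for part in parts for inline in part]

    def labeled(self, label, content):
        return [{"t": "Span", "c": [["", [label], []], content]}]


def render_entry(out, entry):
    """Render 'Author. Title.' and label each part with a name."""
    return out.concat([
        out.labeled("author", out.text(entry["author"])),
        out.text(". "),
        out.labeled("title", out.text(entry["title"])),
        out.text("."),
    ])
```

With `PlainOutput`, `render_entry` yields an ordinary string and the labels vanish; with `SpanOutput`, the same call yields inline nodes in which the author and title are wrapped in labeled Spans that a later filter could pick up.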

frederik-elwert commented 1 year ago

I’m not sure preserving the macro names would be the best approach, since the macro names are basically up to the style authors. Ideally, I guess, the variable names themselves would be preserved for the different output elements (names, text, etc.).