jgm / citeproc

CSL citation processing library in Haskell
BSD 2-Clause "Simplified" License

Custom Field in CSLJson silently dropped #137

Open AliaumeL opened 1 year ago

AliaumeL commented 1 year ago

Expected behavior

Given a valid CSLJson file such as the one below, the custom-field entry inside the custom object should be parsed and included in the bibliographic entries.

[
  {
    "id": "citation_id",
    "type": "manuscript",
    "custom": {
      "custom-field": "custom-value"
    }
  }
]

The issue came to my attention when using pandoc to create a (custom) list of bibliographic entries for a curriculum vitae, where extra fields represent additional material (slides, preprint, arXiv version). This is currently not possible because citeproc ignores extra fields. Keeping this information together with the bibliography and feeding it to pandoc may not be the correct way to generate this list. Nevertheless, this behavior does not comply with the CSLJson specification.

Further explanation

The CSLJson schema includes a specific field named custom that lets users add extra information to bibliographic entries. This is specified on line 444 of the csl-data schema, which reads:

"custom": {
  "title": "Custom key-value pairs.",
  "type": "object",
  "description": "Used to store additional information that does not have a designated CSL JSON field. The custom field is preferred over the note field for storing custom data, particularly for storing key-value pairs, as the note field is used for user annotations in annotated bibliography styles.",
  "examples": [
    {
      "short_id": "xyz",
      "other-ids": ["alternative-id"]
    },
    {
      "metadata-double-checked": true
    }
  ]
}

However, citeproc silently ignores such fields because of line 902 of Types.hs, where the conversion from JSON drops "unknown" fields whose values are neither strings nor numbers. The issue can be traced back to the definition of allowed field types, which does not include dictionaries.
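
To make the described behavior concrete, here is a minimal sketch in Python. It is not citeproc's code: keep_scalar_fields is a hypothetical stand-in that only mirrors the rule stated above (keep values that are strings or numbers, drop everything else), applied to the example entry from this report.

```python
import json

# The example entry from this report.
CSL_ITEM = json.loads("""
{
  "id": "citation_id",
  "type": "manuscript",
  "custom": {"custom-field": "custom-value"}
}
""")


def keep_scalar_fields(item):
    """Hypothetical filter (an assumption, not citeproc's implementation):
    keep only fields whose value is a string or a number."""
    kept = {}
    dropped = []
    for key, value in item.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if isinstance(value, str) or is_number:
            kept[key] = value
        else:
            # Nested objects such as "custom" end up here without any warning.
            dropped.append(key)
    return kept, dropped


kept, dropped = keep_scalar_fields(CSL_ITEM)
print(kept)
print(dropped)
```

Under this rule the whole custom object disappears, together with every key-value pair inside it, and nothing is reported to the user.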

Potential solution

I have no idea how to solve this issue, but it may be that documenting the fact that "custom" fields are ignored is enough to close it.

jgm commented 1 year ago

How are these custom fields supposed to be used? Is it possible to access their content in a CSL style? If not, what did you expect pandoc to do with them?