jgm / pandoc

Universal markup converter
https://pandoc.org

Add YAML bibliography writer #9617

Open hseg opened 6 months ago

hseg commented 6 months ago

Describe your proposed improvement and the problem it solves.

Since pandoc expects a bibliography embedded in the front matter to be YAML, it would make sense to have a cslyaml writer in addition to the csljson one.

Describe alternatives you've considered.

Given the suggestion in #8801, I've looked around for a YAML CSL style, but haven't found anything. Of course, there's always my current workaround of

pandoc -f bibtex -t csljson \
  | yq -y 'walk(if type=="object" then with_entries(.key|=ascii_downcase) else . end) | map(pick(.id, .[keys[]]))'

but I'd prefer not to have to do this kind of postprocessing.
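
For readers who don't use jq, here is a rough Python reading of what that filter appears to do: lowercase every object key at every nesting level, then rebuild each entry with `id` first and the remaining keys in sorted order. The function names are hypothetical, and this is a sketch of the filter's apparent behavior, not a tested equivalent (for instance, `str.lower` is Unicode-aware while jq's `ascii_downcase` only touches ASCII letters).

```python
import json


def lowercase_keys(value):
    """Recursively lowercase the keys of every dict in JSON-like data."""
    if isinstance(value, dict):
        return {key.lower(): lowercase_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [lowercase_keys(item) for item in value]
    return value


def id_first(entry):
    """Return a copy of one entry with "id" first, other keys sorted."""
    ordered = {"id": entry.get("id")}  # a missing id becomes None (null)
    for key in sorted(entry):
        ordered.setdefault(key, entry[key])  # never overwrites "id"
    return ordered


def normalize(entries):
    """Apply both steps to a list of CSL JSON entries."""
    return [id_first(lowercase_keys(entry)) for entry in entries]


if __name__ == "__main__":
    # Hand-written sample data, not real pandoc output.
    sample = [{"Type": "book", "ID": "doe2020", "Author": [{"Family": "Doe"}]}]
    print(json.dumps(normalize(sample), indent=2))
```

The YAML serialization step (`yq -y`) is not covered here; this only mirrors the reshaping of the data.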

jgm commented 6 months ago

pandoc -f bibtex -s -t markdown should do the job!

hseg commented 6 months ago

That it does! Modulo the minor annoyance that it also generates a nocite key and surrounding lines, but that can be coped with. This could stand to be documented more clearly, but other than that this can be closed.
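
One way to cope with the extra lines is a small post-processing step. The sketch below assumes the output is a single metadata block that opens with `---`, closes with `---` or `...`, and contains a top-level `nocite` key next to `references`; the function name `extract_yaml_block` and the sample text are made up for illustration. It works on plain text and does not parse YAML, so it only handles this simple layout.

```python
def extract_yaml_block(markdown_text, drop_keys=("nocite",)):
    """Return the body of a leading YAML metadata block as plain YAML.

    The opening "---" and closing "---" or "..." lines are removed, and any
    top-level key listed in drop_keys is dropped together with its value.
    """
    lines = markdown_text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("input does not start with a YAML metadata block")

    # Find the first closing delimiter after the opening one.
    end = next(
        (i for i in range(1, len(lines)) if lines[i].strip() in ("---", "...")),
        None,
    )
    if end is None:
        raise ValueError("metadata block is not closed")

    kept = []
    skipping = False
    for line in lines[1:end]:
        # A top-level key starts at column 0 and is not a "- " list item;
        # everything else belongs to the key that precedes it.
        if line and line[0] not in " \t-":
            skipping = line.split(":", 1)[0].strip() in drop_keys
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hand-written sample in the assumed shape, not captured pandoc output.
    sample = (
        "---\n"
        'nocite: "[@*]"\n'
        "references:\n"
        "- id: doe2020\n"
        "  title: Example\n"
        "  type: book\n"
        "---\n"
    )
    print(extract_yaml_block(sample), end="")
```

Under those assumptions it could be fed from something like `pandoc -f bibtex -s -t markdown refs.bib`, where `refs.bib` is a placeholder file name.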

jgm commented 6 months ago

One issue is that, as far as I know, CSL YAML isn't specified anywhere.

-t markdown will render the field contents as Markdown, and that probably isn't wanted. But I don't know whether it's stated anywhere what the format should be in CSL YAML. If it should be exactly as in CSL JSON, then perhaps a new writer is called for.

jgm commented 6 months ago

@bdarcus may be able to offer some illumination on the status of "CSL YAML."

bdarcus commented 6 months ago

> @bdarcus may be able to offer some illumination on the status of "CSL YAML."

Unfortunately, its status is still really uncertain, along with everything else in the CSL 1.1 branch (this thread covers the questions we discussed in 2022 for those curious).

But I can say this:

These days, one can use JSON Schemas to validate YAML files.

So the schema there is designed for both:

https://github.com/citation-style-language/schema/blob/v1.1/schemas/input/csl-data.json

It's just that we have no process by which we actually approve and release that and the other changes. So in practice, CSL is frozen, and along with it CSL YAML/JSON.

One could use the current JSON Schema for validating YAML, but it's less than ideal I think.

PS - In my experimental radical rewrite of CSL, I also rewrote the input model; it borrows a few ideas from the 1.1 version (YAML example).

hseg commented 6 months ago

> One issue is that, as far as I know, CSL YAML isn't specified anywhere.

Indeed, that's been a point of confusion for me, and one reason I've stuck to pretty-printing the CSL JSON.

BTW, isn't -f bibtex -st markdown supposed to create a YAML metadata block? In particular, isn't the last line supposed to be ..., not ---?

jgm commented 6 months ago

Somewhere along the way, I got convinced to use --- instead of ... for the final delimiter for YAML metadata -- I think the reason was increased compatibility with other md implementations, but I can't remember.

jgm commented 6 months ago

It would be fairly trivial to modify the existing CslJson writer and reader to add readCslYaml and writeCslYaml (where these would just use YAML instead of JSON, everything else being the same). IF there is actually a demand for this...
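
To illustrate the "YAML instead of JSON, everything else being the same" idea outside pandoc, here is a minimal Python sketch (pandoc's writers are Haskell, and this is not how pandoc does or would implement it). It takes already-parsed CSL JSON data and emits block-style YAML, writing every key and scalar in JSON syntax, which YAML 1.2 accepts as flow scalars. The function name `to_yaml_lines` and the sample entry are hypothetical.

```python
import json


def to_yaml_lines(value, indent=0):
    """Return block-style YAML lines for JSON-compatible data.

    Keys and scalars are written with json.dumps, so strings are always
    double-quoted and null/true/false keep their JSON spelling.
    """
    pad = "  " * indent
    if isinstance(value, dict) and value:
        lines = []
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}{json.dumps(str(key))}:")
                lines.extend(to_yaml_lines(item, indent + 1))
            else:
                lines.append(f"{pad}{json.dumps(str(key))}: {json.dumps(item)}")
        return lines
    if isinstance(value, list) and value:
        lines = []
        for item in value:
            if isinstance(item, (dict, list)) and item:
                nested = to_yaml_lines(item, indent + 1)
                # Pull the first nested line up onto the "- " marker line.
                lines.append(f"{pad}- {nested[0].lstrip()}")
                lines.extend(nested[1:])
            else:
                lines.append(f"{pad}- {json.dumps(item)}")
        return lines
    # Scalars, plus empty containers, which json.dumps renders as {} or [].
    return [f"{pad}{json.dumps(value)}"]


if __name__ == "__main__":
    # Hand-written sample entry in CSL JSON shape.
    entry = {
        "id": "doe2020",
        "type": "book",
        "author": [{"family": "Doe", "given": "Jane"}],
        "issued": {"date-parts": [[2020]]},
    }
    print("\n".join(to_yaml_lines([entry])))
```

Quoting everything keeps the sketch short and avoids YAML's implicit typing rules for plain scalars; a real writer would presumably use a YAML library and emit unquoted scalars where that is safe.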

bpj commented 6 months ago

@bdarcus I even write my JSON schemas in YAML. I use a Perl script using YAML/JSON/JSON Schema modules available on CPAN to load both data and schemas and then validate. In fact it is possible to validate any data representable as JSON once you have it loaded into a Perl data structure!

bdarcus commented 6 months ago

> @bdarcus I even write my JSON schemas in YAML.

I started to do that with an experiment in that branch.

https://github.com/citation-style-language/schema/blob/v1.1/schemas/input/csl-rich-text.yaml

Definitely nicer.

But the CSLN schemas are generated from the Rust models, and I use serde to read the JSON and YAML.