kaushalmodi / ox-hugo

A carefully crafted Org exporter back-end for Hugo
https://ox-hugo.scripter.co
GNU General Public License v3.0

citations with pandoc/pandoc-citeproc #175

Closed mclearc closed 6 years ago

mclearc commented 6 years ago

Great package - thanks for all your work! I was wondering if there is any way to get ox-hugo to leverage pandoc/pandoc-citeproc so that pandoc's use of citations and citation keys could be utilized? Ideally, it might involve something like an option to call pandoc-citeproc to convert citation keys in the subtree to full citations in the markdown. If I knew any elisp I would help, but unfortunately....

kaushalmodi commented 6 years ago

Hello!

I have used pandoc, but I am not familiar with the citations stuff.

This is probably out of the scope of ox-hugo, but let's see if we can find a solution to this.

It's possible that there are some Elisp packages out there that do the pandoc citation parsing you want.. I would guess that ox-pandoc probably does this. So let's find out if that's the case.

If that's the case, I can probably add a hook in ox-hugo that you can customize to call that pandoc processing function.

I am also copying @kawabata, the author of ox-pandoc, to see if they can provide some insight here.

There are lots of "if"s and unknowns involved.. let's see where we get :)


Update: I am now working on this. See the action items here: https://github.com/kaushalmodi/ox-hugo/issues/175#issuecomment-405987796.

Update 2: WIP branch pandoc-citeproc-support

mclearc commented 6 years ago

Thanks - citations look like this [@citekey1; @citekey2]. ox-pandoc would certainly be able to parse the citations, so a simple hook would be perfect.

kaushalmodi commented 6 years ago

@mclearc Sorry, but it's not yet clear to me. I'll need a comprehensive example in order to implement this and test it. That example will eventually make it into the ox-hugo test suite.

mclearc commented 6 years ago

Ok - so in order to convert citations you will need to specify a bibliography file (usually a latex .bib file) using the bibliography metadata field in a YAML metadata section, or --bibliography command line argument. ox-pandoc does this with #+BIBLIOGRAPHY: ~/path/to/bib.bib line in the document.

I assume the easiest approach is to have pandoc generate a Markdown file in which the citations have already been rendered, rather than trying to do it at the markdown-->html stage. The option PANDOC_EXTENSIONS: markdown-citations allows this.

I've included an org file with these options and some other relevant ones, as well as a sample bib file. I've also included a version of what the ideal end result would look like. I'm sure I've probably left out something important so just let me know what you need. Thanks for the interest in working on this!
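A minimal sketch of the pandoc command line these options correspond to, for anyone who wants to try the conversion outside Emacs. It assumes a pandoc 2.x release of that period with the separate pandoc-citeproc filter on PATH; the helper name `build_pandoc_command` and the file names are hypothetical, and nothing here is taken from ox-pandoc's own code.

```python
import shlex


def build_pandoc_command(src, dest, bibliography):
    """Return an argv list for a Markdown -> Markdown citation pass.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        "pandoc",
        "--from=markdown",
        # Disabling the `citations` extension on the output format makes
        # pandoc write out rendered citations instead of [@key] syntax.
        "--to=markdown-citations",
        "--filter=pandoc-citeproc",
        f"--bibliography={bibliography}",
        # Same effect as `#+PANDOC_OPTIONS: atx-headers:t` in the Org file.
        "--atx-headers",
        f"--output={dest}",
        src,
    ]


if __name__ == "__main__":
    cmd = build_pandoc_command("post.md", "post.cited.md", "example.bib")
    print(shlex.join(cmd))
```

Later pandoc releases replaced the external filter with a built-in `--citeproc` option and deprecated `--atx-headers`, so the flags would need adjusting there.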

kaushalmodi commented 6 years ago

Thanks for those examples.

Unfortunately, supporting this seems to be a lot more involved. You can either use ox-pandoc or ox-hugo to generate the Markdown, but not both.

How the exporter works is that it passes each Org element through the user-specified exporter. So an Org bold element *foo* will become **foo** when passed through ox-hugo, ox-md, etc., but instead become <b>foo</b> when passed through ox-html exporter.
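As a toy model of that dispatch (not how Org's exporter is actually written; the table and function names are made up for illustration), each back-end can be pictured as a mapping from element type to a transcoder function:

```python
# Hypothetical, simplified picture: one transcoder table per back-end.
BACKENDS = {
    "md": {"bold": lambda contents: f"**{contents}**"},
    "html": {"bold": lambda contents: f"<b>{contents}</b>"},
}


def export_element(backend, element_type, contents):
    """Pass one parsed element through the chosen back-end's transcoder."""
    transcoders = BACKENDS[backend]
    return transcoders[element_type](contents)


if __name__ == "__main__":
    print(export_element("md", "bold", "foo"))
    print(export_element("html", "bold", "foo"))
```

The point of the sketch is that exactly one table is consulted per export, which is why output from two back-ends cannot simply be mixed for the same element.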

AFAIK, Org doesn't have a special element for syntax like @loncar2016. Maybe I could just do a search/replace of @something with the citation, based on some internal function in ox-pandoc. But that sounds too risky, and I'd need to do some research on ox-pandoc + pandoc.
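A small sketch of why a blind search/replace is risky. The regex, the `naive_replace` helper, and the lookup table are all assumptions made for illustration; they are not part of ox-pandoc or ox-hugo.

```python
import re

# Naive pattern: an "@" followed by word characters is treated as a cite key.
CITE_KEY = re.compile(r"@(\w+)")


def naive_replace(text, citations):
    """Replace every @key found in `citations`; leave unknown keys alone."""
    def substitute(match):
        key = match.group(1)
        return citations.get(key, match.group(0))

    return CITE_KEY.sub(substitute, text)


if __name__ == "__main__":
    citations = {"loncar2016": "Loncar (2016)"}
    # Intended case: a real in-text citation.
    print(naive_replace("(e.g. @loncar2016).", citations))
    # Unintended cases: the pattern has no idea about context.
    print(naive_replace("`@loncar2016` is the key", citations))
    print(naive_replace("user@loncar2016.example", citations))
```

The last two calls rewrite text inside a code span and inside an address-like string, which is the kind of collateral damage a context-free replacement cannot avoid.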

Also looks like you need support for #+pandoc_* options, so that adds yet another layer of complication.

I will leave this issue open hoping that someone using ox-pandoc + ox-hugo has a good idea on how to integrate this feature.

I have not used it myself, but have heard a lot about org-ref, though it is currently broken for markdown exporter backends (https://github.com/jkitchin/org-ref/issues/558).

kaushalmodi commented 6 years ago

For reference, here are the contents of that dropbox link:

Org

org-citation-example.org

```org
#+TITLE: Org Pandoc Citation Test Example
#+DATE: July 14, 2018
#+OPTIONS: author:nil
#+BIBLIOGRAPHY: example.bib
#+PANDOC_EXTENSIONS: markdown-citations
#+PANDOC_OPTIONS: atx-headers:t
#+NOCITE: @giovanelli2016; @eilan2016

* Test Header

Here is a test example file with an in-text citation where someone important says something important (e.g. @loncar2016). And here is another bit of blah with a footnote citation.[fn:1]

Note that the setting =PANDOC_OPTIONS= allows one to pass command line settings to pandoc via =ox-pandoc=. In this case, the setting passed converts citations into markdown.

Note also the =NOCITE= comment in the header, which allows citations to appear in the references section that have not actually appeared in the body of the text.

[fn:1] See [@thompson2016].

* References
:PROPERTIES:
:UNNUMBERED:
:END:
```

Markdown

org-citation-example.md

```md
+++
title = "Org Pandoc Citation Test Example"
draft = false
author = false
toc = false
+++

# Test Header

Here is a test example file with an in-text citation where someone important says something important (e.g. Loncar (2016)). And here is another bit of blah with a footnote citation.[^1]

Note that the setting `PANDOC_OPTIONS` allows one to pass command line settings to pandoc via `ox-pandoc`. In this case, the setting passed converts citations into markdown.

Note also the `NOCITE` comment in the header, which allows citations to appear in the references section that have not actually appeared in the body of the text.

# References {#references .unnumbered}

Eilan, Naomi. 2016. "You Me and the World." *Analysis* 76 (3): 311--24.

Giovanelli, Marco. 2016. "\"\...But I Still Can't Get Rid of a Sense of Artificiality\" the Reichenbach--Einstein Debate on the Geometrization of the Electromagnetic Field." *Studies in History and Philosophy of Science* 54: 35--51.

Loncar, Samuel. 2016. "Why Listen to Philosophers? A Constructive Critique of Disciplinary Philosophy." *Metaphilosophy* 47 (1): 3--25.

Thompson, Morgan, Toni Adleberg, Sam Sims, and Eddy Nahmias. 2016. "Why Do Women Leave Philosophy? Surveying Students at the Introductory Level."

[^1]: See (Thompson et al. 2016).
```

Bib

example.bib

```bib
@article{giovanelli2016,
  title = {"...{{But I}} Still Can't Get Rid of a Sense of Artificiality" {{The Reichenbach}}–{{Einstein}} Debate on the Geometrization of the Electromagnetic Field},
  volume = {54},
  journaltitle = {Studies in History and Philosophy of Science},
  date = {2016},
  pages = {35--51},
  keywords = {Einstein,physics,science},
  author = {Giovanelli, Marco},
  file = {/Users/roambot/Dropbox/Work/MasterLib/giovanelli2016_the_reichenbach–einstein_debate.pdf}
}

@article{eilan2016,
  title = {You {{Me}} and the {{World}}},
  volume = {76},
  number = {3},
  journaltitle = {Analysis},
  date = {2016},
  pages = {311--324},
  author = {Eilan, Naomi},
  file = {/Users/roambot/Dropbox/Work/MasterLib/eilan2016_you_me_and_the_world.pdf}
}

@article{loncar2016,
  title = {Why {{Listen}} to {{Philosophers}}? {{A Constructive Critique}} of {{Disciplinary Philosophy}}},
  volume = {47},
  number = {1},
  journaltitle = {Metaphilosophy},
  date = {2016},
  pages = {3--25},
  author = {Loncar, Samuel},
  file = {/Users/roambot/Dropbox/Work/MasterLib/loncar2016_why_listen_to_philosophers.pdf}
}

@article{thompson2016,
  title = {Why {{Do Women Leave Philosophy}}? {{Surveying Students}} at the {{Introductory Level}}},
  abstract = {Abstract Although recent research suggests that women are underrepresented in philosophy after initial philosophy courses, there have been relatively few empirical investigations into the factors that lead to this early drop-off in women's representation. In this paper, we.},
  date = {2016},
  author = {Thompson, Morgan and Adleberg, Toni and Sims, Sam and Nahmias, Eddy},
  file = {/Users/roambot/Dropbox/Work/MasterLib/thompson2016_why_do_women_leave_philosophy.pdf}
}

@article{fricker2016,
  title = {What's the {{Point}} of {{Blame}}? {{A Paradigm Based Explanation}}: {{What}}'s the {{Point}} of {{Blame}}},
  volume = {50},
  number = {1},
  journaltitle = {Noûs},
  date = {2016},
  pages = {165--183},
  author = {Fricker, Miranda},
  file = {/Users/roambot/Dropbox/Work/MasterLib/fricker2016_what's_the_point_of_blame.pdf}
}
```

mclearc commented 6 years ago

Thanks for looking into this. I'll post any useful workarounds/workflows here if I think of them.

kaushalmodi commented 6 years ago

Thanks. Yesterday I spent some time tinkering with this. I thought I almost had it, but ran into a frustrating outcome (details are in the issue I posted on the Pandoc Google Groups).

kaushalmodi commented 6 years ago

@mclearc Question: How did you end up with that Markdown file in https://github.com/kaushalmodi/ox-hugo/issues/175#issuecomment-405721433?

Did you start with the Org file you posted, and end up with Markdown file through a pandoc command, or a particular ox-pandoc export? If so, what was that pandoc command or ox-pandoc export binding (C-c C-e p ? ?)?

Or did you first export and then manually tweak that Markdown file?

mclearc commented 6 years ago

I first exported then manually tweaked. It dawned on me that Hugo reads yaml front matter so I didn't even really need to produce a toml section. As for the command I used, it was just some version of org-pandoc-export-to-markdown. I think the trick is going to be retaining the footnote links while converting the citations. If I come up with a winning combo I'll let you know.

mclearc commented 6 years ago

Here's what straight exporting to markdown (i.e. org-pandoc-export-to-markdown) with the following options in the org file gets you:

Options:

#+OPTIONS: author:nil
#+BIBLIOGRAPHY: example.bib
#+PANDOC_EXTENSIONS: markdown-citations 
#+PANDOC_OPTIONS: atx-headers:t
#+NOCITE: @giovanelli2016; @eilan2016

This actually seems pretty close to what you were looking for..

---
bibliography: 'example.bib'
date: 'July 14, 2018'
nocite:
- '@giovanelli2016; @eilan2016'
pandoc_extensions: 'markdown-citations'
pandoc_options: 'atx-headers:t'
title: Org Pandoc Citation Test Example
---

# Test Header

Here is a test example file with an in-text citation where someone
important says something important (e.g. Loncar (2016)). And here is
another bit of blah with a footnote citation.[^1]

Note that the setting `PANDOC_OPTIONS` allows one to pass command line
settings to pandoc via `ox-pandoc`. In this case, the setting passed
converts citations into markdown.

Note also the `NOCITE` comment in the header, which allows citations to
appear in the references section that have not actually appeared in the
body of the text.

# References {#references .unnumbered}

::: {#refs .references}
::: {#ref-eilan2016}
Eilan, Naomi. 2016. "You Me and the World." *Analysis* 76 (3): 311--24.
:::

::: {#ref-giovanelli2016}
Giovanelli, Marco. 2016. "\"\...But I Still Can't Get Rid of a Sense of
Artificiality\" the Reichenbach--Einstein Debate on the Geometrization
of the Electromagnetic Field." *Studies in History and Philosophy of
Science* 54: 35--51.
:::

::: {#ref-loncar2016}
Loncar, Samuel. 2016. "Why Listen to Philosophers? A Constructive
Critique of Disciplinary Philosophy." *Metaphilosophy* 47 (1): 3--25.
:::

::: {#ref-thompson2016}
Thompson, Morgan, Toni Adleberg, Sam Sims, and Eddy Nahmias. 2016. "Why
Do Women Leave Philosophy? Surveying Students at the Introductory
Level."
:::
:::

[^1]: See (Thompson et al. 2016).

kaushalmodi commented 6 years ago

Great! markdown-citations was the key!

Try this:

#+hugo_base_dir: .

#+title: Org Pandoc Citation Test Example

#+export_file_name: org-citation-example-source

#+hugo_front_matter_format: yaml
#+hugo_custom_front_matter: :bibliography "example.bib"
#+hugo_custom_front_matter: :nocite "@giovanelli2016, @eilan2016"

* Test Header
Here is a test example file with an in-text citation where someone
important says something important (e.g. @loncar2016). And here is
another bit of blah with a footnote citation.[fn:1]
* References
* Footnotes
[fn:1] See [@thompson2016].

Instructions

Convert this Org file to Markdown

kaushalmodi commented 6 years ago

If this works, I need to work on these to make it work with C-c C-e H H in one step:

What you lose:

mclearc commented 6 years ago

Yeah - those steps work for me. Though you'd then need to presumably move the pandoc converted file to the right directory and delete the symlink? Here's the output:


---
author:
- Colin McLear
bibliography: 'example.bib'
draft: False
nocite: '@giovanelli2016, @eilan2016'
title: Org Pandoc Citation Test Example
---

## Test Header

Here is a test example file with an in-text citation where someone
important says something important (e.g. Loncar (2016)). And here is
another bit of blah with a footnote citation.[^1]

## References

## Footnotes {#footnotes .unnumbered}

::: {#refs .references}
::: {#ref-eilan2016}
Eilan, Naomi. 2016. "You Me and the World." *Analysis* 76 (3): 311--24.
:::

::: {#ref-giovanelli2016}
Giovanelli, Marco. 2016. "\"\...But I Still Can't Get Rid of a Sense of
Artificiality\" the Reichenbach--Einstein Debate on the Geometrization
of the Electromagnetic Field." *Studies in History and Philosophy of
Science* 54: 35--51.
:::

::: {#ref-loncar2016}
Loncar, Samuel. 2016. "Why Listen to Philosophers? A Constructive
Critique of Disciplinary Philosophy." *Metaphilosophy* 47 (1): 3--25.
:::

::: {#ref-thompson2016}
Thompson, Morgan, Toni Adleberg, Sam Sims, and Eddy Nahmias. 2016. "Why
Do Women Leave Philosophy? Surveying Students at the Introductory
Level."
:::
:::

[^1]: See (Thompson et al. 2016).

kaushalmodi commented 6 years ago

> Though you'd then need to presumably move the pandoc converted file to the right directory and delete the symlink?

The copying/symlinking is needed only in the current "prototype" :) See my checklist in https://github.com/kaushalmodi/ox-hugo/issues/175#issuecomment-405987796.

mclearc commented 6 years ago

Ah - right :) I don't see the "losses" as at all a problem. Thanks again for your work! 🎉

kaushalmodi commented 6 years ago

After all the manual edits in that checklist, I get:

---
author:
- Kaushal Modi
bibliography: 'example.bib'
draft: False
nocite: '@giovanelli2016, @eilan2016'
title: Org Pandoc Citation Test Example
---

## Test Header

Here is a test example file with an in-text citation where someone
important says something important (e.g. Loncar (2016)). And here is
another bit of blah with a footnote citation.[^1]

## References {#references}

<a id="ref-eilan2016"></a>
Eilan, Naomi. 2016. "You Me and the World." *Analysis* 76 (3): 311--24.

<a id="ref-giovanelli2016"></a>
Giovanelli, Marco. 2016. "\"\...But I Still Can't Get Rid of a Sense of
Artificiality\" the Reichenbach--Einstein Debate on the Geometrization
of the Electromagnetic Field." *Studies in History and Philosophy of
Science* 54: 35--51.

<a id="ref-loncar2016"></a>
Loncar, Samuel. 2016. "Why Listen to Philosophers? A Constructive
Critique of Disciplinary Philosophy." *Metaphilosophy* 47 (1): 3--25.

<a id="ref-thompson2016"></a>
Thompson, Morgan, Toni Adleberg, Sam Sims, and Eddy Nahmias. 2016. "Why
Do Women Leave Philosophy? Surveying Students at the Introductory
Level."

[^1]: See (Thompson et al. 2016).

which is rendered by Hugo as:

image
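The central manual edit visible in the two outputs, turning pandoc's fenced divs (`::: {#ref-...}`) into plain HTML anchors, can be sketched as below. This is an illustration under assumptions, not the code that ended up in ox-hugo; the function name and regexes are hypothetical, and it does not handle the other manual steps (such as moving the entries under the References heading).

```python
import re

# Opening fence such as `::: {#ref-eilan2016}` or `::: {#refs .references}`.
DIV_OPEN = re.compile(r"^:::+\s*\{#([^\s}]+)[^}]*\}\s*$")
# Closing fence: a line made only of colons.
DIV_CLOSE = re.compile(r"^:::+\s*$")


def divs_to_anchors(markdown):
    """Replace per-reference fenced divs with <a id="..."></a> anchors."""
    out = []
    for line in markdown.splitlines():
        opened = DIV_OPEN.match(line)
        if opened:
            div_id = opened.group(1)
            # Keep an anchor for each reference entry; drop the outer
            # `#refs` wrapper entirely.
            if div_id.startswith("ref-"):
                out.append(f'<a id="{div_id}"></a>')
            continue
        if DIV_CLOSE.match(line):
            continue
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "\n".join([
        "::: {#refs .references}",
        "::: {#ref-eilan2016}",
        "Eilan, Naomi. 2016.",
        ":::",
        ":::",
    ])
    print(divs_to_anchors(sample))
```

Plain anchors are used because the fenced-div syntax is a pandoc extension that Hugo's Markdown renderer of the time would have shown as literal text.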

mclearc commented 6 years ago

perfect!

kaushalmodi commented 6 years ago

Interesting! Pandoc removed only the redundant heading IDs.

Before pandoc run

...
## Test Header {#test-header}

Here is a test example file with an in-text citation where someone
important says something important (e.g. @loncar2016). And here is
another bit of blah with a footnote citation.[^fn:1]

See [Some section](#abc).

## Some section {#abc}

## References {#references}
...

After pandoc run

...
## Test Header

Here is a test example file with an in-text citation where someone
important says something important (e.g. Loncar (2016)). And here is
another bit of blah with a footnote citation.[^1]

See [Some section](#abc).

## Some section {#abc}

## References {#references .unnumbered}
...

Diff

1c1
< ## Test Header {#test-header}
---
> ## Test Header
4,5c4,5
< important says something important (e.g. @loncar2016). And here is
< another bit of blah with a footnote citation.[^fn:1]
---
> important says something important (e.g. Loncar (2016)). And here is
> another bit of blah with a footnote citation.[^1]
9d8
<
12,13c11
<
< ## References {#references}
---
> ## References {#references .unnumbered}

So the CUSTOM_ID's are unharmed!
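The same check can be scripted. The sketch below uses a hypothetical `heading_ids` helper (not part of ox-hugo) to collect the `{#id}` attributes on headings before and after the pandoc run and report which ones were dropped. On the excerpt above, only the ID that matches what pandoc would presumably derive from the title itself goes missing.

```python
import re

# Matches an ATX heading carrying an attribute block like `{#abc .class}`.
HEADING_ID = re.compile(r"^#+\s+.*\{#([^\s}]+)[^}]*\}\s*$", re.MULTILINE)


def heading_ids(markdown):
    """Return the set of explicit heading IDs found in a Markdown string."""
    return set(HEADING_ID.findall(markdown))


BEFORE = """## Test Header {#test-header}
## Some section {#abc}
## References {#references}
"""

AFTER = """## Test Header
## Some section {#abc}
## References {#references .unnumbered}
"""

if __name__ == "__main__":
    dropped = heading_ids(BEFORE) - heading_ids(AFTER)
    print(sorted(dropped))
```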

kaushalmodi commented 6 years ago

@mclearc Just one note.. I see that your markdown output has ## Footnotes {#footnotes .unnumbered}. I wonder why that is, because it does not appear in my ox-hugo exports. Maybe we can resolve that in a separate issue once this one is resolved.

mclearc commented 6 years ago

hmm – not sure if that is something to do with my export settings or not but seems like it must be if you're not seeing it. I'll try and test it later today.

kaushalmodi commented 6 years ago

> not sure if that is something to do with my export settings

Probably not.. but you probably tweaked the org-footnote-section variable? Mine is set to the default value "Footnotes". If so, this is a non-issue. You just need to put your Org footnotes under a top-level heading whose title matches the value of org-footnote-section.

mclearc commented 6 years ago

yeah - it is currently set to nil - I like to keep footnotes with their sections. So setting org-footnote-section to t probably avoids the issue.

kaushalmodi commented 6 years ago

@mclearc I thought this implementation would be solid (working on https://github.com/kaushalmodi/ox-hugo/tree/pandoc-citeproc-support).. but found a show stopper.. Pandoc is transforming the ox-hugo generated Hugo/Blackfriday compatible tables into something else.

So this feature has a big caveat: You cannot use Org tables in your posts with Pandoc citation parsing enabled.


I have updated my checklist in https://github.com/kaushalmodi/ox-hugo/issues/175#issuecomment-405987796.

kaushalmodi commented 6 years ago

@mclearc Do you want to try out ox-hugo from pandoc-citeproc-support branch?

I believe it is almost good to go except for the caveat:

You also don't need to set the front-matter format to YAML; keeping with TOML would work.

mclearc commented 6 years ago

Thanks for your work. I'll give a look and report back. I don't use tables very often, so the lack of conversion there won't be an issue for me.

kaushalmodi commented 6 years ago

@mclearc And the final known issue about tables is also now solved! :D

Please update that branch and stress-test it!

Example Org:

* Pandoc Citations                                         :pandoc:citations:
:PROPERTIES:
:EXPORT_HUGO_PANDOC_CITATIONS: t
:EXPORT_BIBLIOGRAPHY: bib/bib1.bib, bib/bib2.bib
:EXPORT_HUGO_CUSTOM_FRONT_MATTER: :nocite '(@giovanelli2016 @eilan2016)
:END:
** Citations Example (TOML)                                            :toml:
:PROPERTIES:
:EXPORT_FILE_NAME: citations-example-toml
:END:
#+begin_description
Test the parsing of Pandoc Citations, while also testing that ox-hugo
exported Markdown doesn't get broken -- TOML front-matter.
#+end_description
*** Section 1
Here is a test example file with an in-text citation where someone
important says something important (e.g. @loncar2016). And here is
another bit of blah with a footnote citation.[fn:5]

See [[#citation-example-section-2]].
*** Section 2
:PROPERTIES:
:CUSTOM_ID: citation-example-section-2
:END:
Content in section 2.
*** Testing tables
|----------+----------+----------|
| Header 1 | Header 2 | Header 3 |
|----------+----------+----------|
| a        | b        | c        |
| d        | e        | f        |
|----------+----------+----------|

* Footnotes
[fn:5] See [@thompson2016].

With the .bib files in place, all you should need to do is C-c C-e H H.

mclearc commented 6 years ago

From what I can see this works great! The only issue I've seen is that if a page happens not to have citations but has :EXPORT_HUGO_PANDOC_CITATIONS: t, the entire set of metadata disappears. It probably won't often be an issue, but if you happen to have a subtree without citations it might be a little weird. But thanks for the work! 🎊

kaushalmodi commented 6 years ago

> The only issue I've seen is if a page happens not to have citations, but has :EXPORT_HUGO_PANDOC_CITATIONS: t the entire set of metadata disappears.

Interesting! I happened to think of that exact use case a few minutes ago and have already fixed it :D Do you want to update that branch and retry?

mclearc commented 6 years ago

Looks fixed! 😄

kaushalmodi commented 6 years ago

This issue got auto-closed. Feel free to re-open it if you face any issues.

The pandoc test branch has now been merged into master, so you will also get this change via the next MELPA update.

kaushalmodi commented 6 years ago

Pandoc Citations - ox-hugo Documentation

mclearc commented 6 years ago

Looks great, thanks so much for your work!