sergiocorreia / panflute

A Pythonic alternative to John MacFarlane's pandocfilters, with extra helper functions
http://scorreia.com/software/panflute/
BSD 3-Clause "New" or "Revised" License

Can't convert markdown style citations to LaTeX style using panflute filter #222

Closed: amine-aboufirass closed this issue 1 year ago

amine-aboufirass commented 2 years ago

I have the following files in the folder root:

root
    template.latex
    test.md
    write.py

The contents of template.latex are as follows:

\documentclass[a4paper, $fontsize$]{article}

\usepackage{cite}

\begin{document}

    $body$

    \clearpage
    \bibliography{bibliography-test}
    \bibliographystyle{abbrv}

\end{document}

The contents of test.md are as follows:

Referencing @an-online-resource and @another-online-resource.

The contents of write.py are as follows:

import panflute as pf
import os
from pathlib import Path

def action(elem, doc):
    if isinstance(elem, pf.elements.Citation):
        text = f"\\cite{{{elem.id}}}"
        return pf.Citation(text)

def main(doc=None):
    return pf.run_filter(action, doc = doc)

if __name__ == "__main__":
    main()

As you can see, I am trying to convert the individual Markdown citations to LaTeX-style citations. For instance, @online-resource would be converted to \cite{online-resource}. My expected result is as follows:

\documentclass[a4paper, $fontsize$]{article}

\usepackage{cite}

\begin{document}

    Referencing \cite{an-online-resource} and \cite{another-online-resource}.

    \clearpage
    \bibliography{bibliography-test}
    \bibliographystyle{abbrv}

\end{document}

When I run pandoc -F write.py --template template.latex -t plain -o test.tex test.md, I get the following in test.tex:

\documentclass[a4paper, ]{article}

\usepackage{cite}

\begin{document}

    Referencing @an-online-resource and @another-online-resource.

    \clearpage
    \bibliography{bibliography-test}
    \bibliographystyle{abbrv}

\end{document}

So the citations are not converted as I expect, and I'm not sure why. I tried returning the text in different API objects, such as pf.Str(text), pf.Inline(text) or pf.Citation(pf.Str(text)), but none of them worked.

Any help or pointers would be greatly appreciated.

tarleb commented 2 years ago

You don't need a filter for this. Call pandoc with --biblatex or --natbib.

amine-aboufirass commented 2 years ago

I'm not really a fan of the default implementation and would like to make my own. Part of pandoc's power is that it allows for extending/replacing functionality with one's own readers/writers/filters.

tarleb commented 2 years ago

Interesting. Can you say more about the perceived shortcomings of the default implementation?

amine-aboufirass commented 1 year ago

@tarleb I don't think it has shortcomings, I just like the freedom of being able to build my own converters...

ickc commented 1 year ago

The problem in your code comes from return pf.Citation(text). If you want to cook your own LaTeX, you shouldn't return a Citation. Try returning raw LaTeX instead, for example a raw LaTeX block.

Another problem is the command pandoc -F write.py --template template.latex -t plain -o test.tex test.md: it should use -t latex instead of -t plain.
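
To make the two points above concrete, here is a minimal sketch of a replacement for the action function in write.py. It rests on a few assumptions that are not stated in this thread. First, it matches pf.Cite (the inline element for a whole "@key" group) rather than the pf.Citation records stored inside it. Second, it returns a pf.RawInline rather than a raw block, because the citations here sit in the middle of a sentence. Third, cite_command is a hypothetical helper name introduced only for this example. The guarded import is there only so the string helper can be tried without panflute installed.

```python
try:
    import panflute as pf  # third-party: pip install panflute
except ImportError:  # keeps cite_command usable where panflute is absent
    pf = None


def cite_command(keys):
    """Build a LaTeX cite command for one or more citation keys.

    Hypothetical helper for this sketch; it is not part of panflute.
    """
    return "\\cite{" + ",".join(keys) + "}"


def action(elem, doc):
    # Assumption: match the Cite inline (the whole "@key" group),
    # not the Citation records it carries in elem.citations.
    if isinstance(elem, pf.Cite):
        keys = [citation.id for citation in elem.citations]
        # Raw LaTeX is emitted by the latex writer as-is.
        return pf.RawInline(cite_command(keys), format="latex")
    # Falling through returns None, which leaves other elements unchanged.
```

The main function and the __main__ guard from the original write.py can stay as they are. The filter would then be run with the corrected output format: pandoc -F write.py --template template.latex -t latex -o test.tex test.md. With -t plain, raw LaTeX would not be expected to appear in the output.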

It is OK to make your own converters, but frankly citeproc and its friends are widely used tools. They are certainly not perfect, but they have been battle-tested. It would be easier to first study what that tool has to offer (the pandoc manual documents this, but it is dense, so make sure you understand all aspects of it). Raising questions or pointing out its limitations would then be beneficial even if you want to roll your own. You can start with pandoc-discuss for that purpose.