jgm / pandoc

Universal markup converter
https://pandoc.org

org reader fails to parse org-ref citations when used outside of Emacs #8044

Open aminevsaziz opened 2 years ago

aminevsaziz commented 2 years ago

The issue arises when I insert an org-ref style citation in an Org file and then process the file outside of Emacs using pandoc. I expected the citation to be processed using CSL, or at least to be recognized as a citation. However, pandoc recognizes the citation as a link. If I instead use the org-ref-export/body function with the CSL option, it renders the citation as text only. See the test below. This is the test.org file:

#+csl-style: elsevier-harvard.csl
#+PANDOC_OPTIONS: bibliography:zotlib.bib
#+BIBLIOGRAPHY: zotlib.bib

[[cite:&nam-2011]]

* Bibliography
bibliographystyle:unsrtnat
bibliography:zotlib.bib

This is the zotlib.bib file:

@inproceedings{nam-2011,
  title = {Smart City as Urban Innovation: Focusing on Management, Policy, and Context},
  shorttitle = {Smart City as Urban Innovation},
  booktitle = {Proceedings of the 5th {{International Conference}} on {{Theory}} and {{Practice}} of {{Electronic Governance}} - {{ICEGOV}} '11},
  author = {Nam, Taewoo and Pardo, Theresa A.},
  year = {2011},
  pages = {185},
  publisher = {{ACM Press}},
  address = {{Tallinn, Estonia}},
  doi = {10.1145/2072069.2072100},
  abstract = {This paper sees a smart city not as a status of how smart a city is but as a city's effort to make itself smart. The connotation of a smart city represents city innovation in management and policy as well as technology. Since the unique context of each city shapes the technological, organizational and policy aspects of that city, a smart city can be considered a contextualized interplay among technological innovation, managerial and organizational innovation, and policy innovation. However, only little research discusses innovation in management and policy while the literature of technology innovation is abundant. This paper aims to fill the research gap by building a comprehensive framework to view the smart city movement as innovation comprised of technology, management and policy. We also discuss inevitable risks from innovation, strategies to innovate while avoiding risks, and contexts underlying innovation and risks.},
  isbn = {978-1-4503-0746-8},
  langid = {english},
  file = {/home/julia/dotfiles/org/literature/nam_2011.pdf}
}

And this is the command I used:

pandoc -f org -t native --citeproc --csl=elsevier-harvard.csl --bibliography=zotlib.bib --standalone test.org

The native output was:

[ RawBlock (Format "org") "#+csl-style: elsevier-harvard.csl"
, RawBlock (Format "org") "#+PANDOC_OPTIONS: bibliography:zotlib.bib"
, RawBlock (Format "org") "#+BIBLIOGRAPHY: zotlib.bib"
, Para
    [ Link
        ( "" , [] , [] )
        [ Str "cite:&nam-2011" ]
        ( "cite:&nam-2011" , "" )
    ]
, Header 1 ( "bibliography" , [] , [] ) [ Str "Bibliography" ]
, Para
    [ Str "bibliographystyle:unsrtnat"
    , SoftBreak
    , Str "bibliography:zotlib.bib"
    ]
]

Hint: I checked the inline reader's parser for org-ref style citations at https://github.com/jgm/pandoc/blob/394fa9d0727a30f540d9c36ccfa68fc942cad587/src/Text/Pandoc/Readers/Org/Inlines.hs#L357 and it expects "][" after the key instead of "]]".

Pandoc version: pandoc 2.17 and 2.18, on Arch Linux.

jgm commented 2 years ago
-- | Read a link-like org-ref style citation.  The citation includes pre and
-- post text.  However, multiple citations are not possible due to limitations
-- in the syntax.                     
linkLikeOrgRefCite :: PandocMonad m => OrgParser m (F Citation)
linkLikeOrgRefCite = try $ do
  _    <- string "[["
  mode <- orgRefCiteMode
  key  <- orgRefCiteKey
  _    <- string "]["
  pre  <- trimInlinesF . mconcat <$> manyTill inline (try $ string "::")
  spc  <- option False (True <$ spaceChar)
  suf  <- trimInlinesF . mconcat <$> manyTill inline (try $ string "]]")

Looks like we're only allowing link-style citations which have the structure [[cite:foo][...morestuff]], which may not be correct. I don't use this org feature, so I'll let @tarleb comment further.
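To make the mismatch concrete, here is a rough Python approximation of the shape the excerpt above requires: "[[", a mode, a key, "][", a prefix ending in "::", a suffix, and a closing "]]". This is only a sketch and not pandoc's parser. The names `LINK_LIKE` and `matches_link_like` are hypothetical, and the mode and key patterns are assumptions, since `orgRefCiteMode` and `orgRefCiteKey` are not shown in the excerpt.

```python
import re

# Approximate shape taken from the excerpt:
#   "[[" mode key "][" prefix "::" suffix "]]"
# The mode and key patterns are assumptions; the real parsers are not shown.
LINK_LIKE = re.compile(
    r"\[\[(?P<mode>cite[a-z*]*):(?P<key>[^\s\[\]]+)"  # [[cite:key
    r"\]\[(?P<pre>.*?)::(?P<suf>.*?)\]\]"             # ][prefix::suffix]]
)


def matches_link_like(text: str) -> bool:
    """Return True if the whole string has the link-like citation shape."""
    return LINK_LIKE.fullmatch(text) is not None
```

Under these assumptions, `[[cite:foo][see::p. 2]]` has the required shape, while the reporter's `[[cite:&nam-2011]]` does not, because the key is followed by "]]" rather than "][".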

aminevsaziz commented 2 years ago

That's good news. I hope @tarleb has more insight into this issue.

tarleb commented 2 years ago

My knowledge about many org features is out of date; I'll have to read up on that. Help, especially in form of links to relevant docs, would be welcome.

aminevsaziz commented 2 years ago

> My knowledge about many org features is out of date; I'll have to read up on that. Help, especially in form of links to relevant docs, would be welcome.

That's good news ☺️. But don't you think adding support for [[cite:&key]] would be possible, since:

  1. org-cite only uses [cite:@key].
  2. There is already support for [[cite:key][something]].

Please check https://github.com/jkitchin/org-ref/blob/master/org-ref.org. Also, @jkitchin may be able to help; he is the author of the package.

jkitchin commented 2 years ago

You may want to set org-ref-cite-insert-version to 2, and just keep using the version 2 syntax that is supported. If not, it is easy enough to write a pre-processing hook to convert the version 3 syntax to the cite:key syntax.
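One way to approximate such a pre-processing step outside Emacs is a small text filter applied to the Org source before it reaches pandoc. The sketch below is an illustration only and is not part of org-ref or pandoc. The function name `v3_to_v2` is hypothetical; it assumes version 3 keys are written as `&key` and separated by `;`, and it only rewrites bracketed links that contain keys and nothing else.

```python
import re

# Matches a bracketed org-ref v3 link that holds only keys,
# e.g. [[cite:&nam-2011]] or [[cite:&a;&b]] (assumed v3 key syntax).
# Links with prefix or suffix text are deliberately not matched.
V3_LINK = re.compile(r"\[\[(cite[a-z]*):(&[^\s;\[\]]+(?:;&[^\s;\[\]]+)*)\]\]")


def v3_to_v2(text: str) -> str:
    """Rewrite [[cite:&a;&b]] as cite:a,b and leave all other text alone."""

    def repl(match: re.Match) -> str:
        # Drop the leading "&" from each key and join the keys with commas.
        keys = [key.lstrip("&") for key in match.group(2).split(";")]
        return f"{match.group(1)}:{','.join(keys)}"

    return V3_LINK.sub(repl, text)
```

Applied to the text of the test.org file from the report, this turns [[cite:&nam-2011]] into cite:nam-2011, and the result can then be passed to the same pandoc command. Ordinary links and citations that carry prefix or suffix text are left unchanged.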

aminevsaziz commented 2 years ago

> You may want to set org-ref-cite-insert-version to 2, and just keep using the version 2 syntax that is supported. If not, it is easy enough to write a pre-processing hook to convert the version 3 syntax to the cite:key syntax.

Thank you for sharing. However, my limited knowledge of Emacs makes that a bit hard to do. Could you share some suggestions or materials that would help me achieve this?

Update:

@jkitchin I tried v2, and the native output from pandoc (-t native) shows that org-ref citations are still recognized as links because of their [[...]] format.