Make Zotero effective for us LaTeX holdouts
How can I generate a bib file of only the items referenced in my document? #2164

zackbatist commented 2 years ago

I have a file that references my Zotero library via citekeys generated by Better BibTeX. I now have to go through and update the references' metadata, and it would be much easier if I could collate them all into a collection automatically. I just found out that Reference Extractor does this pretty well for Word and LibreOffice documents, but it will not work for documents that have been converted to Word or LibreOffice using pandoc, since they have no inherent connection with Zotero.

How can I generate a bib file of only the items referenced in my document?

I asked about this on the Zotero forums and was encouraged to ask here.

retorquere commented 2 years ago

The aux scanner does just this: it creates a collection with the items you cited, and you can export that collection to a bib file.
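
As a rough illustration of what scanning an aux file for cited items involves, here is a minimal Python sketch. It is not Better BibTeX's actual implementation; the function name `citekeys_from_aux` is hypothetical, and it assumes the aux file records citations as `\citation{...}` lines (classic BibTeX) or `\abx@aux@cite{...}` lines (biblatex).

```python
import re

# \citation{a,b} is written by classic BibTeX-style runs;
# \abx@aux@cite{key} or \abx@aux@cite{0}{key} is written by biblatex.
CITATION_RE = re.compile(r"\\citation\{([^}]*)\}")
BIBLATEX_RE = re.compile(r"\\abx@aux@cite(?:\{\d+\})?\{([^}]*)\}")


def citekeys_from_aux(aux_text: str) -> list[str]:
    """Return the citekeys found in aux file text, in order of first appearance."""
    keys: list[str] = []
    for line in aux_text.splitlines():
        for pattern in (CITATION_RE, BIBLATEX_RE):
            for match in pattern.finditer(line):
                # One \citation entry may hold several comma-separated keys.
                for key in match.group(1).split(","):
                    key = key.strip()
                    # "*" comes from \nocite{*} and is not a real citekey.
                    if key and key != "*" and key not in keys:
                        keys.append(key)
    return keys
```

If this returns an empty list for a given aux file, the file simply carries no citation entries, and there is nothing for a scanner to pick up.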

zackbatist commented 2 years ago

Thank you! I am working with LaTeX only indirectly and do not have an aux file generated automatically. Do I have to create it manually, or is there a way to generate it automatically too? I cannot select a markdown file after right-clicking a collection and selecting "scan BibTeX aux/markdown file for references".

retorquere commented 2 years ago

In what sense are you working with it indirectly?

I wouldn't know how to create one manually. The only workflow I know generates it automatically.

zackbatist commented 2 years ago

I'm writing in markdown using VS Code, and have some LaTeX in the YAML front matter as a preamble. When I run pandoc to convert markdown to PDF, it passes the LaTeX commands through and incorporates them in the conversion. I tried converting the md to tex using pandoc and then compiling to generate the aux. The compile produced the PDF as I expected, but the aux does not look at all like the template on the aux-scanner support page. I got the same result compiling on Overleaf. Pasting the aux file below.

\@writefile{toc}{\contentsline {subsection}{\numberline {0.1}Overview and scope}{1}{subsection.0.1}\protected@file@percent }
\newlabel{overview-and-scope}{{0.1}{1}{Overview and scope}{subsection.0.1}{}}
\@writefile{toc}{\contentsline {subsection}{\numberline {0.2}Archaeological data}{3}{subsection.0.2}\protected@file@percent }
\newlabel{archaeological-data}{{0.2}{3}{Archaeological data}{subsection.0.2}{}}
\@writefile{toc}{\contentsline {subsection}{\numberline {0.3}Situating data in open science}{6}{subsection.0.3}\protected@file@percent }
\newlabel{situating-data-in-open-science}{{0.3}{6}{Situating data in open science}{subsection.0.3}{}}
\@writefile{toc}{\contentsline {subsection}{\numberline {0.4}Digital archaeology and data discourse}{9}{subsection.0.4}\protected@file@percent }
\newlabel{digital-archaeology-and-data-discourse}{{0.4}{9}{Digital archaeology and data discourse}{subsection.0.4}{}}
\@writefile{toc}{\contentsline {subsection}{\numberline {0.5}Pragmatist perspectives on archaeological data work}{12}{subsection.0.5}\protected@file@percent }
\newlabel{pragmatist-perspectives-on-archaeological-data-work}{{0.5}{12}{Pragmatist perspectives on archaeological data work}{subsection.0.5}{}}
\@writefile{toc}{\contentsline {subsection}{Bibliography}{15}{section*.2}\protected@file@percent }
\gdef \@abspage@last{22}

Here are the relevant parts of my YAML front matter that get passed into pandoc:

id: Friuu8sUgvN65JGr2YCDl
title: Chapter 2 - Background [DRAFT]
desc: ''
updated: 1653683238547
created: 1645129278017
author: Zack Batist
date: 'Friday May 27, 2022'
  - /Users/zackbatist/Dropbox/zotero/zack.bib
  - /Users/zackbatist/Dropbox/PhDThesis/Writing/chicago-author-date.csl
geometry: margin=1in
toc: true
toc-depth: 6
numbersections: TRUE
secnumdepth: 6
fontsize: 12pt
shift-heading-level-by: -1
header-includes: |
  \input{/Users/zackbatist/Dropbox/PhDThesis/Writing/references.tex} <-- not bibliographic references, primary sources from interview data that requires special formatting
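
Since the aux file above carries no citation entries, another route is to read the citekeys straight from the markdown source. The following is a minimal Python sketch under stated assumptions: the function name `citekeys_from_markdown` is hypothetical, it only recognizes pandoc-style `@key` citations, and it accepts a simplified set of key characters (letters, digits, `_`, `:`, `.`, `-`).

```python
import re

# Pandoc citations look like [@key], [@key, p. 3; @other], [-@key] or a bare @key.
# The lookbehind skips things like e-mail addresses (user@example.com).
PANDOC_CITE_RE = re.compile(r"(?<![\w.])@([A-Za-z0-9_][\w:.-]*)")


def citekeys_from_markdown(md_text: str) -> list[str]:
    """Return pandoc-style citekeys in order of first appearance."""
    keys: list[str] = []
    for match in PANDOC_CITE_RE.finditer(md_text):
        # Drop sentence punctuation that directly follows a bare @key.
        key = match.group(1).rstrip(".:-")
        if key and key not in keys:
            keys.append(key)
    return keys
```

This is a heuristic: an `@` inside code blocks or the YAML front matter would also be picked up, so the result is worth a quick manual check.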

zackbatist commented 2 years ago

I added the --biblatex flag to my pandoc md to tex conversion prior to compiling the pdf and I got something closer to the desired aux format, which was much easier to manually edit using find/replace in my text editor. So consider this resolved in my own specific case.

retorquere commented 2 years ago

I'd prefer it if no manual editing were necessary. The original aux you posted didn't seem to have citekeys? And what manual changes did you need to make to the other one?

zackbatist commented 2 years ago

Export is pasted below. The list of citations is punctuated by the sections in which they appear. Also worth mentioning: the md to tex conversion with the --biblatex flag enabled made citations look like \autocite[289]{huggett2022} instead of the normal \cite. Excluding --biblatex but including --citeproc preformats the citations, so they appear in the tex file as Huggett (2022), according to the citation style.

\@writefile{toc}{\boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax }
\@writefile{lof}{\boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax }
\@writefile{lot}{\boolfalse {citerequest}\boolfalse {citetracker}\boolfalse {pagetracker}\boolfalse {backtracker}\relax }
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {0.1}Overview and scope}{1}{subsection.0.1}\protected@file@percent }
\newlabel{overview-and-scope}{{0.1}{1}{Overview and scope}{subsection.0.1}{}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {0.2}Archaeological data}{4}{subsection.0.2}\protected@file@percent }
\newlabel{archaeological-data}{{0.2}{4}{Archaeological data}{subsection.0.2}{}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {0.3}Situating data in open science}{9}{subsection.0.3}\protected@file@percent }
\newlabel{situating-data-in-open-science}{{0.3}{9}{Situating data in open science}{subsection.0.3}{}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {0.4}Digital archaeology and data discourse}{14}{subsection.0.4}\protected@file@percent }
\newlabel{digital-archaeology-and-data-discourse}{{0.4}{14}{Digital archaeology and data discourse}{subsection.0.4}{}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {0.5}Pragmatist perspectives on archaeological data work}{19}{subsection.0.5}\protected@file@percent }
\newlabel{pragmatist-perspectives-on-archaeological-data-work}{{0.5}{19}{Pragmatist perspectives on archaeological data work}{subsection.0.5}{}}
\@writefile{toc}{\defcounter {refsection}{0}\relax }\@writefile{toc}{\contentsline {subsection}{\numberline {0.6}Bibliography}{24}{subsection.0.6}\protected@file@percent }
\gdef \@abspage@last{24}
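
For a tex file produced with --biblatex, the citekeys can also be collected from the citation commands themselves, such as the \autocite[289]{huggett2022} form mentioned above. Below is a minimal Python sketch; the function name `citekeys_from_tex` is hypothetical, and the pattern is an assumption covering common `\...cite...` commands with up to two optional bracketed arguments, including multi-cite forms like `\autocites[3]{a}[4]{b}`.

```python
import re

# A cite command (\cite, \autocite, \textcite, \autocites, ...) followed by
# one or more "[pre][post]{keys}" argument groups.
CITE_CMD_RE = re.compile(
    r"\\[A-Za-z]*cite[A-Za-z]*\*?((?:(?:\[[^\]]*\]){0,2}\{[^}]*\})+)"
)
# One argument group: optional bracketed notes, then the braced key list.
CITE_ARG_RE = re.compile(r"(?:\[[^\]]*\]){0,2}\{([^}]*)\}")


def citekeys_from_tex(tex_text: str) -> list[str]:
    """Return citekeys used in cite commands, in order of first appearance."""
    keys: list[str] = []
    for command in CITE_CMD_RE.finditer(tex_text):
        for arg in CITE_ARG_RE.finditer(command.group(1)):
            for key in arg.group(1).split(","):
                key = key.strip()
                # "*" comes from \nocite{*} and is not a real citekey.
                if key and key != "*" and key not in keys:
                    keys.append(key)
    return keys
```

The pattern is deliberately loose, so a braced group that directly follows a cite command without being a key list would be misread as one.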

retorquere commented 2 years ago

A build will drop here shortly that imports the aux file, but I don't see any citekeys in the aux file you posted.

retorquere commented 2 years ago

I'd love to know whether this fixed your problem.

zackbatist commented 2 years ago

Yes it did, thanks!