In the above command, /path/to/pdf_directory is a folder containing PDFs. /figure/image/output/prefix is the string prepended to the extracted figure image names, such as fig1-1.png, which follow a scheme like fig{page}-{number} for identification.
/figure/data/output/prefix is the prefix added to the JSON files that describe each figure or table extracted from a PDF.
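To make the prefix behaviour concrete, here is a minimal sketch of how an image path would be assembled if the prefix is simply prepended to the fig{page}-{number}.png name, as described above. The function name figure_image_path is hypothetical and is not part of pdffigures2. The naming scheme is taken from the description here, not checked against the tool's source.

```shell
#!/bin/sh
# Hypothetical helper (not part of pdffigures2): builds an image path by
# prepending the prefix to a name of the form fig{page}-{number}.png,
# following the scheme described above.
figure_image_path() {
  prefix="$1"   # e.g. /figure/image/output/prefix
  page="$2"     # page part of the figure name
  number="$3"   # running number part of the figure name
  printf '%s%s\n' "$prefix" "fig${page}-${number}.png"
}

# The prefix is used as a plain string, so no "/" is inserted automatically.
figure_image_path "/figure/image/output/prefix" 1 1
```

Under that assumption, a prefix that should act as a directory needs a trailing slash, otherwise its last path component becomes part of the file name.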
This issue is a precursor to the wiki page for this method, which I will write later.
Steps:
singularity build pdffigures2.sif docker://ghcr.io/devinbayly/pdffigures2
singularity shell pdffigures2.sif
To use the sbt program that was installed, we need to source this installed file:
source "/root/.sdkman/bin/sdkman-init.sh"
git clone https://github.com/allenai/pdffigures2.git
cd pdffigures2
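The sourcing step in the list above can be wrapped in a small guard so it is safe to repeat. This is only a sketch: ensure_sbt and the SDKMAN_INIT variable are hypothetical names introduced here, and the default path is the one given in the steps. It assumes the init file puts sbt on the PATH once sourced.

```shell
#!/bin/sh
# Path of the init file from the steps above; override SDKMAN_INIT if it differs.
SDKMAN_INIT="${SDKMAN_INIT:-/root/.sdkman/bin/sdkman-init.sh}"

# Hypothetical helper: make sure sbt is callable in the current shell.
ensure_sbt() {
  # Nothing to do if sbt is already available.
  if command -v sbt >/dev/null 2>&1; then
    return 0
  fi
  # Otherwise source the init file, if it exists.
  if [ -f "$SDKMAN_INIT" ]; then
    . "$SDKMAN_INIT"
  fi
  # Report whether sbt is available now.
  if command -v sbt >/dev/null 2>&1; then
    return 0
  fi
  echo "sbt not found; check $SDKMAN_INIT" >&2
  return 1
}
```

Inside the container shell, running ensure_sbt before any sbt command gives a clear error message if the init file is missing, instead of a bare "command not found".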