ad-freiburg / pdfact

A basic tool that extracts the structure from the PDF files of scientific articles.
Apache License 2.0

Optionally add page break info to output #6

Closed: hannahbast closed 3 years ago

hannahbast commented 3 years ago

pdftotext has a ^L (\x0c = form feed) in its TXT output at each page break.

It would be good (and easy) to have an option that also adds this to the TXT output.

I am aware that one can get that info from the XML or JSON output. But it's convenient if one can also get it in the TXT output.

But it should be optional.
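
To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is not pdfact code: the helper names `join_pages` and `split_pages` and the `with_page_breaks` flag are made up for illustration, and I am assuming the form feed goes between pages, after the newline that ends each page.

```python
FORM_FEED = "\x0c"  # the ^L character that pdftotext writes at a page break


def join_pages(pages, with_page_breaks=False):
    """Join per-page text into one TXT string.

    Hypothetical helper: with_page_breaks=True inserts a form feed
    between consecutive pages, otherwise pages are joined by a newline only.
    """
    separator = "\n" + FORM_FEED if with_page_breaks else "\n"
    return separator.join(pages)


def split_pages(text):
    """Recover the per-page chunks from TXT output that contains form feeds."""
    return text.split(FORM_FEED)
```

With the flag off, the output stays as it is today. With the flag on, a consumer can get the pages back with a plain split on `\x0c`, the same way one would with pdftotext output.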

ckorzen commented 3 years ago

Implemented in 061ee11.