ropensci / qpdf

Split, Combine and Compress PDF files
https://docs.ropensci.org/qpdf

Remove PDF pages while retaining the bookmarks #22

Open danielvartan opened 11 months ago

danielvartan commented 11 months ago

Hi,

Awesome package!

Is there a way to remove PDF pages using qpdf::pdf_subset() without losing all the bookmarks?

m-holger commented 5 months ago

Not with the way cpp_pdf_split, the function used by qpdf::pdf_subset, is currently written. cpp_pdf_split copies the requested pages to a new, empty PDF file, which loses all bookmarks, links, etc. To preserve bookmarks, cpp_pdf_split would instead need to remove the pages that were not requested from the input file. This would be relatively trivial to implement.
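
To make the "remove any pages not requested" idea concrete, here is a minimal sketch in plain C++ (no qpdf calls) of working out which pages would have to go. The helper name `pages_to_remove` and its signature are hypothetical and not part of the qpdf package; it only illustrates the bookkeeping, assuming 1-based page numbers as in the R interface.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: given the page count and the 1-based pages to keep,
// return the 1-based pages that would have to be removed, in ascending order.
// Entries in `keep` outside 1..n_pages are ignored.
std::vector<int> pages_to_remove(int n_pages, const std::vector<int>& keep) {
  std::set<int> keep_set(keep.begin(), keep.end());
  std::vector<int> remove;
  for (int page = 1; page <= n_pages; ++page) {
    if (keep_set.count(page) == 0) {
      remove.push_back(page);
    }
  }
  return remove;
}
```

Removing pages in place this way leaves the remaining pages in their original order, so it would not cover reordering or repeating pages.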

m-holger commented 5 months ago

Sample implementation:

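// Keep only the pages listed in `which` (1-based) by editing the input
// document in place rather than copying pages into a new, empty file.
Rcpp::CharacterVector cpp_pdf_select(char const* infile, char const* outfile,
                                     Rcpp::IntegerVector which, char const* password) {
  QPDF inpdf;
  read_pdf_with_password(infile, password, &inpdf);
  QPDFPageDocumentHelper in_pdh(inpdf);
  // Take a copy of the page list before modifying the page tree.
  std::vector<QPDFPageObjectHelper> pages = in_pdh.getAllPages();
  // Detach every page from the page tree.
  for (auto const& page : pages) {
    in_pdh.removePage(page);
  }
  // Re-add the requested pages in the requested order.
  for (int i = 0; i < which.size(); i++) {
    int index = which.at(i) - 1; // convert 1-based page number to zero-based index
    in_pdh.addPage(pages.at(index), false); // false: append at the end
  }
  QPDFWriter outpdfw(inpdf, outfile);
  outpdfw.setStaticID(true); // for testing only
  outpdfw.setStreamDataMode(qpdf_s_preserve);
  outpdfw.write();
  return outfile;
}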
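
The sample relies on `pages.at(index)` throwing `std::out_of_range` when a page number is invalid, which gives a fairly generic error. As a possible refinement, the page numbers could be validated before the document is touched. The sketch below uses only the standard library; `to_zero_based` is a hypothetical helper, not an existing function in the package.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: convert 1-based page numbers to zero-based indices,
// throwing std::out_of_range if any page number falls outside 1..n_pages.
std::vector<std::size_t> to_zero_based(const std::vector<int>& which,
                                       std::size_t n_pages) {
  std::vector<std::size_t> indices;
  indices.reserve(which.size());
  for (int page : which) {
    if (page < 1 || static_cast<std::size_t>(page) > n_pages) {
      throw std::out_of_range("page " + std::to_string(page) +
                              " is outside 1.." + std::to_string(n_pages));
    }
    indices.push_back(static_cast<std::size_t>(page - 1));
  }
  return indices;
}
```

Order and duplicates in the input are preserved, so the result can be passed straight to a re-add loop like the one in the sample.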