modesty / pdf2json

converts binary PDF to JSON and text, for server-side PDF processing and command-line use.
https://github.com/modesty/pdf2json
Other
2.01k stars 377 forks

Boxsets stays empty #69

Open F4Ke opened 8 years ago

F4Ke commented 8 years ago

Hi,

I tried pdf2json on three different PDFs that contain links to other websites.

But every time, Boxsets comes back empty.

This is my code:

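const PDFParser = require("pdf2json");

// pdf_path must hold the path to the PDF file to parse
const pdfParser = new PDFParser();

// Report parser errors
pdfParser.on("pdfParser_dataError", errData => console.error(errData.parserError));

// Once parsing finishes, print the Boxsets array of every page
pdfParser.on("pdfParser_dataReady", pdfData => {
  for (let i = 0; i < pdfData.formImage.Pages.length; i++) {
    console.log(pdfData.formImage.Pages[i].Boxsets); // why empty? Boxsets??
  }
});

pdfParser.loadPDF(pdf_path);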

This is my test PDF: http://www74.zippyshare.com/d/MzUNluNF/7310663/test1.pdf

When I print pdfData.formImage.Pages[i].Boxsets, it is always empty.

This is what I get:

{"Height":52.618,"HLines":[{"x":3.543,"y":10.757,"w":0.814,"l":1.529}],"VLines":[],"Fills":[{"x":0,"y":0,"w":0,"h":0,"clr":1},{"x":0,"y":-0.056,"w":37.25,"h":52.687,"clr":1}],"Texts":[{"x":3.313,"y":6.681,"w":17.597,"sw":null,"clr":0,"A":"left","R":[{"T":"TOTOTOTOTOTOOTOTOTOTOTOTOT","S":4,"TS":[0,14,0,0]}]},{"x":3.313,"y":9.931,"w":2.223,"sw":null,"clr":0,"A":"left","R":[{"T":"toto2","S":4,"TS":[0,14,0,0]}]}],"Fields":[],"Boxsets":[]}

Any idea why?

modesty commented 8 years ago

Boxsets are for radio buttons and checkboxes. If your PDF has them, could you upload it?
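
One quick way to check what a page contains is to count its array-valued entries. The sketch below is only an illustration: `summarizePage` is a hypothetical helper, not part of pdf2json, and the sample object is a trimmed copy of the output posted above. It needs no PDF file and no library.

```javascript
// Hypothetical helper (not part of pdf2json): for one parsed page object,
// return the number of items in each array-valued property.
function summarizePage(page) {
  const summary = {};
  for (const [key, value] of Object.entries(page)) {
    if (Array.isArray(value)) {
      summary[key] = value.length;
    }
  }
  return summary;
}

// Trimmed copy of the page object from the report above
const page = {
  Height: 52.618,
  HLines: [{ x: 3.543, y: 10.757, w: 0.814, l: 1.529 }],
  VLines: [],
  Fills: [
    { x: 0, y: 0, w: 0, h: 0, clr: 1 },
    { x: 0, y: -0.056, w: 37.25, h: 52.687, clr: 1 },
  ],
  Texts: [
    { x: 3.313, y: 6.681, R: [{ T: "TOTOTOTOTOTOOTOTOTOTOTOTOT" }] },
    { x: 3.313, y: 9.931, R: [{ T: "toto2" }] },
  ],
  Fields: [],
  Boxsets: [],
};

console.log(summarizePage(page));
// { HLines: 1, VLines: 0, Fills: 2, Texts: 2, Fields: 0, Boxsets: 0 }
```

For this page the summary shows two Texts and no Fields or Boxsets, which is consistent with a document that has no radio buttons or checkboxes.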