modesty / pdf2json

converts binary PDF to JSON and text, for server-side PDF processing and command-line use.
https://github.com/modesty/pdf2json
Other
1.97k stars 376 forks

Is there a future path for pdf2json? #44

Open shobhitg opened 9 years ago

shobhitg commented 9 years ago

First of all this is a really great project, and there is none like it.

But I can't help but notice that the files copied from PDF.js are 2 years old and aging.

In the last two years, a lot of work has been done on PDF.js: https://github.com/mozilla/pdf.js/commits/master/src/core

If this project is to survive and flourish, it needs a strategy for staying up to date with PDF.js.

I tried to do this myself, but I am failing horribly.

Is it possible to make use of the deliverable (the combined build file) in the pdfjs-dist project? https://github.com/mozilla/pdfjs-dist/tree/master/build

Let's discuss ideas around this, even if we don't have sure-shot solutions.

mwaschkowski commented 8 years ago

Modesty, are you still watching this project? I would also like to know whether you have any plans for this.

shobhitg commented 8 years ago

I am confident that this project can become community-driven, but only once we have a way to build on the latest pdf.js. I think basing it on https://github.com/mozilla/pdfjs-dist is our best bet.

Someone familiar with pdf.js will have to help with this, because there are now too many browser-specific pieces to deal with, such as web workers.
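As a rough illustration of the pdfjs-dist idea, here is a minimal sketch (not code from pdf2json) that goes through the public pdf.js API rather than copied core files. The function name `extractTextRuns` and the output shape are made up for illustration, and `pdfjsLib` is passed in by the caller because the right pdfjs-dist entry point and any worker setup depend on the version and environment.

```javascript
// Hypothetical helper: collect text runs per page through the pdf.js API.
// `pdfjsLib` is assumed to be the module object exported by a pdfjs-dist build;
// `data` is assumed to be a Uint8Array holding the raw PDF bytes.
async function extractTextRuns(pdfjsLib, data) {
  // getDocument returns a loading task; its `promise` resolves to the document.
  const doc = await pdfjsLib.getDocument({ data }).promise;
  const pages = [];
  // pdf.js page numbers are 1-based.
  for (let n = 1; n <= doc.numPages; n++) {
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    pages.push({
      page: n,
      texts: content.items.map((item) => ({
        text: item.str,
        // transform[4] and transform[5] hold the translation of the text run.
        x: item.transform[4],
        y: item.transform[5],
      })),
    });
  }
  return pages;
}
```

In Node, `pdfjsLib` would be whatever loading the pdfjs-dist build yields, and `data` the file's bytes as a `Uint8Array`. Mapping these runs onto pdf2json's JSON output would still be a separate step.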

mwaschkowski commented 8 years ago

Hi Shobhit,

Modesty doesn't seem to be here right now. I posted this issue about a month ago:

https://github.com/modesty/pdf2json/issues/45

but no response.

Would you have any idea why the field positions are off, or where I could go in the code to fix it?

Thanks

Mark

shobhitg commented 8 years ago

@mwaschkowski Let's continue this tangential discussion in #45.

wanghaisheng commented 7 years ago

@shobhitg What progress have you made since then? Did you abandon this one, or find a way to upgrade the whole pdf.js core to the latest version?

rainabba commented 7 years ago

?