rsc / pdf

PDF reader
BSD 3-Clause "New" or "Revised" License

Support DCTDecodeFilter, which is no-op, actually #15

Open fawick opened 7 years ago

fawick commented 7 years ago

DCTDecode means that the stream data is a raw JPEG, which can be read right away without any further filter decoding.

cf. https://blog.idrsolutions.com/2011/07/extract-raw-jpeg-images-from-a-pdf-file/
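As a rough illustration of why the filter can be a no-op, here is a minimal sketch using only the Go standard library. The names `dctDecodeFilter`, `sampleJPEG`, and `roundTrip` are hypothetical and are not part of this package's API. The in-memory JPEG stands in for the bytes of a stream whose /Filter is /DCTDecode.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/jpeg"
	"io"
)

// dctDecodeFilter is a hypothetical stand-in for a PDF stream filter.
// For DCTDecode it does nothing: the stream bytes are already JPEG data.
func dctDecodeFilter(r io.Reader) io.Reader {
	return r
}

// sampleJPEG builds a small JPEG in memory. It stands in for the raw
// contents of an image stream with /Filter /DCTDecode (an assumption
// made for this sketch; a real stream would come from the PDF file).
func sampleJPEG(w, h int) ([]byte, error) {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, img, nil); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// roundTrip passes the sample stream through the no-op filter and hands
// the result straight to image/jpeg, returning the decoded dimensions.
func roundTrip(w, h int) (int, int, error) {
	stream, err := sampleJPEG(w, h)
	if err != nil {
		return 0, 0, err
	}
	img, err := jpeg.Decode(dctDecodeFilter(bytes.NewReader(stream)))
	if err != nil {
		return 0, 0, err
	}
	b := img.Bounds()
	return b.Dx(), b.Dy(), nil
}

func main() {
	w, h, err := roundTrip(4, 3)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("decoded %dx%d JPEG\n", w, h)
}
```

A caller that only wants to extract the image could equally write the unfiltered bytes to a `.jpg` file. Colour-space details from the PDF image dictionary are not handled in this sketch.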

mpl commented 7 years ago

I'd suggest adding a tiny PDF in a testdata dir, with such an embedded image, plus a decoding test, but I see the lib does not have any tests at all, so maybe that's not wanted...

mpl commented 7 years ago

Well, @bradfitz suggested a test too, so maybe that change can also be the one that introduces the first test.

mpl commented 7 years ago

@josharian can we humbly ping you to have a look at this PR please?

josharian commented 7 years ago

Tests are definitely welcome. I looked at the change, and it seems fine to me, but I really know ~0 about PDFs, and I'm not at a point now where I want to learn enough to take ownership of this repo. Sorry.

fawick commented 7 years ago

Okay, I will supply a test tonight.

@bradfitz, @mpl Given all the open PRs, is it worth considering forking this package and maintaining it under go4.org?

mpl commented 7 years ago

@fawick to answer your question about forking: given that nobody seems willing to own that repo, yes, it will probably come to that. I doubt it'll be on go4.org though. As far as Camlistore is concerned, simply vendoring github.com/fawick/pdf (which is in effect similar to what you were proposing initially) is probably the way to go.

Let's finish the review on here first though.