Closed: fawkesley closed this 11 years ago
Think you need to add pdftables to the test requirements file, assuming it's on PyPI.
Sorry you might need to rebase since I merged #81. I'm interested in @domoritz's opinion on this one :)
My opinion is that you should never, ever change the history of something in the main repo (not even on a branch). Better to create a new PR. However, I'm for rebasing on external branches or private branches because this keeps the history cleaner.
I meant opinion on the feature, not on rebasing on their private branch ;)
Ahh. IMHO, parsing tables in PDFs is super difficult but would be really awesome. As long as someone who just wants simple CSV parsing does not have to install pdfminer and everything, I am for this feature.
@rossjones We talked about this before: I think we should move the requirements that are only important for certain features to a requirements.text file.
@domoritz Agreed on it being super difficult. We'll stick to this approach of PDF support being optional.
I agree, as long as it is only the optional requirements rather than the core ones I am all for it.
Also @paulfurley don't forget the changelog ;)
I'll get pdftables working on Python 2.6 now and I'll give you a shout once I've rebased and updated the changelog :)
OK, tests passing and rebased, think we're good to go :) @rossjones
We've been exploring different options for parsing PDFs. Currently we're using an (alpha) in-house library called pdftables (we blogged about it here).
This pull request integrates pdftables into messytables. It is an optional requirement - if pdftables is not installed, messytables will work as usual and the PDF tests will be skipped.
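For anyone unfamiliar with the optional-dependency pattern described above, here is a minimal sketch of how it could look. This is not the code from this pull request: `optional_import`, `HAVE_PDF`, `supported_formats` and the test class are hypothetical names made up for illustration. Only the idea (try the import, fall back gracefully, skip the PDF tests) comes from the description.

```python
import importlib
import unittest


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# pdftables is the optional dependency; everything else here is hypothetical.
pdftables = optional_import("pdftables")
HAVE_PDF = pdftables is not None


def supported_formats():
    """Hypothetical helper: list the formats usable in this environment."""
    formats = ["csv"]
    if HAVE_PDF:
        formats.append("pdf")
    return formats


class PDFTableTest(unittest.TestCase):
    # Skipped (not failed) when pdftables is missing.
    @unittest.skipUnless(HAVE_PDF, "pdftables is not installed")
    def test_pdf_support_advertised(self):
        self.assertIn("pdf", supported_formats())
```

With this shape, a user who only needs CSV parsing never has to install pdftables or pdfminer, and the test run reports the PDF tests as skipped rather than failing.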
We're looking into other ways of extracting tables from PDFs, but either way we'll need the messytables integration.