okfn / messytables

Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py
http://messytables.readthedocs.io/

Externalize PDF support? #137

Closed: pudo closed this issue 5 years ago

pudo commented 9 years ago

Hi all, I only came back to messytables today after a good while away (see #136). It strikes me that PDF parsing support is a deeply absurd thing to try to do in a utility library like messytables.

I've now thrown half a dozen table-containing PDFs at the PDF tool, and none of them gave me a useful result.

On the other hand, having wacky features like this makes messytables feel bloated and heavyweight.

I wonder if there would be a strong argument against building an entry_points mechanism and then putting the PDF parser in its own module? That would then also manage dependencies properly, instead of soft-failing upon call.
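
To make the entry_points idea concrete, here is a rough sketch of how plugin discovery could look using the standard library's `importlib.metadata`. The group name `messytables.parsers`, the package name `messytables_pdf`, and the helpers `discover_parsers` and `get_parser` are all made up for illustration; none of this is existing messytables API.

```python
from importlib.metadata import entry_points

# Hypothetical entry point group; messytables does not define this today.
PLUGIN_GROUP = "messytables.parsers"

# A separate (hypothetical) plugin package would register itself in its own
# setup.py, and declare its PDF dependencies there, for example:
#
#   setup(
#       name="messytables_pdf",
#       entry_points={"messytables.parsers": ["pdf = messytables_pdf:PDFTableSet"]},
#   )


def discover_parsers(group=PLUGIN_GROUP):
    """Return a {name: entry_point} mapping for installed parser plugins."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


def get_parser(name, group=PLUGIN_GROUP):
    """Load the parser registered under `name`, or fail with a clear error."""
    parsers = discover_parsers(group)
    if name not in parsers:
        raise LookupError(
            f"No {name!r} parser is installed; "
            "install the plugin package that provides it"
        )
    # load() imports the plugin module, so its dependencies are only
    # needed when the plugin is actually installed and requested.
    return parsers[name].load()
```

With something like this, the core library would never import PDF dependencies itself. A missing plugin shows up as an explicit `LookupError` at lookup time instead of a soft failure deep inside a parsing call.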

davidread commented 5 years ago

@pudo Agreed. Finally got around to doing this. Can you or @StevenMaude review/approve? https://github.com/okfn/messytables/pull/186

davidread commented 5 years ago

Removed in #186