turicas / rows

A common, beautiful interface to tabular data, no matter the format
GNU Lesser General Public License v3.0

Could not find import_from_pdf function #292

Closed by marcellalves 6 years ago

marcellalves commented 6 years ago

I need to import data from pdf and found this example: https://gist.github.com/turicas/6b9ca83dcd531a6cd4fd87ced2a28c70

But I was unable to run it, since the import_from_pdf function is not available to me.

I have already run the command: pip install rows[all]

Is the PDF format no longer supported?

turicas commented 6 years ago

PDF support is not yet merged into the develop branch, so you need to install it from the feature/plugin-pdf git branch by running:

pip install -U git+https://github.com/turicas/rows.git@feature/plugin-pdf#egg=rows

You also need to install the PDF requirements:

pip install pdfminer.six cached-property

mmdfmateus commented 5 years ago

I just installed the pdf branch and still get the message "module 'rows' has no attribute 'import_from_pdf'"

Here is a snapshot of my pip freeze: [image]

What am I missing? Do I have to import it using a different name?

turicas commented 5 years ago

You also need to install two libraries: pdfminer.six and cached-property. Did you install them? The function import_from_pdf will be available if no ImportError is raised when importing rows/plugins/plugin_pdf.py.
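A quick way to check the condition described above is to see whether the two dependencies can be located before looking for the function. This is only a minimal sketch: the helper names `missing_modules` and `pdf_support_report` are made up for illustration, and it assumes that pdfminer.six provides the top-level module `pdfminer` and cached-property provides `cached_property`.

```python
import importlib.util

# Assumption: these are the top-level module names installed by
# pdfminer.six and cached-property, respectively.
PDF_PLUGIN_MODULES = ("pdfminer", "cached_property")


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


def pdf_support_report():
    """Describe, as a string, why import_from_pdf may or may not be available."""
    missing = missing_modules(PDF_PLUGIN_MODULES)
    if missing:
        return "missing modules: " + ", ".join(missing)
    if importlib.util.find_spec("rows") is None:
        return "rows is not installed"
    import rows  # imported only after confirming it can be found

    if hasattr(rows, "import_from_pdf"):
        return "rows.import_from_pdf is available"
    return "dependencies found, but rows has no import_from_pdf"


if __name__ == "__main__":
    print(pdf_support_report())
```

If the report lists missing modules, installing them with pip and restarting the interpreter should be enough for the function to show up, provided rows was installed from the PDF branch.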

mmdfmateus commented 5 years ago

@turicas thanks for answering so fast! hahahah I just installed these libraries and it worked perfectly with the example above in this post. But when I tried with this url I got the following error:

[image]

I also tried starts_after and ends_before to match the exact table in the PDF. But maybe it's just me being a newbie with the lib! :P Amazing work btw ahahahah