cdqa-suite / cdQA

⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.
https://cdqa-suite.github.io/cdQA-website/
Apache License 2.0

Pdf converter showing Error #333

Open TobiKoledoye opened 4 years ago

TobiKoledoye commented 4 years ago

I get the following whenever I try to use the PDF converter: "Unexpected error: <class 'AttributeError'> Unable to process file 1q19-pr-12648.pdf"

I tried it with the examples and got the same error.

fmikaelian commented 4 years ago

Hi @TobiKoledoye

The PDF converter tutorial currently works; you can try it here.

Can you share the code you used and your pdf file so we can reproduce the bug?

suresh96458 commented 4 years ago

@fmikaelian I am facing the same issue. I have attached the PDF file, and the code I am running in my command prompt is shown below.

Even when the files are converted into a data frame, the output is not in a proper format. Can you help me with this?

JD1.pdf


>>> from cdqa.utils.converters import pdf_converter
>>> import tika
>>> df = pdf_converter(directory_path='/home/xxxx/Downloads/data/')
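
The message quoted in the original report shows only the exception class, not where it was raised, which makes the bug hard to reproduce. Below is a minimal sketch for getting a full traceback per file. It assumes, as the `import tika` above suggests, that text extraction goes through Tika. `diagnose_pdfs` and `tika_extract` are hypothetical helper names, not part of cdQA, and `tika_extract` is not necessarily how `pdf_converter` works internally.

```python
import os
import traceback


def diagnose_pdfs(directory_path, extract_text):
    """Run extract_text on every PDF in directory_path.

    Returns a dict mapping file name to a status string or a full traceback.
    extract_text is any callable that takes a file path and returns text.
    """
    report = {}
    for name in sorted(os.listdir(directory_path)):
        if not name.lower().endswith(".pdf"):
            continue  # skip non-PDF files
        path = os.path.join(directory_path, name)
        try:
            text = extract_text(path)
            if text is None:
                # Calling .strip() on None would raise AttributeError
                report[name] = "no text extracted (content is None)"
            else:
                report[name] = "ok: {} characters".format(len(text.strip()))
        except Exception:
            # Keep the whole traceback instead of only the exception class
            report[name] = traceback.format_exc()
    return report


def tika_extract(path):
    # Assumption: the tika package is installed and can start or reach
    # a Tika server (this needs Java on the machine).
    from tika import parser
    return parser.from_file(path)["content"]
```

To try it on the directory from the snippet above, call `diagnose_pdfs('/home/xxxx/Downloads/data/', tika_extract)` and print each entry. If a file reports that no text was extracted, one possible cause is a scanned, image-only PDF; a traceback instead points at the failing call.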