I converted a few .text files into PDF files and am running the `pdf_converter` function on them to extract paragraphs into a data frame. The call fails, and I am not sure whether the problem is caused by Tika or by my files. As a sample, I am attaching two of the PDF files I am using, along with the error I get.
` $ df = pdf_converter(directory_path='/home/xxxx/Downloads/test/')
2020-03-12 11:43:42,470 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2020-03-12 11:43:47,476 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2020-03-12 11:43:52,480 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2020-03-12 11:43:57,486 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2020-03-12 11:43:57,489 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.
Unexpected error: <class 'RuntimeError'>
Unable to process file NetworkEngineer1.pdf`
NetworkEngineer1.pdf NetworkEngineer2.pdf
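To narrow down whether the problem is Tika or the files, a small check like the one below might help. It is only a sketch and rests on two assumptions: that the log comes from the `tika` Python package, which starts a local Java-based Tika server, and that a missing or unusable Java runtime is one possible reason for the server never confirming startup. The helper names (`java_on_path`, `java_version`, `parse_with_tika`) are made up for illustration. If the direct `parser.from_file` call fails in the same way, the files are probably not the cause.

```python
import shutil
import subprocess


def java_on_path():
    """Return True if a `java` executable can be found on PATH."""
    return shutil.which("java") is not None


def java_version():
    """Return the text printed by `java -version`, or None if java is missing."""
    if not java_on_path():
        return None
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    # `java -version` normally writes to stderr, so check that stream first.
    return (result.stderr or result.stdout).strip()


def parse_with_tika(pdf_path):
    """Parse one PDF through the tika package directly, bypassing pdf_converter."""
    # Imported lazily so the Java checks above work even without tika installed.
    from tika import parser

    parsed = parser.from_file(pdf_path)
    # 'content' can be None when no text could be extracted.
    return parsed.get("content")


if __name__ == "__main__":
    print("java found:", java_on_path())
    print(java_version())
    # Assumption: run from the directory that holds the attached sample file.
    text = parse_with_tika("NetworkEngineer1.pdf")
    print(text[:200] if text else "no text extracted")
```

If `java found:` prints `False`, the startup failure would most likely be unrelated to the PDFs themselves.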
@andrelmfarias