import os

import nest_asyncio
from llama_parse import LlamaParse


def convert_pdf_to_markdown(file_name):
    # Necessary for running async code in notebooks or scripts
    nest_asyncio.apply()
    # Initialize the LlamaParse parser
    parser = LlamaParse(
        api_key=llamaindex_api_key,  # API key variable, defined elsewhere (not shown here)
        result_type="markdown",  # Choose "markdown" as the output format
        verbose=True,  # Enable verbose output to see detailed logs
    )
    # Define the path to your PDF file
    pdf_file_path = os.path.join("./papers/", file_name)
    print(pdf_file_path, "type:", type(pdf_file_path))
    # Convert the PDF to Markdown
    # This is a synchronous call; asynchronous calls are also possible, as shown in the documentation
    documents = parser.load_data(pdf_file_path)
    # Return the converted documents
    return documents
Terminal:
./papers/Retrieval-Augmented_Generation_for_Knowledge-Intensive_NLP_Tasks.pdf type: <class 'str'>
Started parsing the file under job_id 6b58df53-9d59-4286-8e88-427e3e93d956
An error occurred while updating paper rowid 77: Error binding parameter 0 - probably unsupported type.
./papers/RA-DIT_Retrieval-Augmented_Dual_Instruction_Tuning.pdf type: <class 'str'>
Started parsing the file under job_id fe3b8c19-0248-431a-a497-30c119a0694e
An error occurred while updating paper rowid 79: Error binding parameter 0 - probably unsupported type.
./papers/ColBERT_Efficient_and_Effective_Passage_Search_via_Contextualized_Late
__Interaction_over_BERT.pdf type: <class 'str'>
Started parsing the file under job_id c97fc58f-9e15-4e38-819e-90a0b140bfb8
An error occurred while updating paper rowid 80: Error binding parameter 0 - probably unsupported type.
./papers/Lost_in_the_Middle_How_Language_Models_Use_Long_Contexts.pdf type: <class 'str'>
Started parsing the file under job_id 16b4aee0-25a4-4f7f-926d-4d96a779567b
An error occurred while updating paper rowid 81: Error binding parameter 0 - probably unsupported type.
./papers/Enhancing_Recommender_Systems_with_Large_Language_Model_Reasoning_Graphs.pdf type: <class 'str'>
Started parsing the file under job_id a35e8ce8-cae6-4885-8e59-fcfecea99f59
An error occurred while updating paper rowid 84: Error binding parameter 0 - probably unsupported type.
./papers/LIMA_Less_Is_More_for_Alignment.pdf type: <class 'str'>
Started parsing the file under job_id 51a97a81-db5b-4432-951d-429a403620dc
An error occurred while updating paper rowid 85: Error binding parameter 0 - probably unsupported type.
./papers/Retrieval-Augmented_Generation_for_Knowledge-Intensive_NLP_Tasks.pdf type: <class 'str'>
Started parsing the file under job_id 1298bad1-3537-40f8-9563-641d53ef852c
An error occurred while updating paper rowid 92: Error binding parameter 0 - probably unsupported type.
./papers/Retrieval-Augmented_Generation_for_Large_Language_Models_A_Survey.pdf type: <class 'str'>
Started parsing the file under job_id 36e20f5d-2dd9-466d-bf6d-4c136d684ef9
An error occurred while updating paper rowid 102: Error binding parameter 0 - probably unsupported type.
Hi there, for some reason some of my PDFs are not processed; I get an error instead.
Reference code: the body of def convert_pdf_to_markdown(file_name) is shown at the top, followed by the terminal output.
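One observation that may help narrow this down: the text "Error binding parameter 0 - probably unsupported type" looks like a message from Python's sqlite3 module, not from LlamaParse. The code that updates the paper row is not shown in this post, so the following is only a sketch under assumptions: it supposes the list returned by parser.load_data is bound directly as the first parameter of an UPDATE. The table name papers, the column markdown, and the FakeDocument class (a stand-in for a parsed document with a .text attribute) are hypothetical and exist only for illustration.

```python
import sqlite3


class FakeDocument:
    """Hypothetical stand-in for a parsed document exposing a .text attribute."""

    def __init__(self, text):
        self.text = text


# In-memory database with a hypothetical table and one row to update
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (markdown TEXT)")
conn.execute("INSERT INTO papers (markdown) VALUES (NULL)")

# Pretend this is what the conversion function returned
documents = [FakeDocument("# Title"), FakeDocument("Body text")]

# Binding the list itself: sqlite3 only accepts None, int, float, str, and
# bytes (or adapted types), so the first parameter cannot be bound
try:
    conn.execute("UPDATE papers SET markdown = ? WHERE rowid = ?", (documents, 1))
    bind_error = None
except sqlite3.Error as exc:
    bind_error = exc

# Joining the text of each document into one str gives a bindable value
markdown_text = "\n\n".join(doc.text for doc in documents)
conn.execute("UPDATE papers SET markdown = ? WHERE rowid = ?", (markdown_text, 1))
stored = conn.execute("SELECT markdown FROM papers WHERE rowid = 1").fetchone()[0]

print("binding the list failed with:", repr(bind_error))
print("stored after joining text:", repr(stored))
```

The exact exception class and wording differ between Python versions, which is why the sketch catches the sqlite3.Error base class. If the real update code binds something other than the returned list, this sketch does not apply.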