run-llama / llama_parse

Parse files for optimal RAG
https://www.llamaindex.ai
MIT License
2.98k stars 283 forks

Issue with Llamaparse - Incompatibility with Certain Excel Documents #470

Closed paulo16 closed 2 hours ago

paulo16 commented 3 hours ago

Describe the bug: I’m experiencing an issue with LlamaParse when trying to process specific Excel files. Each time I attempt to parse these documents, I receive an error message similar to the one below:

    Started parsing the file under job_id 6a04d858-34ae-4eb6-a11f-18a4886049e5
    Error while parsing the file '/content/drive/MyDrive/data/test_pdf/reports--54859-BB-attach-8bc01879-8bd2-4f2c-97b3-e08337a56ebf.xlsx': Job ID: 6a04d858-34ae-4eb6-a11f-18a4886049e5 failed with status: ERROR, Error code: UNSUPPORTED_FILE_TYPE, Error message: Unsupported file type.

Files: The file is provided as an attachment: reports--54859-BB-attach-8bc01879-8bd2-4f2c-97b3-e08337a56ebf.xlsx

Job ID: 6a04d858-34ae-4eb6-a11f-18a4886049e5

Client: I have tested 3 clients:

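Additional context: In a notebook, using the Python client, my script is:

    import nest_asyncio
    from llama_parse import LlamaParse

    # Allow nested event loops, which notebooks need for the async client.
    nest_asyncio.apply()

    parser = LlamaParse(api_key="xxxx", result_type="markdown", language="fr", verbose=True)

    # Parse the workbook and return the JSON result.
    json_objs_excel2 = parser.get_json_result(
        "/content/drive/MyDrive/data/test_pdf/reports--54859-BB-attach-8bc01879-8bd2-4f2c-97b3-e08337a56ebf.xlsx"
    )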
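When the service answers UNSUPPORTED_FILE_TYPE, it can help to first rule out a local cause by confirming that the file is structurally an .xlsx package before uploading it. The sketch below is only an illustration and not part of LlamaParse: `looks_like_xlsx` is a hypothetical helper that uses the standard library to check that the file is a ZIP archive containing `[Content_Types].xml` and at least one entry under `xl/`, which is the conventional layout of an Excel workbook. It says nothing about what the service itself accepts.

```python
import zipfile


def looks_like_xlsx(path):
    """Return True if the file at `path` has the basic layout of an .xlsx package.

    Hypothetical helper for local diagnosis only; it does not call LlamaParse.
    """
    # An .xlsx file is a ZIP container; a legacy .xls renamed to .xlsx is not.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        names = set(zf.namelist())
    # Office Open XML packages carry a content-types part, and Excel
    # workbooks conventionally keep their parts under "xl/".
    has_content_types = "[Content_Types].xml" in names
    has_workbook_parts = any(name.startswith("xl/") for name in names)
    return has_content_types and has_workbook_parts
```

If this returns True for the attached workbook, the file itself is probably well-formed and the failure is more likely on the service side.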

hexapode commented 2 hours ago

Hi! It should work now. We had an issue with some Excel files that contain long cell content.
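For anyone who wants to check whether a workbook contains unusually long cell text, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the fix that was deployed: `long_shared_strings` is a hypothetical helper, the `threshold` of 1000 characters is arbitrary (the length that triggered the bug is not stated in this thread), and it only inspects the shared strings table at `xl/sharedStrings.xml`, so inline strings stored directly in sheet XML are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of SpreadsheetML elements inside an .xlsx package.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def long_shared_strings(path, threshold=1000):
    """Return (index, length) pairs for shared strings longer than `threshold`.

    Hypothetical helper; `threshold` is an arbitrary illustrative value.
    """
    with zipfile.ZipFile(path) as zf:
        if "xl/sharedStrings.xml" not in zf.namelist():
            return []  # workbook has no shared strings table
        root = ET.fromstring(zf.read("xl/sharedStrings.xml"))

    hits = []
    for index, item in enumerate(root.iter(f"{NS}si")):
        # A string item holds its text in one or more <t> elements
        # (several when the cell uses rich-text runs).
        text = "".join(t.text or "" for t in item.iter(f"{NS}t"))
        if len(text) > threshold:
            hits.append((index, len(text)))
    return hits
```

An empty list means no shared string exceeded the chosen threshold; a non-empty list gives the position of each long string in the shared strings table and its length.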

paulo16 commented 2 hours ago

Thank you for your responsiveness; indeed, the bug has been fixed.