run-llama / llama_parse

Parse files for optimal RAG
https://www.llamaindex.ai
MIT License

Error while parsing the PDF file: Failed to parse the PDF file: Internal Server Error #30

Closed. SmileLollipop closed this issue 1 week ago

MeTaNoV commented 4 months ago

very descriptive! :)

anoopshrma commented 4 months ago

Hi @SmileLollipop, there could be an issue on the LlamaCloud side. Could you try again and check whether you are still facing the same issue?

httplups commented 4 months ago

I am facing the same issue:

Error while parsing the PDF file: Failed to parse the PDF file: {"detail":[{"loc":["body","language",0],"msg":"value is not a valid enumeration member; permitted: 'af', 'az', 'bs', 'cs', 'cy', 'da', 'de', 'en', 'es', 'et', 'fr', 'ga', 'hr', 'hu', 'id', 'is', 'it', 'ku', 'la', 'lt', 'lv', 'mi', 'ms', 'mt', 'nl', 'no', 'oc', 'pi', 'pl', 'pt', 'ro', 'rs_latin', 'sk', 'sl', 'sq', 'sv', 'sw', 'tl', 'tr', 'uz', 'vi', 'ar', 'fa', 'ug', 'ur', 'bn', 'as', 'mni', 'ru', 'rs_cyrillic', 'be', 'bg', 'uk', 'mn', 'abq', 'ady', 'kbd', 'ava', 'dar', 'inh', 'che', 'lbe', 'lez', 'tab', 'tjk', 'hi', 'mr', 'ne', 'bh', 'mai', 'ang', 'bho', 'mah', 'sck', 'new', 'gom', 'sa', 'bgc', 'th', 'ch_sim', 'ch_tra', 'ja', 'ko', 'ta', 'te', 'kn'","type":"type_error.enum","ctx":{"enum_values":["af","az","bs","cs","cy","da","de","en","es","et","fr","ga","hr","hu","id","is","it","ku","la","lt","lv","mi","ms","mt","nl","no","oc","pi","pl","pt","ro","rs_latin","sk","sl","sq","sv","sw","tl","tr","uz","vi","ar","fa","ug","ur","bn","as","mni","ru","rs_cyrillic","be","bg","uk","mn","abq","ady","kbd","ava","dar","inh","che","lbe","lez","tab","tjk","hi","mr","ne","bh","mai","ang","bho","mah","sck","new","gom","sa","bgc","th","ch_sim","ch_tra","ja","ko","ta","te","kn"]}}]}

I got this error when I tried to load a document with LlamaParse using the load_data function:

import os

from llama_parse import LlamaParse  # pip install llama-parse

parser = LlamaParse(
    api_key=os.environ["LLAMA_CLOUD_API_KEY"],  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
)

documents = parser.load_data("example.pdf")

The PDF is fine; I have loaded it with other tools.

I am using the following versions:

llama-index                              0.10.15
llama-index-agent-openai                 0.1.5
llama-index-cli                          0.1.7
llama-index-core                         0.10.15
llama-index-embeddings-openai            0.1.6
llama-index-indices-managed-llama-cloud  0.1.3
llama-index-legacy                       0.9.48
llama-index-llms-openai                  0.1.7
llama-index-multi-modal-llms-openai      0.1.4
llama-index-program-openai               0.1.4
llama-index-question-gen-openai          0.1.3
llama-index-readers-file                 0.1.6
llama-index-readers-llama-parse          0.1.3
llama-index-vector-stores-chroma         0.1.5
llama-parse                              0.3.5
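
The error quoted above carries a JSON payload with a "detail" list, which looks like a standard request-validation response. As a rough sketch only, the payload could be condensed into something readable like this. The helper name summarize_validation_error and the PREFIX constant are hypothetical and are not part of llama-parse; the code only assumes the message shape shown in this thread.

```python
import json

# Text that precedes the server response in the messages quoted in this thread.
PREFIX = "Failed to parse the PDF file: "


def summarize_validation_error(message: str) -> list[str]:
    """Hypothetical helper: condense the JSON "detail" payload into short lines."""
    # Everything after the last occurrence of the prefix is expected to be JSON.
    _, _, payload = message.rpartition(PREFIX)
    try:
        detail = json.loads(payload)["detail"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Not a structured payload, e.g. a plain "Internal Server Error".
        return [message]
    if not isinstance(detail, list):
        return [message]

    lines = []
    for item in detail:
        if not isinstance(item, dict):
            continue
        # "loc" is the path to the rejected field, e.g. ["body", "language", 0].
        field = ".".join(str(part) for part in item.get("loc", []))
        permitted = item.get("ctx", {}).get("enum_values", [])
        kind = item.get("type", "unknown")
        lines.append(f"{field}: {kind} ({len(permitted)} permitted values)")
    return lines
```

Applied to the message above, this would report the field path body.language.0. The trailing 0 suggests that the server received the language as a list and rejected its first element, although that is an inference from the payload rather than something confirmed in this thread.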

MeTaNoV commented 4 months ago

parser = LlamaParse(
    api_key=os.environ["LLAMA_CLOUD_API_KEY"],  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown"  # "markdown" and "text" are available
)

As a workaround, try adding a language, e.g. language='en', as a parameter to the LlamaParse call so that a valid language value is set.
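
For illustration, the workaround might look like the sketch below. The helpers build_parser_kwargs and make_parser and the PERMITTED_LANGUAGES set are hypothetical names introduced here; the set is only a small subset of the codes listed in the error message, and the language argument is the one suggested in this comment.

```python
import os

# A small subset of the language codes that the error message lists as permitted.
PERMITTED_LANGUAGES = {"en", "es", "fr", "de", "pt", "it", "ja", "ko", "ch_sim", "ch_tra"}


def build_parser_kwargs(language: str = "en", result_type: str = "markdown") -> dict:
    """Hypothetical helper: assemble LlamaParse arguments with an explicit language."""
    if language not in PERMITTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    return {"result_type": result_type, "language": language}


def make_parser(language: str = "en"):
    """Create a parser with the language set explicitly (the workaround)."""
    # Imported lazily so build_parser_kwargs can be used without the package installed.
    from llama_parse import LlamaParse  # pip install llama-parse

    return LlamaParse(
        api_key=os.environ["LLAMA_CLOUD_API_KEY"],
        **build_parser_kwargs(language),
    )
```

With this in place, make_parser("en").load_data("example.pdf") would correspond to the original call with the language set explicitly. Checking the code locally first means a typo in the language code fails before any request is sent.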

httplups commented 4 months ago

Thank you. It worked!

dominicdev commented 2 weeks ago

I'm getting this error. I tried adding the language but I'm still getting the same error. Is there a fix?

anoopshrma commented 2 weeks ago

What is your llama-parse library version? I would suggest upgrading the library with pip install -U llama-parse and then trying again with the language provided.
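
To answer the version question without reading through pip list output, something along these lines may help. It uses only the standard library; installed_version and version_tuple are hypothetical helper names, and the simple comparison assumes plain dotted numeric versions such as the ones quoted in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text: str) -> tuple:
    """Convert '0.4.6' into (0, 4, 6); non-numeric parts are ignored."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


# Report what is installed in the current environment.
print("llama-parse:", installed_version("llama-parse") or "not installed")
```

Running it again after pip install -U llama-parse confirms whether the upgrade actually took effect in the environment the script runs in.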

nikky78 commented 4 days ago

I have llama-parse 0.4.6 and get the same error.