run-llama / llama_parse

Parse files for optimal RAG
https://www.llamaindex.ai
MIT License
2.74k stars, 263 forks

Error while parsing the PDF file: #59

Closed: willkhoza closed this issue 2 months ago

willkhoza commented 7 months ago

Error while parsing the PDF file: Failed to parse the PDF file: {"detail":[{"loc":["body","language",0],"msg":"value is not a valid enumeration member; permitted: 'af', 'az', 'bs', 'cs', 'cy', 'da', 'de', 'en', 'es', 'et', 'fr', 'ga', 'hr', 'hu', 'id', 'is', 'it', 'ku', 'la', 'lt', 'lv', 'mi', 'ms', 'mt', 'nl', 'no', 'oc', 'pi', 'pl', 'pt', 'ro', 'rs_latin', 'sk', 'sl', 'sq', 'sv', 'sw', 'tl', 'tr', 'uz', 'vi', 'ar', 'fa', 'ug', 'ur', 'bn', 'as', 'mni', 'ru', 'rs_cyrillic', 'be', 'bg', 'uk', 'mn', 'abq', 'ady', 'kbd', 'ava', 'dar', 'inh', 'che', 'lbe', 'lez', 'tab', 'tjk', 'hi', 'mr', 'ne', 'bh', 'mai', 'ang', 'bho', 'mah', 'sck', 'new', 'gom', 'sa', 'bgc', 'th', 'ch_sim', 'ch_tra', 'ja', 'ko', 'ta', 'te', 'kn'","type":"type_error.enum","ctx":{"enum_values":["af","az","bs","cs","cy","da","de","en","es","et","fr","ga","hr","hu","id","is","it","ku","la","lt","lv","mi","ms","mt","nl","no","oc","pi","pl","pt","ro","rs_latin","sk","sl","sq","sv","sw","tl","tr","uz","vi","ar","fa","ug","ur","bn","as","mni","ru","rs_cyrillic","be","bg","uk","mn","abq","ady","kbd","ava","dar","inh","che","lbe","lez","tab","tjk","hi","mr","ne","bh","mai","ang","bho","mah","sck","new","gom","sa","bgc","th","ch_sim","ch_tra","ja","ko","ta","te","kn"]}}]}

MeTaNoV commented 7 months ago

As a workaround, try passing an explicit language, such as language='en', to the LlamaParse call so that a valid language value is sent.
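For anyone who wants to catch a bad value before the request is sent, here is a minimal sketch of that workaround. The set of codes is copied from the "permitted" list in the error message above. `check_language` and `build_parser` are hypothetical helper names used only for illustration, and the sketch assumes that `LlamaParse` accepts `api_key` and `language` keyword arguments, as the suggestion above implies.

```python
# Language codes copied from the "permitted" list in the error message above.
PERMITTED_LANGUAGES = frozenset(
    """
    af az bs cs cy da de en es et fr ga hr hu id is it ku la lt lv mi ms mt
    nl no oc pi pl pt ro rs_latin sk sl sq sv sw tl tr uz vi ar fa ug ur bn
    as mni ru rs_cyrillic be bg uk mn abq ady kbd ava dar inh che lbe lez
    tab tjk hi mr ne bh mai ang bho mah sck new gom sa bgc th ch_sim ch_tra
    ja ko ta te kn
    """.split()
)


def check_language(language: str = "en") -> str:
    """Hypothetical helper: return the code if it is in the permitted list."""
    if language not in PERMITTED_LANGUAGES:
        raise ValueError(
            f"{language!r} is not a permitted language code; "
            f"expected one of: {', '.join(sorted(PERMITTED_LANGUAGES))}"
        )
    return language


def build_parser(api_key: str, language: str = "en"):
    """Hypothetical helper: create a parser with an explicit language."""
    # Imported lazily so check_language works without llama_parse installed.
    from llama_parse import LlamaParse

    # Assumption: LlamaParse takes `api_key` and `language` keyword arguments.
    return LlamaParse(api_key=api_key, language=check_language(language))
```

With this in place, `build_parser(api_key, language="en")` mirrors the suggested call, while a typo such as `language="english"` fails locally with a readable message instead of the long server-side validation error.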

hexapode commented 7 months ago

This was fixed in https://github.com/run-llama/llama_parse/pull/60

If you update your llama_parse package, it should now work as expected.
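To confirm which version is installed before and after upgrading, something like the following may help. It is only a sketch: `installed_version` is a hypothetical helper, and the distribution name `llama-parse` is an assumption about how the package is published. The thread does not say which release contains the fix, so the snippet reports the version without comparing it to anything.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "llama-parse") -> Optional[str]:
    """Hypothetical helper: return the installed version, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("llama-parse does not appear to be installed")
    else:
        print(f"llama-parse {version} is installed")
```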