pk-lit closed this issue 1 day ago
Thanks for the report, @pk-lit!
Hi @pk-lit, are you using the latest versions of the unstructured (0.14.5) and unstructured-inference (0.7.34) libraries? I did not get those errors with those versions.
$ pip install unstructured -U
$ pip install unstructured-inference -U
def extract_text_by_page(pdf_path):
    """Extracts text from each page of a PDF file using unstructured.io."""
    document = partition_pdf(pdf_path)
    pages_text = [page.page_content for page in document.pages]
    return pages_text
results in this error:
This is solved by specifying a strategy as a parameter. It would be good to add an error message to this effect, rather than leaving users to go down a weird env var rabbit hole.
cheers!
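For anyone landing here later, a minimal sketch of what "specifying a strategy" might look like. The thread does not say which strategy was used, so `strategy="fast"` is an assumption (the other documented values are `"auto"`, `"hi_res"`, and `"ocr_only"`). The helper name `group_text_by_page` is hypothetical. Note that `partition_pdf` returns a flat list of elements rather than an object with `.pages`, so the per-page text here is rebuilt from each element's `metadata.page_number`.

```python
from collections import defaultdict


def group_text_by_page(elements):
    """Group element text by page number, keeping element order within a page."""
    pages = defaultdict(list)
    for element in elements:
        # Assumption: elements lacking page metadata are filed under page 1.
        page_number = getattr(element.metadata, "page_number", None) or 1
        if element.text:
            pages[page_number].append(element.text)
    # Return a plain dict keyed by page number, in ascending page order.
    return {number: "\n".join(pages[number]) for number in sorted(pages)}


def extract_text_by_page(pdf_path, strategy="fast"):
    """Partition a PDF with an explicit strategy and return {page_number: text}."""
    # Imported lazily so the grouping helper works without unstructured installed.
    from unstructured.partition.pdf import partition_pdf

    elements = partition_pdf(filename=pdf_path, strategy=strategy)
    return group_text_by_page(elements)
```

Pages that yield no text elements simply do not appear in the result, which is why this returns a dict keyed by page number instead of a list.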