run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
35.39k stars, 4.98k forks

[Bug]: 500 Error when trying to create schema for LlamaExtract #15110

Open rsdrahat opened 1 month ago

rsdrahat commented 1 month ago

Bug Description

This was working before, but now creating a new schema for LlamaExtract with the Python library fails with a 500 error.

Version

llama-index-core 0.10.59

Steps to Reproduce

Run the code below with a valid API key in the .env file:


from pydantic import BaseModel, Field
from llama_extract import LlamaExtract

from dotenv import load_dotenv
import os
load_dotenv()

extractor = LlamaExtract()

class ResumeMetadata(BaseModel):
    """Resume metadata."""

    years_of_experience: int = Field(..., description="Number of years of work experience.")
    highest_degree: str = Field(..., description="Highest degree earned (options: High School, Bachelor's, Master's, Doctoral, Professional)")
    professional_summary: str = Field(..., description="A general summary of the candidate's experience")

extraction_schema = extractor.create_schema("Test Schema", ResumeMetadata)

Relevant Logs/Tracebacks

File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_cloud/resources/extraction/client.py", line 468, in create_schema
    _response_json = _response.json()
                     ^^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/httpx/_models.py", line 764, in json
    return jsonlib.loads(self.content, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rsdrahat/dev/llama-pdf-extract/extract.py", line 30, in <module>
    extraction_schema = extractor.create_schema("Test Pydantic 2", BankStatement)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_extract/base.py", line 274, in create_schema
    return asyncio_run(self.acreate_schema(name, data_schema, project_id))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_index/core/async_utils.py", line 33, in asyncio_run
    return loop.run_until_complete(coro)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_extract/base.py", line 261, in acreate_schema
    response = await self._async_client.extraction.create_schema(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_cloud/resources/extraction/client.py", line 470, in create_schema
    raise ApiError(status_code=_response.status_code, body=_response.text)
llama_cloud.core.api_error.ApiError: status_code: 500, body: Internal Server Error
(venv) rsdrahat@Rahats-MacBook-Pro llama-pdf-extract % python schema_test.py
Traceback (most recent call last):
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_cloud/resources/extraction/client.py", line 468, in create_schema
    _response_json = _response.json()
                     ^^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/httpx/_models.py", line 764, in json
    return jsonlib.loads(self.content, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rsdrahat/dev/llama-pdf-extract/schema_test.py", line 17, in <module>
    extraction_schema = extractor.create_schema("Test Schema", ResumeMetadata)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_extract/base.py", line 274, in create_schema
    return asyncio_run(self.acreate_schema(name, data_schema, project_id))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_index/core/async_utils.py", line 33, in asyncio_run
    return loop.run_until_complete(coro)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_extract/base.py", line 261, in acreate_schema
    response = await self._async_client.extraction.create_schema(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rsdrahat/dev/llama-pdf-extract/venv/lib/python3.12/site-packages/llama_cloud/resources/extraction/client.py", line 470, in create_schema
    raise ApiError(status_code=_response.status_code, body=_response.text)
llama_cloud.core.api_error.ApiError: status_code: 500, body: Internal Server Error
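
The two chained exceptions in the logs above can be reproduced with the standard library alone. The sketch below is a simplified illustration, not the actual llama_cloud client code: the `ApiError` class and `parse_response` helper are hypothetical stand-ins written for this example. It shows why a plain-text "Internal Server Error" body first triggers a `JSONDecodeError` and then surfaces as an `ApiError` carrying the status code.

```python
import json


class ApiError(Exception):
    """Hypothetical stand-in for llama_cloud.core.api_error.ApiError."""

    def __init__(self, status_code: int, body: str) -> None:
        super().__init__(f"status_code: {status_code}, body: {body}")
        self.status_code = status_code
        self.body = body


def parse_response(status_code: int, text: str) -> dict:
    """Simplified version of the pattern visible in the traceback:
    try to decode the body as JSON, and raise ApiError if that fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Raising inside the except block chains the two exceptions, which
        # produces "During handling of the above exception, another
        # exception occurred" in the log.
        raise ApiError(status_code=status_code, body=text)


caught = None
try:
    # The server replied with plain text rather than JSON.
    parse_response(500, "Internal Server Error")
except ApiError as err:
    caught = err
    print(err)                             # status_code: 500, body: Internal Server Error
    print(type(err.__context__).__name__)  # JSONDecodeError
```

Read this way, the `JSONDecodeError` is only a side effect of the non-JSON error body. The 500 status returned by the server is the failure to investigate.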

dosubot[bot] commented 1 month ago

Hey @rsdrahat! 👋 I'm here to assist you with any bugs, questions, or contributions you have while we wait for a human maintainer. Let's squash those bugs together!

To resolve the 500 Internal Server Error when trying to create a new schema for LlamaExtract using the Python library, ensure that the function you are passing to create_schema_from_function is correctly defined and that all required parameters are properly annotated. Additionally, verify that any additional fields provided are in the correct format.

Here is an example of how to use create_schema_from_function correctly:

from llama_index.legacy.tools.utils import create_schema_from_function
from typing import Any

# Define a sample function with proper annotations
def sample_function(param1: int, param2: str = "default") -> Any:
    pass

# Create the schema from the function
schema = create_schema_from_function("SampleSchema", sample_function)

# Optionally, you can add additional fields
additional_fields = [
    ("extra_field1", int, 42),
    ("extra_field2", str)
]

# Create the schema with additional fields
schema_with_additional_fields = create_schema_from_function("SampleSchemaWithExtras", sample_function, additional_fields)

Ensure that:

  1. The function sample_function has all parameters annotated.
  2. Any additional fields are provided as tuples of length 2 or 3.

create_schema_from_function processes the function's parameters and any additional fields to create a schema. If the error persists, it may stem from the server configuration or from how the server handles the schema, which would require further investigation [1][2].
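
To make the description above concrete, here is a minimal standard-library sketch of how a function's parameters and 2- or 3-tuples of extra fields could be gathered into field definitions. It is an illustration under stated assumptions, not the implementation of create_schema_from_function: the `collect_fields` helper, the `REQUIRED` marker, and the fallback to `Any` for unannotated parameters are all hypothetical choices made for this example.

```python
import inspect
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical marker for "no default value"; Ellipsis is used here
# in the same spirit as Field(...) in the issue's Pydantic model.
REQUIRED = ...


def collect_fields(
    func, additional_fields: Optional[List[tuple]] = None
) -> Dict[str, Tuple[Any, Any]]:
    """Map each parameter and extra field to a (type, default) pair."""
    fields: Dict[str, Tuple[Any, Any]] = {}
    for name, param in inspect.signature(func).parameters.items():
        # Assumption for this sketch: unannotated parameters fall back to Any.
        annotation = (
            Any if param.annotation is inspect.Parameter.empty else param.annotation
        )
        default = REQUIRED if param.default is inspect.Parameter.empty else param.default
        fields[name] = (annotation, default)
    for extra in additional_fields or []:
        if len(extra) == 2:
            # (name, type): a required field
            name, field_type = extra
            fields[name] = (field_type, REQUIRED)
        elif len(extra) == 3:
            # (name, type, default): an optional field
            name, field_type, default = extra
            fields[name] = (field_type, default)
        else:
            raise ValueError("additional_fields entries must be tuples of length 2 or 3")
    return fields


def sample_function(param1: int, param2: str = "default") -> Any:
    pass


fields = collect_fields(
    sample_function, [("extra_field1", int, 42), ("extra_field2", str)]
)
for name, (field_type, default) in fields.items():
    print(name, field_type, default)
```

A tuple of any other length raises a `ValueError` in this sketch, which mirrors the "length 2 or 3" requirement stated in the list above.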
