run-llama / llama_extract


extractor.acreate_schema() raises ApiError: status_code: 500, body: Internal Server Error #21

Open n400peanuts opened 3 months ago

n400peanuts commented 3 months ago

Describe the bug

While running the notebook demo_pydantic_model.ipynb, I hit an ApiError on this call:

schema_response = await extractor.acreate_schema("Resume Schema", data_schema=Resume)

That is, it fails when I try to register a previously defined Pydantic model as the schema.

This is the full traceback (which also shows some details of how I was running the code):

`
JSONDecodeError                           Traceback (most recent call last)
File ~/opt/anaconda3/envs/py312/lib/python3.12/site-packages/llama_cloud/resources/extraction/client.py:468, in AsyncExtractionClient.create_schema(self, name, project_id, data_schema)
    467 try:
--> 468     _response_json = _response.json()
    469 except JSONDecodeError:

File ~/opt/anaconda3/envs/py312/lib/python3.12/site-packages/httpx/_models.py:764, in Response.json(self, **kwargs)
    763 def json(self, **kwargs: typing.Any) -> typing.Any:
--> 764     return jsonlib.loads(self.content, **kwargs)

File ~/opt/anaconda3/envs/py312/lib/python3.12/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:

File ~/opt/anaconda3/envs/py312/lib/python3.12/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
    333 """Return the Python representation of s (a str instance
    334 containing a JSON document).
    335
    336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338 end = _w(s, end).end()

File ~/opt/anaconda3/envs/py312/lib/python3.12/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
    354 except StopIteration as err:
--> 355     raise JSONDecodeError("Expecting value", s, err.value) from None
    356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

ApiError                                  Traceback (most recent call last)
Cell In[5], line 1
----> 1 schema_response = await extractor.acreate_schema("Resume Schema", data_schema=Resume)

File ~/opt/anaconda3/envs/py312/lib/python3.12/site-packages/llama_extract/base.py:261, in LlamaExtract.acreate_schema(self, name, data_schema, project_id)
    256     else:
    257         raise ValueError(
    258             "data_schema must be either a dictionary or a Pydantic model"
    259         )
--> 261     response = await self._async_client.extraction.create_schema(
    262         name=name, data_schema=json_schema, project_id=project_id
    263     )
    264     return response

File ~/opt/anaconda3/envs/py312/lib/python3.12/site-packages/llama_cloud/resources/extraction/client.py:470, in AsyncExtractionClient.create_schema(self, name, project_id, data_schema)
    468     _response_json = _response.json()
    469 except JSONDecodeError:
--> 470     raise ApiError(status_code=_response.status_code, body=_response.text)
    471 raise ApiError(status_code=_response.status_code, body=_response_json)

ApiError: status_code: 500, body: Internal Server Error
`
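A note on reading the traceback: the leading JSONDecodeError looks like a side effect rather than the root cause. The client tries to parse the response body as JSON, and the body reported in the ApiError is the plain-text string "Internal Server Error", which is not valid JSON. A minimal standard-library sketch that reproduces the same message (only the body string is taken from the report; nothing here calls the llama_cloud client):

```python
import json

# Body of the HTTP 500 response, as reported in the ApiError: plain text, not JSON.
body = "Internal Server Error"

try:
    json.loads(body)
except json.JSONDecodeError as err:
    # The parser fails on the very first character, hence "char 0".
    message = str(err)
    print(message)
```

So the actionable information is the second exception (status 500 from the server), not the JSON parsing failure.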

Files

The files are already in the repository, since this was a demo notebook.


Yuvraj-Takey commented 3 months ago

I'm also encountering the same error when executing examples/demo_json_schema.ipynb:

ApiError: status_code: 500, body: Internal Server Error

However, I found a workaround: calling the update_schema method replaces the old schema internally and returns the new one. Here is the code sample:

extraction_schema = await extractor.aupdate_schema(extraction_schema.id, data_schema=userSchema)

(@n400peanuts, you can replace userSchema with your own schema, e.g. Resume.)

KannamSridharKumar commented 3 months ago

@Yuvraj-Takey

To run the code you suggested, we already need to have extraction_schema.id. But we get the error the first time we create extraction_schema, so there is no id yet.

Can you please share your full code?

Thanks,

KannamSridharKumar commented 3 months ago

@Yuvraj-Takey Thanks, I just figured it out. We first create the default schema with ainfer_schema, which works fine, and then update it:

extraction_schema = await extractor.ainfer_schema("Test Schema", fpaths)

extraction_schema = await extractor.aupdate_schema(extraction_schema.id, data_schema=data_schema)

samarth777 commented 2 months ago

> @Yuvraj-Takey thanks, i just figured it out. We first create the default schema, which is working fine. Then update it.
>
> extraction_schema = await extractor.ainfer_schema("Test Schema", fpaths)
>
> extraction_schema = await extractor.aupdate_schema(extraction_schema.id, data_schema=data_schema)

What is fpaths? And how do I create the default schema that works?
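The two-step workaround described in this thread can be sketched as a small helper. This is only a sketch under assumptions: the thread never defines fpaths, so it is assumed here to be a list of local paths to sample documents from which a schema can be inferred; the helper name create_schema_via_update is hypothetical; and the only extractor methods used are the two calls quoted above (ainfer_schema and aupdate_schema), called with the same argument shapes.

```python
from typing import Any, Sequence


async def create_schema_via_update(
    extractor: Any,
    name: str,
    fpaths: Sequence[str],
    data_schema: Any,
) -> Any:
    """Hypothetical helper: infer a schema first, then overwrite it.

    `extractor` is assumed to expose the two coroutines quoted in the
    thread; `fpaths` is assumed to be a list of sample document paths.
    """
    # Step 1: infer a default schema from the sample files. Per the thread,
    # this call succeeds and yields an object with an `id`.
    inferred = await extractor.ainfer_schema(name, list(fpaths))
    # Step 2: replace the inferred schema with the one we actually want
    # (a dict or a Pydantic model), reusing the id from step 1.
    return await extractor.aupdate_schema(inferred.id, data_schema=data_schema)
```

In a notebook this would be called as `extraction_schema = await create_schema_via_update(extractor, "Resume Schema", fpaths, Resume)`, with fpaths pointing at whatever sample files you have. The design simply avoids acreate_schema, the call that returns the 500 error, without explaining why that call fails.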