Closed by kreativitat 20 hours ago
Hey! Definitely. I think one option is an optimized prompt — for example, one that includes the JSON schema — which in my experience reduces this issue significantly.
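For illustration, here's a rough sketch of what I mean by embedding the schema in the prompt — the invoice fields below are just placeholders, not anything from the repo:

```python
import json

# Placeholder schema for an invoice-style document; the real fields depend on the use case.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string", "format": "date"},
        "total_amount": {"type": "number"},
    },
    "required": ["invoice_number", "issue_date", "total_amount"],
    "additionalProperties": False,
}

prompt = (
    "Extract the fields defined by the JSON schema below and return ONLY valid JSON "
    "that matches it, with no extra keys and no commentary:\n"
    + json.dumps(INVOICE_SCHEMA, indent=2)
)
```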
The second option, I guess, would be to add an output validator (e.g. using Pydantic)?
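A rough sketch of what such a validator could look like with Pydantic v2 (the `Invoice` model and its fields are made up for the example):

```python
from typing import Optional
from pydantic import BaseModel, ConfigDict, ValidationError

class Invoice(BaseModel):
    model_config = ConfigDict(extra="forbid")  # reject any fields the LLM invented
    invoice_number: str
    issue_date: str
    total_amount: float

def parse_llm_output(raw_json: str) -> Optional[Invoice]:
    try:
        return Invoice.model_validate_json(raw_json)
    except ValidationError as err:
        # Could retry the LLM call or report the error instead of storing bad data.
        print(f"LLM output failed validation: {err}")
        return None
```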
Do you see any actionable items you'd like to come out of this issue? We could work on a specific example to optimize the prompt, or maybe you'd like to create an FR(?)
Let's make it more actionable :)
The other option is to add an output format parameter: https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion
It supports JSON.
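Roughly like this — the model name and URL are placeholders; `"format": "json"` is the parameter from the linked docs:

```python
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",  # placeholder model name
        "prompt": "Extract the patient data from the text below and return JSON.\n...",
        "format": "json",     # ask Ollama to constrain the output to valid JSON
        "stream": False,
    },
    timeout=120,
)
print(response.json()["response"])
```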
Could you maybe try this out and create a PR or FR?
I think we could just add a proxy parameter, `output_format`, to the API and CLI, to be used alongside the prompt whenever one is provided.
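Something along these lines — the helper name is made up, it just shows how `output_format` could be mapped onto Ollama's native `format` field:

```python
from typing import Optional

def build_ollama_payload(model: str, prompt: str,
                         output_format: Optional[str] = None) -> dict:
    """Build the request body for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if output_format == "json":
        # Proxy our output_format parameter through to Ollama's `format` field.
        payload["format"] = "json"
    return payload
```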
I've just tried the following command and it worked pretty well:
```
(.venv) piotrkarwatka@Piotrs-MacBook-Pro-2 pdf-extract-api % python client/cli.py ocr_request --file examples/example-mri.pdf --ocr_cache --prompt "Return only JSON format"
/Users/piotrkarwatka/Projects/pdf-extract-api/client/.venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
File uploaded successfully. Task Id: ac037626-021a-4d48-a34c-89c6fe4b3168 Waiting for the result...
{'state': 'PENDING', 'status': 'Task is pending...'}
{'state': 'PROGRESS', 'status': 'Processing LLM', 'info': {'progress': 75, 'status': 'Processing LLM', 'elapsed_time': 2.1235079765319824}}
{'state': 'PROGRESS', 'status': 'LLM Processing chunk no: 35', 'info': {'progress': 34, 'status': 'LLM Processing chunk no: 34', 'elapsed_time': 4.134450912475586}}
{'state': 'PROGRESS', 'status': 'LLM Processing chunk no: 125', 'info': {'progress': 125, 'status': 'LLM Processing chunk no: 125', 'elapsed_time': 6.150297164916992}}
{'state': 'PROGRESS', 'status': 'LLM Processing chunk no: 213', 'info': {'progress': 213, 'status': 'LLM Processing chunk no: 213', 'elapsed_time': 8.164186954498291}}
```
```json
{
"address": {
"street1": "0 Maywood Ave.",
"city": "Maywood",
"state": "NJ",
"zip": "0000"
},
"practice": {
"name": "Ikengil Radiology Associa",
"website": "DikengilRadiologyAssociates.com",
"phone": "201-725-0913"
},
"patient": {
"names": "Jane, Mary",
"dob": "1966-00-00",
"age": 55,
"sex": "F",
"accountNumber": "00002"
},
"study": {
"type": "Brain MRI",
"dateOfService": "2021-04-29"
},
"diagnosis": {
"condition": "Chiari I malformation with 10 mm descent of cerebellar tonsils."
},
"imagingTechnique": {
"description": "Noncontrast MRI of the brain was performed in the three orthogonal planes utilizing T1/T2/T2 FLAIR/T2* GRE/Diffusion-ADC sequences."
}
}
```
When utilizing Large Language Models to extract data from documents such as invoices and generate structured outputs like JSON files, a common issue arises: the LLM does not always adhere strictly to the provided fields and sometimes invents new ones. This behavior poses significant challenges for applications that require exact data formats for database integration and other automated processes.