Thanks for sharing the detailed instructions @sharadregoti!
I tried this rail spec which worked for me -- can you give it a shot:
<rail version="0.1">
<output>
<integer name="epf" description="Employee Provident Fund Amount (EPF) per annum" />
<integer name="gratuity" description="Gratuity per annum" />
<integer name="medialInsurance" description="Medical Insurance per annum" />
<integer name="termInsurance" description="Term Insurance per annum" />
<integer name="ctc" description="Cost To Company per annum" />
<object name="miscellaneous" description="Cost To Company per annum">
</object>
</output>
<prompt>
I have shared sample data of an offer letter which has a CTC amount and its breakdown in it.
{{table}}
@complete_json_suffix_v2
</prompt>
</rail>
The main change is in the spec: I changed the output type from string to integer for all dictionary values. This corrected the parser error, which led to validation proceeding as expected.
I had already tried changing string to integer, but no luck -- same result.
There is this JSONDecodeError in the logs; could it be the issue?
-- message: {"'output'": '\'{\\n "epf": 21,600,\\n "gratuity": 35,760,\\n "medialInsurance": 3,060,\\n "termInsurance": 3,672,\\n "ctc": 1,583,548,\\n "miscellaneous": {\\n "HRA": 371,748,\\n "specialAllowance": 305,148,\\n "internet": 30,000,\\n "technicalBooks": 10,000,\\n "giftVoucher": 5,004,\\n "pvAmount": 53,112,\\n "ctcMedicalPrem": 948\\n }\\n}\'', "'output_as_dict'": 'None', "'error'": "JSONDecodeError('Expecting property name enclosed in double quotes: line 2 column 15 (char 16)')", "'timestamp'": '1684915582.1477263', "'task_uuid'": "'28d0be01-469f-470a-ba1b-d4fead86aa14'", "'task_level'": '[2, 2, 3, 2]', "'message_type'": "'info'"}
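For reference, that JSONDecodeError is reproducible with plain json.loads, since the thousands separators alone make the payload invalid JSON -- a minimal sketch:
import json

# Numbers written with thousands separators (e.g. 21,600) are not valid JSON,
# so decoding the raw LLM output above fails before validation can even run.
json.loads('{ "epf": 21,600, "gratuity": 35,760 }')
# Raises json.decoder.JSONDecodeError:
#   Expecting property name enclosed in double quotes ...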
@sharadregoti can you try this rail spec?
<rail version="0.1">
<output>
<integer name="epf" description="Employee Provident Fund Amount (EPF) per annum" />
<integer name="gratuity" description="Gratuity per annum" />
<integer name="medialInsurance" description="Medical Insurance per annum" />
<integer name="termInsurance" description="Term Insurance per annum" />
<integer name="ctc" description="Cost To Company per annum" />
<object name="miscellaneous" description="Cost To Company per annum">
</object>
</output>
<instructions>
You are a helpful assistant only capable of communicating with valid JSON, and no other text.
@json_suffix_prompt_examples
</instructions>
<prompt>
I have shared sample data of an offer letter which has a CTC amount and its breakdown in it.
{{table}}
If extracting any integer value, make sure to extract it as a number and not as a string.
This means that if the value is 1,00,000, then it should be extracted as 100000 and not as 1,00,000.
If you are unable to extract any value, use `null`.
@xml_prefix_prompt
{output_schema}
</prompt>
</rail>
I did some prompt engineering to make it work with your table. The issue was that the source table had numbers with commas, which was breaking the JSON decoding.
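If the model still slips commas into numbers despite the prompt, one possible fallback (just a sketch, not something the rail spec above does) is to strip digit-grouping commas from the raw output before it is decoded:
import json
import re

raw = '{"epf": 21,600, "gratuity": 35,760}'
# Drop commas that sit between two digits (thousands separators); structural
# commas followed by a space or quote are left alone. This is only a rough
# fallback -- it would also touch digit,digit sequences inside arrays.
cleaned = re.sub(r'(?<=\d),(?=\d)', '', raw)
print(json.loads(cleaned))  # {'epf': 21600, 'gratuity': 35760}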
Also, when you use this, I recommend setting the temperature to 0.0.
I tested this out with gpt-3, gpt-3.5 and gpt-4, and it worked across all 3.
# GPT-3
raw_llm_output, validated_output = guard(
openai.Completion.create,
prompt_params={"table": table},
engine="text-davinci-003",
max_tokens=1024,
temperature=0.0,
)
# GPT-3.5
raw_llm_output, validated_output = guard(
openai.ChatCompletion.create,
prompt_params={"table": table},
model="text-davinci-003",
max_tokens=1024,
temperature=0.0,
)
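The GPT-4 call isn't shown above; presumably it is the same chat-style invocation with the model name swapped (a sketch, the model name here is an assumption):
# GPT-4 (sketch: assumes the same chat call as GPT-3.5 with the model swapped)
raw_llm_output, validated_output = guard(
    openai.ChatCompletion.create,
    prompt_params={"table": table},
    model="gpt-4",
    max_tokens=1024,
    temperature=0.0,
)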
Closed due to inactivity. Feel free to reopen.
Describe the bug
I am using guardrails-ai for a small Python project. I have followed the getting started guide and modified the .rail spec and prompt as per my requirements.
The code snippet below is taken from the getting started guide.
Printing the validated output from the LLM with print(validated_output) outputs None. Ideally I wanted a JSON string, but it prints None on stdout. When I viewed the logs, I found that guardrails-ai was able to get the output from the LLM but was not able to give it back to my Python code; the logs showed this error: module 'numpy' has no attribute 'bool'.
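For context, this AttributeError comes from the deprecated np.bool alias, which was removed in NumPy 1.24, so any dependency still referencing it fails on newer NumPy -- a minimal reproduction, independent of guardrails:
import numpy as np

# On NumPy >= 1.24 the deprecated np.bool alias no longer exists, so accessing
# it raises: AttributeError: module 'numpy' has no attribute 'bool'.
# Workarounds: pin numpy<1.24, or have the offending code use bool / np.bool_.
np.bool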
To Reproduce
Steps to reproduce the behavior:
import os
import tabula
import openai
import guardrails as gd

# Get the path to the PDF file
pdf_file_path = "/home/sharad/personal/test-python-salary-gpt/test.pdf"

# Extract the table from the PDF file
table = tabula.read_pdf(pdf_file_path)

promt = """ ${table}"""
print(promt.format(table=table))

guard = gd.Guard.from_rail('spec.rail')

# Set your OpenAI API key
os.environ["OPENAI_API_KEY"] = ""

# Wrap the OpenAI API call with the guard object
raw_llm_output, validated_output = guard(
    openai.Completion.create,
    prompt_params={"table": promt.format(table=table)},
    engine="text-davinci-003",
    max_tokens=1024,
    temperature=0.5,
)

# Print the validated output from the LLM
print(validated_output)