StevenGuo42 opened this issue 2 months ago
I've been able to reproduce your problem, and it seems the results are highly dependent on the model:

```python
import dspy
from typing import Dict, List
from pydantic import BaseModel, Field

ollama_model = dspy.OllamaLocal(model="dolphin-mixtral:8x7b-v2.6", model_type="text")
dspy.settings.configure(lm=ollama_model)

class Input(BaseModel):
    role: str = Field(description="The role and task of the assistant")
    items: List[str] = Field(description="List of items to be mapped")
    categories: List[str] = Field(description="List of categories to map the items to")
    output_info: str = Field(description="The requirement of the output")

class Output(BaseModel):
    mapping: Dict[str, str | None] = Field(
        description="The mapping from the first variable to the second variable"
    )

class QASignature(dspy.Signature):
    input: Input = dspy.InputField()
    output: Output = dspy.OutputField()

cot_predictor = dspy.TypedChainOfThought(QASignature)
predictor = dspy.TypedPredictor(QASignature)

input_data = Input(
    role="You are a helpful assistant designed to remap coded items to coded categories. Your task is to map each item to one of the categories.",
    items=["apple", "banana", "tomato", "cabbage", "human"],
    categories=["fruit", "vegetable"],
    output_info="each item should be mapped to one of the categories or `null` if mapping is not possible.",
)

prediction = cot_predictor(input=input_data, options={"format": "json"})
# prediction = predictor(input=input_data)
print(prediction.output)
```
`dolphin-mixtral:8x7b-v2.6` works extremely well on the first try. `llama3` 8B, on the other hand, seems to understand the task but keeps inventing its own schema, and it fails even with the default predictor. I had to redefine the predictor's `forward` method to store all the completions made along the way in order to see what `llama3` was actually producing:
```
[Prediction(
    reasoning='Input: {"role":"You are a helpful assistant designed to remap coded items to coded categories. Your task is to map each item to one of the categories. ","items":["apple","banana","tomato","cabbage","human"],"categories":["fruit","vegetable"],"output_info":"each item should be mapped to one of the categories or `null` if mapping is not possible."}\nReasoning: Let\'s think step by step in order to map each item to one of the categories. We will iterate over the items and check which category they belong to. For example, "apple" and "banana" are fruits, so we will map them to "fruit". "tomato" is a vegetable, so we will map it to',
    output='Here\'s my response:\n\n```json\n{\n  "mapping": {\n    "apple": "fruit",\n    "banana": "fruit",\n    "tomato": "vegetable",\n    "cabbage": "vegetable",\n    "human": null\n  }\n}\n```\n\nI hope this is what you were looking for!'
), Prediction(
    reasoning='Here is the JSON object representing the output:\n\n```\n{\n  "output": {\n    "apple": {"category": "fruit"},\n    "banana": {"category": "fruit"},\n    "tomato": {"category": "vegetable"},\n    "cabbage": {"category": "vegetable"},\n    "human": null\n  }\n}\n```',
    output='{\n  "output": {\n    "apple": {"category": "fruit"},\n    "banana": {"category": "fruit"},\n    "tomato": {"category": "vegetable"},\n    "cabbage": {"category": "vegetable"},\n    "human": null\n  }\n}'
), Prediction(
    reasoning='Here is the JSON object representing the output:\n\n```\n{\n  "mapping": {\n    "apple": {"category": "fruit"},\n    "banana": {"category": "fruit"},\n    "tomato": {"category": "vegetable"},\n    "cabbage": {"category": "vegetable"},\n    "human": null\n  }\n}\n```',
    output='{\n  "mapping": {\n    "apple": {"category": "fruit"},\n    "banana": {"category": "fruit"},\n    "tomato": {"category": "vegetable"},\n    "cabbage": {"category": "vegetable"},\n    "human": null\n  }\n}'
)]
```
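The failure mode is visible in the first completion above: the JSON itself is valid, but it is wrapped in prose and a markdown code fence, so parsing the raw field value directly fails. A minimal stdlib sketch of the problem (the `raw` string and the fence-stripping regex are my own illustration, not DSPy code):

```python
import json
import re

# a completion in the style llama3 produced above: valid JSON buried in prose
raw = (
    "Here's my response:\n\n"
    "```json\n"
    '{"mapping": {"apple": "fruit", "human": null}}\n'
    "```\n\n"
    "I hope this is what you were looking for!"
)

# parsing the whole field fails because of the surrounding text
parse_failed = False
try:
    json.loads(raw)
except json.JSONDecodeError:
    parse_failed = True

# stripping the code fence first recovers the payload
match = re.search(r"```(?:json)?\s*(\{.*\})\s*```", raw, re.DOTALL)
data = json.loads(match.group(1))
```

After stripping the fence, `data["mapping"]` holds the expected item-to-category dictionary.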
I think making the model return correct schemas should be part of the optimizers' job. I'll try to put together an example in a few days. I'm also wondering whether there is a way to optimize only the output schema, so I wouldn't need to provide any training data for it.
It seems that DSPy only outputs JSON as plain text, even when the model supports JSON schema. This is quite problematic: even the simple example from the documentation can't generate valid JSON on the first try.
Why rely on the model's ability to adhere to structured output instead of using constrained generation? Why not integrate with Outlines or SGLang?
+1, this happens in normal use cases even with GPT-4o, so optimizations are definitely needed.
I don't know if this will be useful to anyone, but here is how I went about solving this problem.

First, I noticed that LLM outputs are often truncated, so I systematically increased the value of `max_tokens` (I set it to 2048, which is sufficient for most of my use cases). I also noticed that the outputs of some LLMs (for example, Mixtral-8x7B-Instruct-v0.1 on Anyscale) start with a valid JSON object followed by unnecessary text. So I asked ChatGPT to write a function that extracts the first occurrence of a valid JSON object from a string, and I added it as a method on the `TypedPredictor` module as follows:
```python
import json

def extract_valid_json(self, input_string):
    """
    Patch Christian Mauceri
    This function extracts a valid JSON object at the beginning of a string.

    :param input_string: The input string
    :return: A tuple containing the extracted JSON object and the remaining part of the string
    """
    json_obj = None
    end_index = -1
    # Look for the end position of the valid JSON object
    for i in range(len(input_string)):
        try:
            json_obj = json.loads(input_string[:i + 1])
            end_index = i + 1
        except json.JSONDecodeError:
            continue
    # If a JSON object was found
    if json_obj is not None:
        return json_obj, input_string[end_index:]
    else:
        raise ValueError("No valid JSON object found at the beginning of the string.")
```
Then I added the following lines in the first `try` block of the `forward` method, right after `value = completion[name]`:

```python
valid_json_object, remaining = self.extract_valid_json(value)
value = json.dumps(valid_json_object)
```

I am aware that this is quite an ugly patch, but it allows me to move forward with:

- Claude
- OpenAI (GPT-4 and Turbo)
- Various serverless LLMs provided by Anyscale
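As a side note, the character-by-character scan in `extract_valid_json` re-parses the string on every iteration, which is quadratic in the input length. The standard library can do the same extraction in a single pass with `json.JSONDecoder.raw_decode`, which parses the first JSON value in a string and reports where it ends. A hedged sketch of an equivalent helper (the function name is my own, not part of DSPy):

```python
import json

def extract_first_json(text):
    """Parse the JSON value at the start of `text`; return (object, remainder)."""
    stripped = text.lstrip()
    try:
        # raw_decode returns the parsed value and the index just past it
        obj, end = json.JSONDecoder().raw_decode(stripped)
    except json.JSONDecodeError:
        raise ValueError("No valid JSON object found at the beginning of the string.")
    return obj, stripped[end:]
```

For example, `extract_first_json('{"a": 1} trailing text')` returns `({'a': 1}, ' trailing text')`, matching the behavior of the patch above without the nested parsing loop.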
In some cases `TypedChainOfThought` would throw the following error, but `TypedPredictor` would work fine.

Reproducible example:

Error:

Using `TypedPredictor`:

Result: