stanfordnlp / dspy

DSPy: The framework for programming—not prompting—language models
https://dspy.ai
MIT License
18.92k stars 1.45k forks

Serialization of LLM params and local/global LLM assignment #159

Open · Puzer opened this issue 1 year ago

Puzer commented 1 year ago

I want to understand the plans for serializing and deserializing LLMs as part of saved state.

If I call dump_state() on a Predictor or Module, I can see that you've been thinking about this use case conceptually, but from what I can tell it isn't implemented yet. For example, I can assign an LM to a module/predictor via module.predictors()[0].lm = dspy.OpenAI(...), but when I then try to save the state, JSON of course can't serialize the OpenAI class.
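
Roughly what I mean (a minimal sketch; the exact contents of dump_state() may differ):

import json
import dspy

predictor = dspy.Predict("question -> answer")
state = predictor.dump_state()   # demos, signature info, etc. -- a plain dict, serializes fine
json.dumps(state)

predictor.lm = dspy.OpenAI(model="gpt-3.5-turbo")
json.dumps(predictor.lm)         # TypeError: Object of type OpenAI is not JSON serializable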

Why it's important:

For now, I assume the main supported use case is "global" LM assignment, and that "local" assignment should be avoided. Below is a simple example where a "local" LLM doesn't propagate into the compiled model properly.

import dspy
from dspy.teleprompt import BootstrapFewShot

default_params = {}  # shared LM kwargs (e.g. temperature); empty here for brevity
gpt_35_chat = dspy.OpenAI(model='gpt-3.5-turbo', **default_params)
# note: we deliberately do NOT set a "global" LLM like dspy.settings.configure(lm=gpt_35_chat)

class MathSolverSignature(dspy.Signature):
    """Perform math operation"""
    query = dspy.InputField(desc="Mathematical expression")
    answer = dspy.OutputField(desc="Answer to the mathematical expression")

class MathSolverModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_answer = dspy.ChainOfThought(MathSolverSignature, lm=gpt_35_chat)  # assign "local" LLM

    def forward(self, **kwargs):
        prediction = self.generate_answer(**kwargs)
        return prediction

dataset = [{"query": "2+2=", "answer": "4"}, {"query": "5*5=", "answer": "25"}, {"query": "0+1=", "answer": "1"}]
dataset = [dspy.Example(**x).with_inputs("query") for x in dataset]

# metric: exact match between the gold answer and the predicted answer
eval_func = lambda gold, pred, _: gold.answer == pred.answer

model = MathSolverModule()
model.predictors()[0].lm = gpt_35_chat  # one more time, just to be sure :)

teleprompter = BootstrapFewShot(metric=eval_func, max_bootstrapped_demos=1, max_labeled_demos=2)
compiled_model = teleprompter.compile(model, trainset=dataset)

test_example_results = compiled_model(query="2+2=")  # FAIL: the "local" LLM is not picked up
compiled_model.save("trained_model.json")
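
For contrast, the "global" assignment does propagate. A sketch of the same flow with only that change (everything else as above):

dspy.settings.configure(lm=gpt_35_chat)  # "global" LLM assignment

model = MathSolverModule()
compiled_model = teleprompter.compile(model, trainset=dataset)
compiled_model(query="2+2=")  # OK: the globally configured LLM is used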
okhat commented 1 year ago

The plan is to support full serialization properly for all of this ASAP, but we haven’t exactly explored multi-LM programs.

Setting .lm on a module right now is not documented, and it has odd side effects, so be careful. But we can easily spend time on this right now if you need it.

okhat commented 1 year ago

The effective thing to do now is:

with dspy.settings.context(lm=…): …. Inside this block, the lm is different (you can also nest this pattern)
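
For example (a minimal sketch; gpt_4 here is just a hypothetical second client):

gpt_4 = dspy.OpenAI(model="gpt-4")

with dspy.settings.context(lm=gpt_35_chat):
    compiled_model(query="2+2=")        # uses gpt-3.5-turbo
    with dspy.settings.context(lm=gpt_4):
        compiled_model(query="5*5=")    # inner block overrides: uses gpt-4
    # back to gpt-3.5-turbo here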

Puzer commented 1 year ago

The effective thing to do now is:

with dspy.settings.context(lm=…): …. Inside this block, the lm is different (you can also nest this pattern)

Yep, it works.

I tried assigning Predictor.lm manually; it might work fine for my case, but I got unexpected behaviour due to query_only=True.
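
Concretely, what I observed (a sketch):

predictor = compiled_model.predictors()[0]
predictor.lm = gpt_35_chat        # manual "local" assignment
compiled_model(query="2+2=")
# with a per-predictor lm set, Predict takes the query_only=True code path,
# so the prompt omits the bootstrapped demos and the guidelines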

May I ask you to explain the use case for query_only=True? I understand what it does (it skips the demos and guidelines), but why? :) https://github.com/stanfordnlp/dspy/blob/54a60140b0e5419d6d2ce1e3d835faba1c6ceccb/dspy/predict/predict.py#L89