guidance-ai / guidance

A guidance language for controlling large language models.
MIT License

Using two models at once and passing the state of one to another? (more general `gen`, `select`, etc.) #821

Open ibehnam opened 1 month ago

ibehnam commented 1 month ago

Based on the examples and the way gen is defined, it looks like we can't directly pass the lm arg:

[screenshot: code passing lm directly to gen]

ERROR:

File .../.venv/lib/python3.12/site-packages/guidance/_guidance.py:57, in _decorator.<locals>.wrapped(*args, **kwargs)
     54     f._self_call_placeholder_ = Placeholder()
     56 # call the function to get the grammar node
---> 57 node = f(_null_grammar, *args, **kwargs)
     58 if not isinstance(node, (Terminal, str)):
     59     node.name = f.__name__

TypeError: gen() got multiple values for argument 'lm'

It seems like the only way to use gen is in an "addition" statement like this:

[screenshot: gen used in an addition statement]

This works, although Pylance/Pyright still says:

ERROR:

Argument missing for parameter "lm" Pylance(reportCallIssue)

I was wondering if we could call gen with its lm arg directly. It would be great because then we could have something like this:

prompt = "..."
lm_1 += prompt
lm_2 += prompt
lm_1_response = gen(lm=lm_1)
lm_2_response = gen(lm=lm_2)
lm_1 += lm_1_response + " but the other model says " + lm_2_response

Currently, this will throw an error:

TypeError: gen() got multiple values for argument 'lm'

And while we're at it, it looks like _state also includes HTML tags. Is there a way to retrieve just the conversation history of an lm object and pass it to another, new lm?

ibehnam commented 1 month ago

Another use case for this feature is using a bigger LLM to run the heavy computation. For example, the readme has this snippet:

lm = llama2 + '''\
1 + 1 = add(1, 1) = 2
2 - 3 = subtract(2, 3) = -1
'''
lm + gen(max_tokens=15, tools=[add, subtract, multiply, divide])

I want to delegate the tool use to another LLM. How is this possible in guidance? So far, all my efforts have led to the errors mentioned above.

Harsha-Nori commented 1 month ago

Hi @ibehnam, one easy way to do this is just to write a guidance function that you can attach to several models. So taking your example, I'd just write something like (excuse syntax errors, I'm on mobile):

import guidance
from guidance import gen

@guidance
def behnam_func(lm, prompt):
    lm += prompt + gen("response", max_tokens=100)
    return lm

# then you can do:
lm_1 = guidance.models.Model1()
lm_2 = guidance.models.Model2()

prompt = "..."
lm_1 += behnam_func(prompt)
lm_2 += behnam_func(prompt)

print(f"LM_1 says {lm_1['response']} but the other model says {lm_2['response']}")

Does that make sense? I tried to stylistically match it to the way you wrote your example. For tools, you can write functions that invoke other LM calls as part of them -- they can just be arbitrary python functions, including other guidance functions. We think of writing guidance as writing python for the most part.
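
And for the delegation case specifically, here is a rough sketch of that idea as an arbitrary-Python guidance function; the model constructor, prompt text, and the delegate/big_lm names are placeholders, not from the thread:

import guidance
from guidance import gen

big_lm = guidance.models.Transformers("some/larger-model")  # placeholder model

@guidance
def delegate(lm, question):
    # run the heavy generation on the bigger model, then splice only the
    # captured answer text back into the calling model's state
    expert = big_lm + question + " " + gen("answer", max_tokens=100)
    lm += "Expert says: " + expert["answer"]
    return lm

# usage: small_lm += delegate("What is 1234 * 5678?")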

ibehnam commented 1 month ago

@Harsha-Nori Thanks for your response. I appreciate that you took the time to answer even while on mobile.

I think I need to clarify what I meant: The goal is to make a call to lm_1, get the response, and add it to lm_2's state. Currently, the only way I can do this is by naming the output of lm_1, and then adding that to lm_2.

More broadly, it looks like one can't use the gen function in one go like this:

lm_1 += "What is the capital of France?"
lm_2 += "What is the capital of France?"
lm_2 += "The other model says " + gen(lm=lm_1) + " And I " + select(["agree", "don't agree"], lm=lm_2)

Harsha-Nori commented 1 month ago

Hmm, that's interesting, but I think it'd be quite tough to support syntax this specific with the way guidance is architected today. For some intuition about how guidance works: we collapse all the concatenations (strings, gen, select, json, etc.) into a single grammar before we efficiently execute it on the LM.

Having the base gen function eagerly execute on a different object really changes the semantics of how grammars are meant to be processed and how they get combined with strings and other grammars. Letting the first lm arg be an object different from the one the __add__ semantics apply to would require a significant rethinking of the codebase (and possibly breaking even more Python internals than we already do 😓) in order to track all the state properly and ensure the right order of execution.
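
For instance, in a snippet like the one below the entire right-hand side is just a grammar object, and nothing runs until the +=; the model variable here is a placeholder:

from guidance import gen, select

program = (
    "The capital of France is "
    + select(["Paris", "Lyon"], name="city")
    + ", and I am " + gen("confidence", regex=r"[0-9]+") + "% sure."
)
lm += program  # the collapsed grammar is executed on the LM in one pass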

I'll noodle on whether there's a better way to support inter-model chaining like this, but for now, I think it's best to default to writing more explicit functions even if it takes a few more lines :(. Thanks for getting us thinking about more creative applications!

ibehnam commented 1 month ago

@Harsha-Nori No worries! It was worth a shot. Currently what I do is call each LLM separately, store the response in a named variable, and then add that to the other LLM's state.
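
In code, that workaround looks roughly like this (the prompt text is just illustrative):

from guidance import gen

question = "What is the capital of France? "
lm_1 += question + gen("ans_1", max_tokens=30)
lm_2 += question + "The other model said: " + lm_1["ans_1"] + " " + gen("ans_2", max_tokens=30)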

Another approach is to unwrap gen like this:

ngen = gen.__wrapped__

And now users can use the lm arg directly in ngen.
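
For example (a sketch of that workaround; it relies on the decorator exposing the undecorated function via __wrapped__, whose first parameter is lm, so treat it as untested):

ngen = gen.__wrapped__                        # undecorated gen(lm, name=None, ...)
lm_1 = ngen(lm_1, "response", max_tokens=50)  # lm passed directly as the first arg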

slundberg commented 1 month ago

Hi @ibehnam! To add my two cents here:

lm_1 += "What is the capital of France?"
lm_2 += "What is the capital of France?"
lm_2 += "The other model says" + gen(lm=lm_1) + "And I " select(lm=lm_2, ["agree", "don't agree"])

could be written as

lm_1 += "What is the capital of France?"
lm_2 += "What is the capital of France?"
lm_2 += "The other model says" + (lm_1 + gen("out"))["out"] + "And I " + select(["agree", "don't agree"])

Note that the reason we need to "name" the output of the gen call is that otherwise there is no distinction between the prompt and what got generated, since str(lm_1 + gen()) will be the prompt and the generated text combined.
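
A tiny illustration of that distinction (model is a placeholder):

lm_1 = model + "What is the capital of France? "
out = lm_1 + gen("answer", max_tokens=10)

print(str(out))       # the prompt plus the generated text, concatenated
print(out["answer"])  # only the generated text, via the capture name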

If (lm_1 + gen("out"))["out"] gets too verbose you can always wrap it:

from guidance import gen, set_attribute

def raw_gen(lm, *args, **kwargs):
    # run gen against the given model and return only the captured text
    if 'name' not in kwargs:
        kwargs['name'] = 'output'
    with set_attribute("echo", False):
        out = (lm + gen(*args, **kwargs))[kwargs['name']]
    return out

(note that set_attribute currently has a bug that is being fixed by #835)
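
With that wrapper, the earlier two-model line could be written roughly as (assuming gen and select are imported):

lm_1 += "What is the capital of France? "
lm_2 += ("The other model says " + raw_gen(lm_1, max_tokens=10)
         + " and I " + select(["agree", "don't agree"]))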

ibehnam commented 1 month ago

@slundberg Thanks for your response. Yes, this is how I do two LLM calls and combine the results. Your other approach of using raw_gen is also good and avoids the tedious (!) task of naming LLM outputs.

Admittedly, part of my problem was due to being new to Guidance's DSL. After spending a few hours on some real-world experiments, the syntax now feels more natural. I wish there were a way to extract the last response of an LLM. Currently, I use lm._state and a regex to extract anything between the |> and <| markers:

import re

def extract_model_generations(text: str) -> str:
    pattern = r'\|>(.*?)<\|'
    matches = re.findall(pattern, text)
    concatenated_text = ''.join(matches)
    return concatenated_text

But this gets ALL of the model output, not just the last piece. If there were a way to extract the model outputs as a list or dictionary, we could just call (lm_1 + gen()).outputs[-1] or something, avoiding the need to name the LLM's output manually.
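
For what it's worth, the same regex can be narrowed to just the final generation (still leaning on the private _state, so this is fragile):

def extract_last_generation(text: str) -> str:
    # like extract_model_generations, but keep only the last generated span
    matches = re.findall(r'\|>(.*?)<\|', text, flags=re.DOTALL)
    return matches[-1] if matches else ""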

slundberg commented 1 month ago

When you start doing interleaved constrained generation, the line between input and output gets blurry, so we probably can't keep track of all outputs. However, perhaps it would be good for us to give a default value to the gen call's name arg. Right now it defaults to None, but if we set it to "output" or "last_output" then everything will get a name by default.
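
In the meantime, a small user-side shim can approximate that default (a sketch, not library behaviour):

from guidance import gen

def named_gen(*args, **kwargs):
    # default the capture name so (lm + named_gen())["last_output"] always works
    if not args and "name" not in kwargs:
        kwargs["name"] = "last_output"
    return gen(*args, **kwargs)

answer = (lm_1 + named_gen(max_tokens=20))["last_output"]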

ibehnam commented 1 month ago

@slundberg Yes, that'd be great. That way, we could simply do (lm_1 + gen())["last_output"].

On a tangent (and I'm cc'ing @Harsha-Nori as well), since you mentioned interleaved generation: I noticed that if we set a regex for gen that includes some fixed parts, the LLM still generates those parts itself (shown in green in the output). Example:

lm + gen(regex=r'^Answer:[^\n]*$', ...)

Shouldn't "Answer:" be interleaved in the model generation? I like how smart Guidance is in realizing if the LLM is going to say "Subjective" or "Objective" when using select(options["Objective", "Subjective"])—as soon as the LLM generates "Objec" or "Subjec", the other part "tive" is interleaved.

So I thought it would pre-emptively put "Answer:" in model output as well.
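
For now, one workaround is to keep the fixed prefix outside the regex, so guidance forces it rather than the model generating it (a rough sketch; exact anchoring behaviour may differ):

lm += "Answer: " + gen("answer", regex=r"[^\n]*", max_tokens=30)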

jtha commented 2 weeks ago

@ibehnam you know, this is inspiring me to try collaborative output from multiple models. My thinking is that, beyond maker-checker interactions, for a multi-step generation we could have llm1 and llm2 each do the end-to-end steps individually. Then they can remix by swapping after each step, and from there we can build a consensus output.

And it doesn't have to be different models either; it could be the same model with different hyperparameters/context/instructions.
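
A rough sketch of that swap loop, using the named-capture workaround from earlier in the thread (step prompts and token budgets are placeholders):

from guidance import gen

steps = ["Step 1: outline an approach. ", "Step 2: write the final answer. "]
for step in steps:
    lm_1 += step + gen("draft", max_tokens=100)
    lm_2 += step + gen("draft", max_tokens=100)
    # swap the drafts: each model sees the other's work before the next step
    draft_1, draft_2 = lm_1["draft"], lm_2["draft"]
    lm_1 += " The other model wrote: " + draft_2 + " "
    lm_2 += " The other model wrote: " + draft_1 + " "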