langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
95.52k stars, 15.51k forks

Microsoft Guidance Integration #6142

Closed rmonvfer closed 8 months ago

rmonvfer commented 1 year ago

Feature request

Guidance is a language for controlling large language models developed by Microsoft.

"Guidance allows to interleave generation, prompting, and logical control into a single continuous flow [...] more effectively and efficiently than traditional prompting or chaining"

In practice, this means that Guidance can not only force LLMs to produce a specific output format (in a deterministic way) but also enables conditional output, loops, and much more, with just a handlebars-like templating language.

For langchain, this means that we would be able to provide formatted outputs with 100% accuracy, improving Agents, Tools and other components that rely heavily on output parsing.

Adding this to langchain still makes sense even with the introduction of functions in the OpenAI models, as that change only benefits those closed-source models, whereas Guidance also works with open-source ones such as Vicuna.

Motivation

I've been developing a langchain-based product for a while now, and one of the biggest pain points for me is the unreliability of the agent's output format. Take the ConversationalChatAgent (from here) as an example: its output parsing depends on the model following the `FORMAT_INSTRUCTIONS` here.

In my experience, this works pretty well with a low temperature, but it's sometimes unreliable nonetheless, breaking the agent execution and causing hard-to-prevent errors.
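To make the failure mode concrete, here is a minimal sketch of a parser that depends on the model following format instructions. It is not LangChain's actual parser: `parse_agent_reply`, `OutputParseError`, and the `action` / `action_input` key names are assumptions chosen for illustration.

```python
import json
import re


class OutputParseError(ValueError):
    """Raised when the model's reply does not match the expected format."""


def parse_agent_reply(text: str) -> dict:
    # Hypothetical format: the reply must contain one JSON object with the
    # keys "action" and "action_input". Anything else is a parse failure.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise OutputParseError("no JSON object found in reply")
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError as exc:
        raise OutputParseError(f"invalid JSON: {exc}") from exc
    missing = {"action", "action_input"} - payload.keys()
    if missing:
        raise OutputParseError(f"missing keys: {sorted(missing)}")
    return payload


# A well-formed reply parses; small deviations by the model do not.
good = 'Sure!\n{"action": "Search", "action_input": "weather in Paris"}'
chatty = "I think the answer is 42."
trailing_comma = '{"action": "Search", "action_input": "x",}'

print(parse_agent_reply(good))
for bad in (chatty, trailing_comma):
    try:
        parse_agent_reply(bad)
    except OutputParseError as err:
        print("parse failed:", err)
```

Nothing in this approach stops the model from producing the second or third reply; the parser can only detect the problem after the fact.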

Your contribution

I would like to gather some feedback from the community about this integration. I might be approaching this the wrong way, and there might already be solutions for this.

If this is somewhat useful, I would be happy to submit a PR with an initial integration (maybe similar to what Llama-Index has done) for general output parsing. This would allow Guidance to be integrated even further by, for example, replacing the regular Pydantic output parser with Guidance output parsers in all relevant situations (it should be a drop-in replacement).

vowelparrot commented 1 year ago

You could also check out the new functions agent in v0.0.200

rmonvfer commented 1 year ago

Sure, they are amazing, but I really think Guidance is underrated and might be useful, as it can be integrated with open-source models, making it easy for people to build reliable agents on top of such models and not only proprietary ones.

louisoutin commented 1 year ago

+1 I think it would make sense to have Guidance support in Langchain Prompts. I don't see any feature similar to "Guidance acceleration" in langchain. I guess it shouldn't be too hard to integrate.

rmonvfer commented 1 year ago

Acceleration is just one of the main benefits of using Guidance (the other one is definitely being able to force output formats in a deterministic way).

Do you have any suggestions on how I might get started with this? Maybe create a new type of Prompt and LLM? From the official Guidance repository I get that we might need somewhat different classes (so instead of using a subclass of BasePrompt with an OpenAI LLM, you might use a subclass of BaseGuidancePrompt with a GuidanceOpenAI LLM).

Another potential change might be adding a new OutputParser so that we can "emulate" the OpenAI Functions feature with open-source models (very important IMO).

louisoutin commented 1 year ago

I haven't used Guidance yet (I've just checked their code a bit so far), so I'm not sure yet what the best way to integrate it into langchain is. But what you propose makes a lot of sense. I guess we need both a BaseGuidancePrompt and a BaseGuidanceLLM to be able to add the Guidance modifications to both the prompt class and the LLM class. However, if there is a way to add only a BaseGuidancePrompt and modify the current BaseLanguageModel class in langchain, that would be nice, so that we don't have to specifically ask for a Guidance model type. But I'm not sure if it's feasible.

vowelparrot commented 1 year ago

I do agree that Guidance is a very useful approach - I think a light touch integration could be very valuable so long as the implementation is correct

DylanBruzenak commented 1 year ago

I do wonder if the overall idea could be taken and run with in a different form. I think the handlebars style language could be a bit more readable, but maybe I should file that over there. Integrating langchain using something like their parse_best example is also interesting.

The things I would like to see most are the multiple gens and the ability to more rigorously force output structure (instead of just checking after the fact with something like pydantic and retrying). Something like a StructuredPrompt with placeholders.
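For reference, the "check after the fact and retry" baseline mentioned above can be sketched as follows. This is a stdlib-only illustration, not pydantic or LangChain code: `generate_validated` is a hypothetical helper, and the model is a stand-in callable that maps a prompt to a string.

```python
import json
from typing import Callable, Optional, Set


def generate_validated(
    model: Callable[[str], str],
    prompt: str,
    required_keys: Set[str],
    max_attempts: int = 3,
) -> dict:
    """Call the model until its reply is a JSON object with the required keys."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        reply = model(prompt)
        try:
            data = json.loads(reply)
        except json.JSONDecodeError as exc:
            last_error = exc  # malformed JSON: pay for another full generation
            continue
        if isinstance(data, dict) and required_keys <= data.keys():
            return data
        last_error = ValueError("reply is missing required keys")
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


# Stand-in model that only gets the format right on its third try.
replies = iter([
    "not json",
    '{"tool": "search"}',
    '{"tool": "search", "input": "weather"}',
])
result = generate_validated(lambda _prompt: next(replies), "pick a tool", {"tool", "input"})
print(result)
```

Each failed attempt costs a full generation, and the loop can still run out of attempts, which is why forcing the structure during generation is attractive.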

louisoutin commented 1 year ago

Any news on this? I really think that integrating a more efficient text-generation engine that can fill in the static parts (typically of a JSON answer) would be a really nice feature to have. Either Guidance, or Outlines (cf: https://github.com/normal-computing/outlines), or any other framework that I might have missed. Cheers
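The idea of letting the program fill in the static parts can be sketched without either library. The snippet below is an assumption-laden illustration: `fill_template`, the `{{gen 'name'}}` placeholder syntax (loosely modelled on handlebars), and the stub generator are all hypothetical, and it does not use the Guidance or Outlines APIs. Real engines constrain decoding at the token level; this only shows the control flow.

```python
import json
import re
from typing import Callable, Dict, Tuple

# Hypothetical placeholder syntax: {{gen 'field_name'}}
PLACEHOLDER = re.compile(r"\{\{gen '(\w+)'\}\}")


def fill_template(
    template: str, generate: Callable[[str, str], str]
) -> Tuple[str, Dict[str, str]]:
    """Copy static text verbatim and call `generate` only for placeholders.

    `generate(prefix, name)` receives the text produced so far and the
    placeholder name, and returns the value for that slot.
    """
    pieces = []
    values: Dict[str, str] = {}
    cursor = 0
    for match in PLACEHOLDER.finditer(template):
        pieces.append(template[cursor:match.start()])  # static part, never generated
        name = match.group(1)
        value = generate("".join(pieces), name)
        pieces.append(json.dumps(value))  # escaping keeps the JSON well formed
        values[name] = value
        cursor = match.end()
    pieces.append(template[cursor:])
    return "".join(pieces), values


TEMPLATE = """{"name": {{gen 'name'}}, "class": {{gen 'class'}}}"""


def stub_generate(prefix: str, name: str) -> str:
    # Stand-in for a model call; a real integration would prompt with `prefix`.
    return {"name": 'Aria "the Bold"', "class": "mage"}[name]


text, values = fill_template(TEMPLATE, stub_generate)
print(json.loads(text))
```

Because the braces, keys, and quoting come from the template rather than from the model, the result always parses, and the model is only asked for the slot values.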

rmonvfer commented 1 year ago

Yes, I would say that Outlines is probably a much better option now. We could probably start with an integration with Outlines (let's discuss which parts and which approach might be better) and if the need for an alternative arises we can always integrate Guidance

DylanBruzenak commented 1 year ago

That looks like exactly what I'd want. Maybe a separate issue for that?

hinthornw commented 1 year ago

Yeah outlines would be fantastic. We have done some experimental integrations with the following previously:

rmonvfer commented 1 year ago

Sure, I'll give it a shot this weekend following the examples you mentioned.

ManuelFay commented 1 year ago

Any updates, or a PR in the works? The feature sounds very cool!

AaronWard commented 1 year ago

@hinthornw Any update on this proposal? Would love to see this integration. I'm surprised no one has opened a PR already, at least from what I could find.

rlouf commented 1 year ago

Author of Outlines here. I can help with the integration 🙂

sandangel commented 12 months ago

Hi, may I ask if there is an update on this issue?

rlouf commented 12 months ago

Waiting for a go from one of the maintainers before opening a PR, but just realised I wasn't asking explicitly 😅

sandangel commented 12 months ago

@rlouf can you share a code snippet? I would like to try it out quickly 😀😀