Closed: serhatgktp closed this issue 9 months ago.
@serhatgktp I was following the Guard LLM wrapper and output parser examples but didn't find any success. The results of `validated_response` and `output_parser.parse(output)` were usually `None`. I was wondering if this is because the guardrail wasn't able to pick up on the structure of the open-source LLM response.
Hey, you can use guardrails with open source models using Manifest.
See here for an example: https://shreyar.github.io/guardrails/llm_api_wrappers/#using-manifest
Also related to parsing the response: https://github.com/ShreyaR/guardrails/issues/210#issuecomment-1656225464
Hello @serhatgktp , thanks for opening this issue! This is an important observation. Currently, most of the Guardrails functionality heavily depends on how well the underlying LLM follows instructions. Guardrails supports validation of both structured and unstructured data. At the moment, OpenAI, Cohere, and Anthropic models are really good at following instructions and hence generate well-formed structured data in the first place - on top of which our validators run their magic and validate the output.
As you correctly found out, OpenAI's models seem to understand guardrails instructions very accurately, whereas many open-source LLMs aren't able to precisely follow the requested format. We tried testing guardrails with the following open-source models:
For use cases involving generation of structured data (such as the examples shared by @irgolic), these models either did not follow instructions correctly - for example, failing to return the output in valid JSON format - or were incredibly slow.
On the other hand, all of these models work quite well for use cases involving unstructured data generation! Here's an example notebook for you to test all of these models on a simple unstructured data generation task: Google Colab Link
To recap, all our validators kick in only after schema validation succeeds on the raw LLM outputs. That is the reason `None` is being returned, @QUANGLEA. For unstructured data, that is not an issue, as our validators can kick in directly. Please try experimenting with different open-source LLMs on any unstructured data task for now (following the notebook shared above), and it should work. If you also try more SOTA instruction-following LLMs, you can expect to see better results.
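The schema-validation-first behavior described above can be sketched as follows. This is an illustration of the reported symptom, not Guardrails' actual implementation; `validate_structured_output` is a hypothetical helper:

```python
import json

def validate_structured_output(llm_output: str):
    """Illustrative sketch: structured validators only get to run if the
    raw LLM output first parses against the expected schema (here, JSON)."""
    try:
        parsed = json.loads(llm_output)
    except json.JSONDecodeError:
        # Schema validation failed, so validators never run and the
        # caller sees None -- the behavior reported in this thread.
        return None
    # ...field-level validators would run on `parsed` here...
    return parsed

print(validate_structured_output("Sure! Here is your answer: ..."))  # None
print(validate_structured_output('{"city": "Paris"}'))
```

This is why an open-source model that wraps its JSON in conversational filler produces `None` even when the payload itself is salvageable.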
As far as structured data tasks are concerned, you could try the first 2 solutions you described. Meanwhile, we will also see if there's anything we can do on our end.
Thank you for your detailed response @thekaranacharya - it always helps to know the current capabilities and shortcomings. Once (hopefully) open-source models achieve the same degree of specificity in their answers as OpenAI's products, I feel it will relieve the countless industries incorporating AI from relying solely on one product.
It is certainly a non-trivial task, however, to parse different LLM outputs deterministically. While the models themselves are technically deterministic, an engineer unfamiliar with the internal statistics of a given model can only approach this through educated guesses and trial & error, which is costly in time.
Until then, best of luck to all of us in finding the solution. Cheers!
Hi @thekaranacharya , I would like to add that many custom LLM applications now work with open-source fine-tuned models. These models are able to format the output correctly, as OpenAI's models do (if they are properly tuned).
Having the support to run our custom models to extract structured data would be a great contribution to the project, as it would open up a huge range of possibilities.
Hello @Psancs05 , if your open-source fine-tuned custom models do follow instructions nicely, then you can already use them with guardrails! All they need to do is accept a prompt (a string) and return a response (a string). Please use this notebook as a starting point and add your custom model logic inside the `my_llm_api()` function. That function is supposed to host the logic for whichever LLM inference you're using.
Continue the conversation here if you run into any problems while doing that.
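The contract described above (string in, string out) can be sketched like this. The body is a hypothetical stub, not the notebook's actual code - you would replace it with your own model's inference call:

```python
def my_llm_api(prompt: str, **kwargs) -> str:
    """Custom LLM callable for guardrails: accepts a prompt string and
    returns the model's raw response as a string."""
    # Hypothetical stub -- swap in your own inference here, e.g. a
    # Hugging Face pipeline or a local GPT4All instance.
    response = '{"answer": "placeholder"}'
    return response

# guardrails would then be invoked with this callable, e.g. guard(my_llm_api)
print(my_llm_api("Extract the city from: 'I live in Paris.'"))
```

As long as the callable satisfies this signature, guardrails does not care which model runs behind it.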
Thanks for sharing the notebook. I tweaked the code and ran into a problem - can you please check my notebook: https://colab.research.google.com/drive/1_LmvjcEDRWpkvmcqB3HdHTZt4CjZv4sC?usp=sharing
This is part of the breaking changes in 0.3: https://www.guardrailsai.com/docs/migration_guides/0-3-migration
Please change line 2 to
`raw_output, validated_output, *rest = guard(my_llm_api)`
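The starred unpacking in that line works for any return value with at least two elements; a minimal stand-alone illustration, where `fake_guard` is a stand-in for the real guard object (which, per the comment above, can be tuple-unpacked under the 0.3 API):

```python
def fake_guard(llm_api):
    # Stand-in for guard(my_llm_api): mimics a return value that unpacks
    # as (raw_output, validated_output, ...extras such as call history).
    return ("raw LLM text", {"validated": True}, "call history")

# `*rest` absorbs any trailing elements, so this unpacking keeps working
# even if the return value grows extra fields in later versions.
raw_output, validated_output, *rest = fake_guard(None)
print(raw_output)        # raw LLM text
print(validated_output)  # {'validated': True}
```

This is why the `*rest` form is a more forward-compatible fix than a fixed-length two-element unpack.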
Guardrails seemingly supports other LLM providers via Manifest, and even custom LLM APIs. However, I haven't been able to use guardrails with a single open-source LLM in practice. I have tried the following models/providers:
Hugging Face Hub
GPT4All
I understand that guardrails expects a JSON response conforming to a certain format, and it usually crashes when the LLM output deviates from the anticipated format. OpenAI's models seem to understand guardrails instructions very accurately, whereas many open-source LLMs (at least all the ones I've tried so far) aren't able to precisely follow the requested format.
Initially, there are a few workarounds that I can think of:
I believe option 3 is the default for now, but I'm wondering whether we could come up with ways to use guardrails with open-source LLMs before then.
P.S.: If anyone has successfully used guardrails with an open-source LLM, could you please provide details? Thanks!