astronomer / ask-astro

An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer
https://ask.astronomer.io/
Apache License 2.0
192 stars, 47 forks

Add filter to block non Airflow or astro related questions #54

Closed: sunank200 closed this issue 9 months ago

sunank200 commented 11 months ago

Message from Slack: Since we are talking about AskAstro, I already have some feedback. I've seen that you are using LangChain and probably a filter step to block questions not related to Astro. For most of them it works well (sorry for doing some QA, I couldn't avoid it), but for others it doesn't (check the picture). Make sure to have a throttling limit; otherwise someone can exploit the website to use ChatGPT through your services, and you may be surprised by the OpenAI bill.

More at: https://astronomer.slack.com/archives/C061ZEF3NP9/p1698147033409819?thread_ts=1698146233.559039&cid=C061ZEF3NP9

Lee-W commented 10 months ago

I saw that some people ask the AI itself about relevance: https://github.com/derwiki/llm-prompt-injection-filtering. This might be something we could try.

Lee-W commented 10 months ago

Another thing that comes to mind is using LatentDirichletAllocation (https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html#sklearn.decomposition.LatentDirichletAllocation) to decide whether the topic is relevant.

Lee-W commented 10 months ago

@sunank200 What do you think about these methods? Or do we want to explore more?

sunank200 commented 9 months ago

@pankajkoti have you started exploring approaches for this task? We have a deadline of Dec 22 for this, as noted in this doc.

pankajkoti commented 9 months ago

@sunank200 I haven't had time to start on this yet due to other priorities. I would say this issue is at risk if 22nd Dec is the deadline, as I first need to onboard onto Ask Astro (local setup) and then look into this ticket.

pankajkoti commented 9 months ago

@Lee-W helped me set up Ask Astro locally. I will explore the codebase next.

pankajkoti commented 9 months ago

It would be nice to check with David whether he already has some input here.

davidgxue commented 9 months ago

I think you guys already have pretty good approaches in mind. I know of two pretty popular open source solutions for LLM guardrails in the industry: guardrails.ai and Nvidia's guardrails.

The first one is easier to set up, while the latter is a bit more complex and not really an out-of-the-box solution, so perhaps we can look into guardrails.ai first. I am not super familiar with how well it integrates with LangChain, but based on a quick search, I think it already has some degree of integration support.

But I think under the hood, all the solutions you can find still use one of the following: a zero-shot text classifier (I suppose you could train your own classifier specific to our Astro + Airflow topics, but that seems like overkill); another LLM that essentially does zero-shot classification; some kind of vector embedding with a similarity calculation between the prompt+response and the topics; or some combination of these tools (e.g. thresholding on a text classifier with an LLM as a secondary check).
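
To make the similarity idea concrete, here is a minimal sketch, not what guardrails.ai or any other library does internally. It swaps real vector embeddings for plain bag-of-words counts so that it runs with the standard library only. The topic descriptions, the function names, and the 0.1 threshold are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical topic descriptions (assumption); a real system would embed
# the actual docs with a learned embedding model instead of word counts.
TOPIC_TEXTS = [
    "airflow dag task operator scheduler xcom connection",
    "astro astronomer deployment cli",
]


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_topic_similarity(text: str) -> float:
    """Highest similarity between the text and any topic description."""
    vec = bag_of_words(text)
    return max(cosine(vec, bag_of_words(topic)) for topic in TOPIC_TEXTS)


def is_on_topic(text: str, threshold: float = 0.1) -> bool:
    """Treat the text as on-topic if it is close enough to some topic."""
    return max_topic_similarity(text) >= threshold
```

With word counts, "connections" and "connection" do not match each other, which is one reason a learned embedding would likely behave better than this toy version.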

I hope this helps! And feel free to let me know if you want to discuss this further!

pankajkoti commented 9 months ago

I tried the following questions with guardrails locally. Preliminarily, it does not seem to help much. Essentially, when a topic is related, it should not show validation errors.

I set valid_topics=["airflow", "astro"], device=-1, llm_callable="gpt-3.5-turbo", disable_classifier=False, disable_llm=False, on_fail="exception", invalid_topics=[]

Following are the responses I got for our sample questionnaire from https://docs.google.com/spreadsheets/d/13cVqNikix82YjCPA4t0XaULg3XccBnvrQUmQa9VwgC0/edit#gid=1762228914

Success -> True Positive

In [29]: text = "Explain the architecture of Airflow."

In [30]: guard.parse(llm_output=text)
Out[30]: ValidationOutcome(raw_llm_output='Explain the architecture of Airflow.', validated_output='Explain the architecture of Airflow.', reask=None, validation_passed=True, error=None)

False negatives

In [17]: text = "how to create Airflow connections?"

In [18]: output.error
Out[18]: 'Validation failed for field with errors: Most relevant topic is other.'

In [19]: guard.parse(llm_output=text)
Out[19]: ValidationOutcome(raw_llm_output='how to create Airflow connections?', validated_output='how to create Airflow connections?', reask=None, validation_passed=True, error=None)

In [20]: text = "what are three common types of tasks in a DAG?"

In [21]: guard.parse(llm_output=text)
Out[21]: ValidationOutcome(raw_llm_output='what are three common types of tasks in a DAG?', validated_output=None, reask=None, validation_passed=False, error='Validation failed for field with errors: Most relevant topic is other.')

In [22]: text = "Help me simplify relationships/dependencies between tasks"

In [23]: guard.parse(llm_output=text)
Out[23]: ValidationOutcome(raw_llm_output='Help me simplify relationships/dependencies between tasks', validated_output=None, reask=None, validation_passed=False, error='Validation failed for field with errors: Most relevant topic is other.')

In [24]: text = "what are custom xcom backends?"

In [25]: guard.parse(llm_output=text)
Out[25]: ValidationOutcome(raw_llm_output='what are custom xcom backends?', validated_output=None, reask=None, validation_passed=False, error='Validation failed for field with errors: Most relevant topic is other.')

In my opinion, we might have to build an extensive list of keywords specific to Astro & Airflow as part of valid_topics to avoid false negatives, but the more keywords we add, the more we risk some false positives.

davidgxue commented 9 months ago

Update: I did a quick sync on Slack with Pankaj about his testing method and then ran some experiments on my side. Here is what I found.

A few things that will significantly improve the accuracy

  1. Add the model_threshold parameter to the OnTopic() object

    • You may need to play around with this value. I suggest something higher like 0.7 or above, perhaps even 0.8 or 0.9. If you don't pass in this parameter, guardrails will default to 0.5.
    • Since the zero-shot text classifier is always weaker than gpt-3.5, we want to invalidate the response only if the classifier is very confident that it is not relevant, not when it has only 50% confidence. And if it isn't very confident, we should feed the response into gpt-3.5 for a better verification.
  2. Add more valid topics

    • I agree with your conclusion that we probably want to add a few more valid topics, just not a ton of them. I think something like astro, Astro, astronomer, Astronomer, Airflow, airflow (case sensitivity seems to matter a little here) would be a good starting point. I am not familiar enough with Airflow at the moment to add more keywords, but something specific and related, like xcom, might be a good addition too. Note that we don't want to add so many that the list becomes too broad.
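
The thresholding logic described in point 1 can be sketched roughly as follows. This is only an illustration of the idea, not the guardrails implementation: the function name `ensemble_on_topic` and both callables are hypothetical stand-ins for the zero-shot classifier and the gpt-3.5 check, and it assumes the classifier returns its confidence that the text is off-topic.

```python
from typing import Callable


def ensemble_on_topic(
    text: str,
    classifier: Callable[[str], float],
    llm_check: Callable[[str], bool],
    model_threshold: float = 0.8,
) -> bool:
    """Return True if the text is judged on-topic.

    classifier: hypothetical callable returning, in [0, 1], how confident
        the cheap zero-shot model is that the text is OFF-topic.
    llm_check: hypothetical callable that asks a stronger LLM whether the
        text is on-topic and returns its verdict.
    """
    off_topic_confidence = classifier(text)
    if off_topic_confidence >= model_threshold:
        # The classifier is very confident: reject without an LLM call.
        return False
    # Otherwise defer to the stronger (and more expensive) LLM.
    return llm_check(text)
```

A higher `model_threshold` means fewer rejections come from the weaker classifier alone, at the cost of more calls to the LLM.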

Quick overview of the ensemble method

Implications

  1. This means that even if the conversation is off-topic, we would still need to call GPT-4 and our other LLMs at least once and generate a response before we can run validation and decide whether this Q/A is on-topic, so that one wasted call potentially costs money. However, I think if we effectively block off-topic discussions and apply proper rate limits, bad actors would be unlikely to keep spamming our chatbot with off-topic prompts, since they would just get invalidated. If the main goal is to keep people from using Ask Astro as a free GPT-4 for their own needs, this implementation should still help.
  2. Additional gpt-3.5 calls inside guardrails to validate the response would also increase the cost.
  3. Potential increase in latency: the zero-shot text classifier is fast, but the second call to verify using gpt-3.5 will add some latency.

Additional options to explore

pankajkoti commented 9 months ago

Hi @davidgxue, thanks a lot for your suggestions and for trying this out.

@vatsrahul1001 and I did some more testing, and I feel it still may not offer much help. I have drafted the findings in the Notion doc https://www.notion.so/astronomerio/Guardrails-AI-findings-for-Ask-Astro-aa5d65d5006b4307a055aedf47306ab8

My proposal is as follows: given a prompt, we run our question_answer pipeline as usual. Once the Ask Astro response is ready, we scan it with a simple Python text search to check whether it contains words from a pre-built list of topics, derived from the docs we store, to validate its relevance.
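
A minimal sketch of what such a keyword check could look like. The keyword list, the function name, and the `min_hits` parameter are placeholders invented for illustration; the real list would be built from our docs.

```python
import re

# Hypothetical keyword list (assumption); the proposal is to derive the
# real one from the stored docs.
KEYWORDS = frozenset({"airflow", "astro", "astronomer", "dag", "xcom", "operator"})


def response_is_relevant(
    response: str,
    keywords: frozenset = KEYWORDS,
    min_hits: int = 1,
) -> bool:
    """Return True if the response contains enough known keywords.

    Matching is done on whole lowercase tokens, so "astronomy" does not
    count as a hit for "astro".
    """
    tokens = set(re.findall(r"[a-z0-9_]+", response.lower()))
    return len(tokens & keywords) >= min_hits
```

Whole-token matching avoids some accidental substring hits, but it also misses plurals such as "DAGs" unless they are added to the list.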

cc: @sunank200 @phanikumv

davidgxue commented 9 months ago

Hey, I left some quick comments on the Notion doc. I agree with your overall conclusion. We eventually probably want our own custom fine-tuned classifier to do this, since zero-shot classifiers + LLMs (as bootstrapped by guardrails) aren't doing well in this case.

I am on the fence about next steps though (guardrails classifier only + exhaustive topics list + high model threshold vs. the keyword search you proposed). If you want to do keyword search without a model, wouldn't you still have the issue of needing a large, comprehensive list of valid keywords, with the added downside of having no way to do model thresholding to reduce false invalidations?

pankajkoti commented 9 months ago

Following the continued discussion on the Notion document and the Slack conversation in https://astronomer.slack.com/archives/C05QJA9LTR9/p1703159886303619, as well as a collaborative sync-up with David where we executed the outlined next steps from David's previous comment above, it has been observed that the model threshold in the library is slightly misaligned. As per the consensus reached with David, it appears that the current library doesn't provide significant assistance in addressing this issue. Consequently, my plan is to move forward with the implementation of keyword-based search validation.

pankajkoti commented 9 months ago

Based on Steven's comment, we've decided not to go ahead with the keyword search. Steven is instead going to add “build an Airflow/Astro classification model” to the roadmap.

Additionally, it has been suggested to check if we can throttle the requests and I have created an issue for this.
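
For reference, request throttling along these lines could be sketched as a per-client sliding-window limiter. This is only an illustration under assumed names and limits, not the design tracked in the new issue; the class name, the `client_id` key, and the injectable clock are all hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # client_id -> request timestamps

    def allow(self, client_id: str) -> bool:
        """Record the request and return True if it is within the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

This keeps state in process memory, so a deployment with several API workers would need a shared store instead.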

Based on the above, I am closing this ticket.