langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
84.1k stars 12.97k forks source link

fetching inaccurate answers from the database #16491

Closed jyoti194 closed 4 days ago

jyoti194 commented 3 months ago

Checked other resources

Example Code

import sqlalchemy as sal import os, sys, openai import pandas as pd import constants from langchain.sql_database import SQLDatabase from langchain.llms.openai import OpenAI from langchain_experimental.sql import SQLDatabaseChain from sqlalchemy import create_engine

from langchain.chat_models import ChatOpenAI from typing import List, Optional

from langchain.agents.agent_toolkits import SQLDatabaseToolkit from langchain.callbacks.manager import CallbackManagerForToolRun from langchain.chat_models import ChatOpenAI from langchain_experimental.plan_and_execute import ( PlanAndExecute, load_agent_executor, load_chat_planner, ) from langchain.sql_database import SQLDatabase from langchain.text_splitter import TokenTextSplitter from langchain.tools import BaseTool from langchain.tools.sql_database.tool import QuerySQLDataBaseTool from secret_key import openapi_key from langchain.prompts import PromptTemplate

os.environ['OPENAI_API_KEY'] = openapi_key

from urllib.parse import quote_plus server_name = constants.server_name database_name = constants.database_name username = constants.username password = constants.password encoded_password = quote_plus(password) connection_uri = f"mssql+pyodbc://{username}:{encoded_password}@{server_name}/{database_name}?driver=ODBC+Driver+17+for+SQL+Server"

engine = create_engine(connection_uri)

def chat(question, sql_format):

# greetings = ["hi", "hello", "hey"]

# if question.lower() in greetings:
#     return "Hello! How can I assist you today?"

model_name="gpt-3.5-turbo-16k"
db = SQLDatabase(engine, view_support=True, include_tables=['EGV_emp_departments_ChatGPT'])
llm = ChatOpenAI(temperature=0, verbose=False, model=model_name)

PROMPT = """ 
Given an input question, first create a syntactically correct mssql query to run,  
then look at the results of the query and return the answer.
Return the answer in a sentence form.  
The question: {question}
"""
prompt_template = """
Use the following pieces of context to answer the question at the end. If you don't know the answer, 
just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Helpful Answer:"""

answer = None

if sql_format==False:

    db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

    answer = db_chain.run(PROMPT.format(question=question))

else:
    db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True , return_sql =True)

    sql_query = db_chain.run(question) 
    print("SQLQuery: "+str(sql_query))
    # result = engine.execute(sql_query)

    result_df = pd.read_sql(sql_query, engine)
    if result_df.empty:
        return "No results found"

    answer = result_df.to_dict()

def handle_greetings(question):
    greetings = ["hi", "hello", "hey"]  
    if any(greeting in question.lower() for greeting in greetings):
        return "Hello! How can I assist you today?"
    else:
        return None

PROMPT = PromptTemplate(
    template=prompt_template, 
    input_variables=[question],
    preprocessor=handle_greetings(question)
)

def split_text(text, chunk_size, chunk_overlap=0):
    text_splitter = TokenTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    yield from text_splitter.split_text(text)

class QuerySQLDatabaseTool2(QuerySQLDataBaseTool):
    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        result = self.db.run_no_throw(query)

        return next(split_text(result, chunk_size=14_000))

class SQLDatabaseToolkit2(SQLDatabaseToolkit):
    def get_tools(self) -> List[BaseTool]:
        tools = super().get_tools()
        original_query_tool_description = tools[0].description
        new_query_tool = QuerySQLDatabaseTool2(
            db=self.db, description=original_query_tool_description
        )
        tools[0] = new_query_tool

        return tools

# return db_chain.run(question) 
return answer

def chain1(question): text = chat(question,False)

return text

def chain2(question): query = chat(question,True)

return query

answer=chain1("what is the uan number for AD#######")

print(answer)

Description

in the chatbot which is connected to the db, when i'm asking question like give UAN number which is not present in the db, instead its fetching euid number, if some particular data is not there it should written as invalid question, it should not execute a wrong queary.

Answer:I'm sorry, but I cannot answer the question "hi" as it is not a valid question. Please provide a specific question related to the data in the table.

Finished chain. I'm sorry, but I cannot answer the question "hi" as it is not a valid question. Please provide a specific question related to the data in the table.

Entering new SQLDatabaseChain chain...

Given an input question, first create a syntactically correct mssql query to run,  
then look at the results of the query and return the answer.
Return the answer in a sentence form.  
The question: what is the uan number for AD23010923

SQLQuery:SELECT [EmployeeID], [EmployeeName], [EmployeeNameAsPerBank], [EmployeeEuid] FROM [EGV_emp_departments_ChatGPT] WHERE [EmployeeID] = 'AD23010923' SQLResult: [('AD########', '######', 'S######## P', Decimal('######'))] Answer:The UAN number for AD####### is #####.

Finished chain. The UAN number for AD###### is ####.

how to validate each answer before we get the output

for this can we modify the basy.py file in langchain_experimental.sql.base.py, as it seem to be fetching from the db

"""Chain for interacting with SQL Database.""" from future import annotations

import warnings from typing import Any, Dict, List, Optional

from langchain.callbacks.manager import CallbackManagerForChainRun from langchain.chains.base import Chain from langchain.chains.llm import LLMChain from langchain.chains.sql_database.prompt import DECIDER_PROMPT, PROMPT, SQL_PROMPTS from langchain.prompts.prompt import PromptTemplate from langchain.schema import BasePromptTemplate from langchain.schema.language_model import BaseLanguageModel from langchain.tools.sql_database.prompt import QUERY_CHECKER from langchain.utilities.sql_database import SQLDatabase

from langchain_experimental.pydantic_v1 import Extra, Field, root_validator

INTERMEDIATE_STEPS_KEY = "intermediate_steps"

class SQLDatabaseChain(Chain): """Chain for interacting with SQL Database.

Example:
    .. code-block:: python

        from langchain_experimental.sql import SQLDatabaseChain
        from langchain.llms import OpenAI, SQLDatabase
        db = SQLDatabase(...)
        db_chain = SQLDatabaseChain.from_llm(OpenAI(), db)

*Security note*: Make sure that the database connection uses credentials
    that are narrowly-scoped to only include the permissions this chain needs.
    Failure to do so may result in data corruption or loss, since this chain may
    attempt commands like `DROP TABLE` or `INSERT` if appropriately prompted.
    The best way to guard against such negative outcomes is to (as appropriate)
    limit the permissions granted to the credentials used with this chain.
    This issue shows an example negative outcome if these steps are not taken:
    https://github.com/langchain-ai/langchain/issues/5923
"""

llm_chain: LLMChain
llm: Optional[BaseLanguageModel] = None
"""[Deprecated] LLM wrapper to use."""
database: SQLDatabase = Field(exclude=True)
"""SQL Database to connect to."""
prompt: Optional[BasePromptTemplate] = None
"""[Deprecated] Prompt to use to translate natural language to SQL."""
top_k: int = float('inf')
"""Number of results to return from the query"""
input_key: str = "query"  #: :meta private:
output_key: str = "result"  #: :meta private:
return_sql: bool = False
"""Will return sql-command directly without executing it"""
return_intermediate_steps: bool = False 
"""Whether or not to return the intermediate steps along with the final answer."""
return_direct: bool = False
"""Whether or not to return the result of querying the SQL table directly."""
use_query_checker: bool = False
"""Whether or not the query checker tool should be used to attempt
to fix the initial SQL from the LLM."""
query_checker_prompt: Optional[BasePromptTemplate] = None
"""The prompt template that should be used by the query checker"""

class Config:
    """Configuration for this pydantic object."""

    extra = Extra.forbid
    arbitrary_types_allowed = True

@root_validator(pre=True)
def raise_deprecation(cls, values: Dict) -> Dict:
    if "llm" in values:
        warnings.warn(
            "Directly instantiating an SQLDatabaseChain with an llm is deprecated. "
            "Please instantiate with llm_chain argument or using the from_llm "
            "class method."
        )
        if "llm_chain" not in values and values["llm"] is not None:
            database = values["database"]
            prompt = values.get("prompt") or SQL_PROMPTS.get(
                database.dialect, PROMPT
            )
            values["llm_chain"] = LLMChain(llm=values["llm"], prompt=prompt)
    return values

@property
def input_keys(self) -> List[str]:
    """Return the singular input key.

    :meta private:
    """
    return [self.input_key]

@property
def output_keys(self) -> List[str]:
    """Return the singular output key.

    :meta private:
    """
    if not self.return_intermediate_steps:
        return [self.output_key]
    else:
        return [self.output_key, INTERMEDIATE_STEPS_KEY]

def _call(
    self,
    inputs: Dict[str, Any],
    run_manager: Optional[CallbackManagerForChainRun] = None,
) -> Dict[str, Any]:
    _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
    input_text = f"{inputs[self.input_key]}\nSQLQuery:"
    print("SQLQuery")
    _run_manager.on_text(input_text, verbose=self.verbose)
    # If not present, then defaults to None which is all tables.
    table_names_to_use = inputs.get("table_names_to_use")
    table_info = self.database.get_table_info(table_names=table_names_to_use)
    llm_inputs = { 
        "input": input_text, 
        "top_k": str(self.top_k),
        "dialect": self.database.dialect,
        "table_info": table_info,
        "stop": ["\nSQLResult:"],
    }
    if self.memory is not None:
        for k in self.memory.memory_variables:
            llm_inputs[k] = inputs[k]
    intermediate_steps: List = []
    try:
        intermediate_steps.append(llm_inputs.copy())  # input: sql generation
        sql_cmd = self.llm_chain.predict(
            callbacks=_run_manager.get_child(),
            **llm_inputs,
        ).strip()
        if self.return_sql:
            return {self.output_key: sql_cmd}
        if not self.use_query_checker:
            _run_manager.on_text(sql_cmd, color="green", verbose=self.verbose)
            intermediate_steps.append(
                sql_cmd
            )  # output: sql generation (no checker)
            intermediate_steps.append({"sql_cmd": sql_cmd})  # input: sql exec
            print(sql_cmd)
            result = self.database.run(sql_cmd)
            print(result)
            intermediate_steps.append(str(result))  # output: sql exec
        else:
            query_checker_prompt = self.query_checker_prompt or PromptTemplate(
                template=QUERY_CHECKER, input_variables=["query", "dialect"]
            )
            query_checker_chain = LLMChain(
                llm=self.llm_chain.llm, prompt=query_checker_prompt
            )
            query_checker_inputs = {
                "query": sql_cmd,
                "dialect": self.database.dialect,
            }
            checked_sql_command: str = query_checker_chain.predict(
                callbacks=_run_manager.get_child(), **query_checker_inputs
            ).strip()
            intermediate_steps.append(
                checked_sql_command
            )  # output: sql generation (checker)
            _run_manager.on_text(
                checked_sql_command, color="green", verbose=self.verbose
            )
            intermediate_steps.append(
                {"sql_cmd": checked_sql_command}
            )  # input: sql exec
            result = self.database.run(checked_sql_command)
            intermediate_steps.append(str(result))  # output: sql exec
            sql_cmd = checked_sql_command

        _run_manager.on_text("\nSQLResult: ", verbose=self.verbose)
        _run_manager.on_text(result, color="yellow", verbose=self.verbose)
        # If return direct, we just set the final result equal to
        # the result of the sql query result, otherwise try to get a human readable
        # final answer
        if self.return_direct:
            final_result = result
        else:

            _run_manager.on_text("\nAnswer:", verbose=self.verbose)
            # if result:
            #     input_text += f"{sql_cmd}\nSQLResult: {result}\nAnswer:"
            # else:
            #     input_text += f"{sql_cmd}\nSQLResult: {result}\nAnswer: {'No result found' if not result else ''}"

            input_text += f"{sql_cmd}\nSQLResult: {result}\nAnswer:"
            llm_inputs["input"] = input_text
            intermediate_steps.append(llm_inputs.copy())  # input: final answer
            final_result = self.llm_chain.predict(
                callbacks=_run_manager.get_child(),
                **llm_inputs,
            ).strip()
            # print("------", result)
            if not result:
                final_result = 'Invalid Question'
            # print("....",final_result)
            intermediate_steps.append(final_result)  # output: final answer
            _run_manager.on_text(final_result, color="green", verbose=self.verbose)
        chain_result: Dict[str, Any] = {self.output_key: final_result}
        if self.return_intermediate_steps:
            chain_result[INTERMEDIATE_STEPS_KEY] = intermediate_steps
        print("----"+str(chain_result)+"-----")
        return chain_result
    except Exception as exc:
        # Append intermediate steps to exception, to aid in logging and later
        # improvement of few shot prompt seeds
        exc.intermediate_steps = intermediate_steps  # type: ignore
        raise exc

@property
def _chain_type(self) -> str:
    return "sql_database_chain"

@classmethod
def from_llm(
    cls,
    llm: BaseLanguageModel,
    db: SQLDatabase,
    prompt: Optional[BasePromptTemplate] = None,
    **kwargs: Any,
) -> SQLDatabaseChain:
    """Create a SQLDatabaseChain from an LLM and a database connection.

    *Security note*: Make sure that the database connection uses credentials
        that are narrowly-scoped to only include the permissions this chain needs.
        Failure to do so may result in data corruption or loss, since this chain may
        attempt commands like `DROP TABLE` or `INSERT` if appropriately prompted.
        The best way to guard against such negative outcomes is to (as appropriate)
        limit the permissions granted to the credentials used with this chain.
        This issue shows an example negative outcome if these steps are not taken:
        https://github.com/langchain-ai/langchain/issues/5923
    """
    prompt = prompt or SQL_PROMPTS.get(db.dialect, PROMPT)
    llm_chain = LLMChain(llm=llm, prompt=prompt)
    return cls(llm_chain=llm_chain, database=db, **kwargs)

class SQLDatabaseSequentialChain(Chain): """Chain for querying SQL database that is a sequential chain.

The chain is as follows:
1. Based on the query, determine which tables to use.
2. Based on those tables, call the normal SQL database chain.
3. Don't consider the table which are not mentoned, if no result is matching with the keyword Please return the answer as invalid question 

This is useful in cases where the number of tables in the database is large.
"""

decider_chain: LLMChain
sql_chain: SQLDatabaseChain
input_key: str = "query"  #: :meta private:
output_key: str = "result"  #: :meta private:
return_intermediate_steps: bool = False

@classmethod
def from_llm(
    cls,
    llm: BaseLanguageModel,
    db: SQLDatabase,
    query_prompt: BasePromptTemplate = PROMPT,
    decider_prompt: BasePromptTemplate = DECIDER_PROMPT,
    **kwargs: Any,
) -> SQLDatabaseSequentialChain:
    """Load the necessary chains."""
    sql_chain = SQLDatabaseChain.from_llm(llm, db, prompt=query_prompt, **kwargs)
    decider_chain = LLMChain(
        llm=llm, prompt=decider_prompt, output_key="table_names"
    )
    return cls(sql_chain=sql_chain, decider_chain=decider_chain, **kwargs)

@property
def input_keys(self) -> List[str]:
    """Return the singular input key.

    :meta private:
    """
    return [self.input_key]

@property
def output_keys(self) -> List[str]:
    """Return the singular output key.

    :meta private:
    """
    if not self.return_intermediate_steps:
        return [self.output_key]
    else:
        return [self.output_key, INTERMEDIATE_STEPS_KEY]

def _call(
    self,
    inputs: Dict[str, Any],
    run_manager: Optional[CallbackManagerForChainRun] = None,
) -> Dict[str, Any]:
    _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
    _table_names = self.sql_chain.database.get_usable_table_names()
    table_names = ", ".join(_table_names)
    llm_inputs = {
        "query": inputs[self.input_key],
        "table_names": table_names,
    }
    _lowercased_table_names = [name.lower() for name in _table_names]
    table_names_from_chain = self.decider_chain.predict_and_parse(**llm_inputs)
    table_names_to_use = [
        name
        for name in table_names_from_chain
        if name.lower() in _lowercased_table_names
    ]
    _run_manager.on_text("Table names to use:", end="\n", verbose=self.verbose)
    _run_manager.on_text(
        str(table_names_to_use), color="yellow", verbose=self.verbose
    )
    new_inputs = {
        self.sql_chain.input_key: inputs[self.input_key],
        "table_names_to_use": table_names_to_use,
    }
    return self.sql_chain(
        new_inputs, callbacks=_run_manager.get_child(), return_only_outputs=True
    )

@property
def _chain_type(self) -> str:
    return "sql_database_sequential_chain"

System Info

python: 3.11 langchain: latest

Related Components

dosubot[bot] commented 3 months ago

🤖

It's great that you've identified the issue and are considering modifying the base.py file to address it. Your proposed solution sounds like a step in the right direction. If you decide to make the modification, feel free to create a pull request with your changes. Your contribution would be valuable to the community. Keep up the good work!


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jyoti194 commented 3 months ago

@dosubot

what about in the chatbot which is connected to the db, when i'm asking question like give UAN number which is not present in the db, instead its fetching euid number, if some particular data is not there it should written as invalid question, it should not execute a wrong query. how to do that

dosubot[bot] commented 3 months ago

🤖

Based on the information you've provided, it seems that the current logic in the langchain_experimental.sql.base.py file does not explicitly handle the case where a query does not return a result. Instead, it raises an exception if an error occurs during the execution of the SQL query.

To achieve your desired behavior, you could modify the _call method of the SQLDatabaseChain class to check if the result of the query is None or an empty set, and if so, return a specific message like "invalid question". Here's a simple example of how you could implement this:

try:
    ...
    result = self.database.run(sql_cmd)
    if not result:
        return "invalid question"
    ...
except Exception as exc:
    # Append intermediate steps to exception, to aid in logging and later
    # improvement of few shot prompt seeds
    exc.intermediate_steps = intermediate_steps  # type: ignore
    raise exc

This code will return "invalid question" if the run method returns None or an empty set, indicating that the query did not return a result.

Please note that this is a simple example and may not cover all edge cases. You might need to adjust the code to fit your specific use case and handle other potential issues that could arise.

As for the langchain.sql_database file, I wasn't able to find specific information about how it handles queries that do not return a result. If you could provide more details about the logic implemented in this file, I might be able to give a more precise answer.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jyoti194 commented 3 months ago

@dosubot but here its not returning the none, instead it executing the query taking other parameters and giving an incorrect answer

dosubot[bot] commented 3 months ago

🤖

Based on the context provided, it seems like you want to modify the SQLDatabaseChain class to return an invalid question message when the requested data isn't present in the database. Here's how you can achieve this:

Firstly, you need to add a check after the SQL query is executed to see if the result is empty. If it is, you can return a custom message indicating that the requested data isn't present in the database. This can be done by modifying the _call method of the SQLDatabaseChain class.

Here's the modified _call method:

def _call(
        self,
        inputs: Dict[str, Any],
        run_manager: Optional[CallbackManagerForChainRun] = None,
    ) -> Dict[str, Any]:
        # ... existing code ...

        # Execute the SQL command
        result = self.database.run(sql_cmd)

        # Check if the result is empty
        if not result:
            return {self.output_key: "Invalid question: Requested data isn't present in the database"}

        # ... existing code ...

This modification checks if the result is empty after the SQL command is executed. If it is, it returns a dictionary with the output_key and a custom message indicating that the requested data isn't present in the database. This will prevent the method from returning an EUID number when a non-existent UAN number query is made.

You can find the SQLDatabaseChain class in the langchain_experimental.sql.base.py file in the LangChain repository. The class is a subclass of the Chain class and is designed to interact with SQL databases. It includes several attributes and methods, including the _call method that you need to modify.

Please note that this modification should be done carefully to avoid breaking other parts of the code that rely on the SQLDatabaseChain class. It's also recommended to write tests for this new behavior to ensure it works as expected.

You can refer to the LangChain repository for more details about the SQLDatabaseChain class and its methods.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jyoti194 commented 3 months ago

@dosubot how can i change the call function to stope it from executing the query when the particular data is not present, or do i need to modify the prompt.py or vector_sql.py files in langchain_experimental.sql, how to tell model it should only fetch the relevant data not the similar looking data if some data is not present

dosubot[bot] commented 3 months ago

🤖

Based on the information you've provided, it seems like the issue lies in how the LangChain framework handles queries that do not return any results. From the context provided, it appears that LangChain does not automatically fetch similar data types when a specific query doesn't return any results. Instead, it returns a default message indicating that the answer is not known.

However, in your case, it's fetching an EUID number instead of a UAN number, which suggests that the query might be incorrectly formulated or the data handling process might be flawed.

To address this, you could modify the QuerySQLDataBaseTool class in the langchain.tools.sql_database.tool module. This class is responsible for executing SQL queries and returning the results. You could add a check here to verify if the query returned any results and if not, return an "invalid question" message.

Here's a rough example of how you could modify the QuerySQLDataBaseTool class:

class QuerySQLDataBaseTool(BaseTool):
    ...
    def call(self, *args, **kwargs):
        result = super().call(*args, **kwargs)
        if not result:
            return "Invalid question"
        return result

This is a simplified example and you might need to adjust it to fit your specific use case.

Also, please ensure that your SQL queries are correctly formulated and that they are specifically looking for UAN numbers. If the query is too broad or not specific enough, it might return incorrect results like EUID numbers.

I hope this helps! If you have any more questions or need further clarification, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jyoti194 commented 3 months ago

@dosubot this is the QuerySQLDataBaseTool in sql_database.tool.py, how to modify it to execute the correct query

class QuerySQLDataBaseTool(BaseSQLDatabaseTool, BaseTool): """Tool for querying a SQL database."""

name: str = "sql_db_query"
description: str = """
Input to this tool is a detailed and correct SQL query, output is a result from the database.
If the query is not correct, an error message will be returned.
If an error is returned, rewrite the query, check the query, and try again.
"""

def _run(
    self,
    query: str,
    run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
    """Execute the query, return the results or an error message."""
    return self.db.run_no_throw(query)
dosubot[bot] commented 3 months ago

🤖

Based on your description, it seems like you want to modify the QuerySQLDataBaseTool class to return an "invalid question" message when a specific UAN number is not found in the database. Here's how you can modify the _run method in the QuerySQLDataBaseTool class:

class QuerySQLDataBaseTool(BaseSQLDatabaseTool, BaseTool):
    """Tool for querying a SQL database."""

    name: str = "sql_db_query"
    description: str = """
    Input to this tool is a detailed and correct SQL query, output is a result from the database.
    If the query is not correct, an error message will be returned.
    If an error is returned, rewrite the query, check the query, and try again.
    """

    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Execute the query, return the results or an error message."""
        result = self.db.run_no_throw(query)
        if not result:
            return "Invalid question: UAN number not found in the database."
        return result

In this modification, self.db.run_no_throw(query) is expected to return None or an empty result when the UAN number is not found in the database. If it does, the method returns the "invalid question" message. If it returns a result, the result is returned as usual.

Please note that this modification assumes that self.db.run_no_throw(query) returns None or an empty result when the UAN number is not found. If it behaves differently, you might need to adjust the condition if not result: accordingly.

As for your question about modifying prompt.py or vector_sql.py in langchain_experimental.sql, it's hard to say without seeing the specific code in these files. However, the modification to QuerySQLDataBaseTool should be sufficient to handle the case where a UAN number is not found in the database.

If you continue to experience issues, please provide more details about the behavior of your chatbot and the specific code in prompt.py and vector_sql.py.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jyoti194 commented 3 months ago

@dosubot

no, it still taking eudi number

Entering new SQLDatabaseChain chain...

Given an input question, first create a syntactically correct mssql query to run,
then look at the results of the query and return the answer.
Return the answer in a sentence form.
Recheck the question to see the specific detailed asked and if the specific detail is not available in database, do not execuate the incorrect query to give output.
The question: what is the uan number for AD23010923

SQLQuery:SELECT [EmployeeID], [EmployeeName], [EmployeeEuid] FROM [EGV_emp_departments_ChatGPT] WHERE [EmployeeID] = 'AD23010923' SQLResult: [('AD23010923', 'Suprashwanthan Panneer Selvan', Decimal('7656'))] Answer:The UAN number for AD23010923 is 7656.

Finished chain. The UAN number for AD23010923 is 7656.

dosubot[bot] commented 3 months ago

🤖

It looks like you're encountering an issue with the LangChain framework. However, as the current version of LangChain (v0.1.3) doesn't support SQL database interactions directly, it seems that there might be some confusion or miscommunication.

If you have any questions or need assistance with the LangChain framework, feel free to ask and I'll do my best to help you out!


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jyoti194 commented 3 months ago

@dosubot how to resolve it

Entering new SQLDatabaseChain chain... SQLQuery

Given an input question, first create a syntactically correct mssql query to run,  
then look at the results of the query and return the answer.
While writing the query do not take the column name which is not present,
if column name is not present then return the result as Invalid column name ' '.
Return the answer in a sentence form.

The question: what is the uan number for employee id AD23010923

SQLQuery:SELECT [EmployeeID], [UAN Number] FROM [EGV_emp_departments_ChatGPT] WHERE [EmployeeID] = 'AD23010923'SELECT [EmployeeID], [UAN Number] FROM [EGV_emp_departments_ChatGPT] WHERE [EmployeeID] = 'AD23010923' Traceback (most recent call last): File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\base.py", line 1969, in _exec_single_context self.dialect.do_execute( File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\default.py", line 922, in do_execute cursor.execute(statement, parameters) pyodbc.ProgrammingError: ('42S22', "[42S22] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid column name 'UAN Number'. (207) (SQLExecDirectW)")

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "c:\Users\rndbcpsoft\OneDrive\Desktop\test\main7.py", line 340, in result= chat1("what is the uan number for employee id AD23010923") ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "c:\Users\rndbcpsoft\OneDrive\Desktop\test\main7.py", line 121, in chat1 answer = db_chain.run(PROMPT.format(question=question)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\langchain\chains\base.py", line 505, in run return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\langchain\chains\base.py", line 310, in call raise e File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\langchain\chains\base.py", line 304, in call self._call(inputs, run_manager=run_manager) File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\langchain_experimental\sql\base.py", line 214, in _call raise exc File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\langchain_experimental\sql\base.py", line 147, in _call result = self.database.run(sql_cmd) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\langchain\utilities\sql_database.py", line 433, in run result = self._execute(command, fetch) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\langchain\utilities\sql_database.py", line 411, in _execute cursor = connection.execute(text(command)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\base.py", line 1416, in execute return meth( ^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\sql\elements.py", line 516, in _execute_on_connection return connection._execute_clauseelement( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\base.py", line 1639, in _execute_clauseelement ret = self._execute_context( ^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\base.py", line 1848, in _execute_context return self._exec_single_context( ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\base.py", line 1988, in _exec_single_context self._handle_dbapi_exception( File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\base.py", line 2343, in _handle_dbapi_exception raise sqlalchemy_exception.with_traceback(exc_info[2]) from e File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\base.py", line 1969, in _exec_single_context self.dialect.do_execute( File "C:\Users\rndbcpsoft\OneDrive\Desktop\test\envtest\Lib\site-packages\sqlalchemy\engine\default.py", line 922, in do_execute cursor.execute(statement, parameters) sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42S22', "[42S22] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid column name 'UAN Number'. (207) (SQLExecDirectW)") [SQL: SELECT [EmployeeID], [UAN Number] FROM [EGV_emp_departments_ChatGPT] WHERE [EmployeeID] = 'AD23010923'] (Background on this error at: https://sqlalche.me/e/20/f405)

dosubot[bot] commented 3 months ago

🤖

Based on the context provided, it seems like you want to prevent the model from executing a query if a specific data (in this case, a UAN number) isn't present in the database. To achieve this, you can modify the _get_relevant_documents and _aget_relevant_documents methods in the SelfQueryRetriever class.

You can add a check to see if the structured_query contains the specific data before executing the query. If the UAN number is not present, you can return an empty list or a custom message indicating that the data is not present.

Here is how you can modify the _get_relevant_documents method:

def _get_relevant_documents(
    self, query: str, *, run_manager: CallbackManagerForRetrieverRun
) -> List[Document]:
    """Get documents relevant for a query.

    Args:
        query: string to find relevant documents for

    Returns:
        List of relevant documents
    """
    structured_query = self.query_constructor.invoke(
        {"query": query}, config={"callbacks": run_manager.get_child()}
    )
    if self.verbose:
        logger.info(f"Generated Query: {structured_query}")

    # Check if the UAN number is present in the structured_query
    if 'UAN' not in structured_query:
        return []  # or return a custom message

    new_query, search_kwargs = self._prepare_query(query, structured_query)
    docs = self._get_docs_with_query(new_query, search_kwargs)
    return docs

Similarly, you can modify the _aget_relevant_documents method:

async def _aget_relevant_documents(
    self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun
) -> List[Document]:
    """Get documents relevant for a query.

    Args:
        query: string to find relevant documents for
    Returns:
        List of relevant documents
    """
    structured_query = await self.query_constructor.ainvoke(
        {"query": query}, config={"callbacks": run_manager.get_child()}
    )
    if self.verbose:
        logger.info(f"Generated Query: {structured_query}")

    # Check if the UAN number is present in the structured_query
    if 'UAN' not in structured_query:
        return []  # or return a custom message

    new_query, search_kwargs = self._prepare_query(query, structured_query)
    docs = await self._aget_docs_with_query(new_query, search_kwargs)
    return docs

Please note that you need to replace 'UAN' with the actual key or method to check the presence of the UAN number in the structured_query.

You can find the relevant code in the libs/langchain/langchain/retrievers/self_query/base.py file.

Regarding the structure of the structured_query object, it is a representation of a structured query that can be translated into a valid Milvus query. It contains two main attributes: filter and query. The UAN number is not directly represented in the StructuredQuery object. However, it could be part of the query attribute or it could be used as a value in the filter attribute, depending on the specific use case.

As for the call function, it is not explicitly defined in the provided context. However, there is an asynchronous version of it, _acall, which is a method of an unspecified class. The _acall function is responsible for running the agent response. It takes two parameters: inputs which is a dictionary of input values, and run_manager which is an optional instance of AsyncCallbackManagerForChainRun.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.