langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
93.09k stars 14.97k forks source link

SQLDatabaseChain - Questions/error #6870

Closed sofgun1 closed 4 months ago

sofgun1 commented 1 year ago

System Info

While trying to run example from https://python.langchain.com/docs/modules/chains/popular/sqlite I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[42], line 1
----> 1 db_chain.run("What is the expected assisted goals by Erling Haaland")

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:273, in Chain.run(self, callbacks, tags, *args, **kwargs)
    271     if len(args) != 1:
    272         raise ValueError("`run` supports only one positional argument.")
--> 273     return self(args[0], callbacks=callbacks, tags=tags)[_output_key]
    275 if kwargs and not args:
    276     return self(kwargs, callbacks=callbacks, tags=tags)[_output_key]

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:149, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, include_run_info)
    147 except (KeyboardInterrupt, Exception) as e:
    148     run_manager.on_chain_error(e)
--> 149     raise e
    150 run_manager.on_chain_end(outputs)
    151 final_outputs: Dict[str, Any] = self.prep_outputs(
    152     inputs, outputs, return_only_outputs
    153 )

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:143, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, include_run_info)
    137 run_manager = callback_manager.on_chain_start(
    138     dumpd(self),
    139     inputs,
    140 )
    141 try:
    142     outputs = (
--> 143         self._call(inputs, run_manager=run_manager)
    144         if new_arg_supported
    145         else self._call(inputs)
    146     )
    147 except (KeyboardInterrupt, Exception) as e:
    148     run_manager.on_chain_error(e)

File ~/.local/lib/python3.10/site-packages/langchain/chains/sql_database/base.py:105, in SQLDatabaseChain._call(self, inputs, run_manager)
    103 # If not present, then defaults to None which is all tables.
    104 table_names_to_use = inputs.get("table_names_to_use")
--> 105 table_info = self.database.get_table_info(table_names=table_names_to_use)
    106 llm_inputs = {
    107     "input": input_text,
    108     "top_k": str(self.top_k),
   (...)
    111     "stop": ["\nSQLResult:"],
    112 }
    113 intermediate_steps: List = []

File ~/.local/lib/python3.10/site-packages/langchain/sql_database.py:289, in SQLDatabase.get_table_info(self, table_names)
    287     table_info += f"\n{self._get_table_indexes(table)}\n"
    288 if self._sample_rows_in_table_info:
--> 289     table_info += f"\n{self._get_sample_rows(table)}\n"
    290 if has_extra_info:
    291     table_info += "*/"

File ~/.local/lib/python3.10/site-packages/langchain/sql_database.py:311, in SQLDatabase._get_sample_rows(self, table)
    308 try:
    309     # get the sample rows
    310     with self._engine.connect() as connection:
--> 311         sample_rows_result = connection.execute(command)  # type: ignore
    312         # shorten values in the sample rows
    313         sample_rows = list(
    314             map(lambda ls: [str(i)[:100] for i in ls], sample_rows_result)
    315         )

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1413, in Connection.execute(self, statement, parameters, execution_options)
   1411     raise exc.ObjectNotExecutableError(statement) from err
   1412 else:
-> 1413     return meth(
   1414         self,
   1415         distilled_parameters,
   1416         execution_options or NO_OPTIONS,
   1417     )

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:483, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options)
    481     if TYPE_CHECKING:
    482         assert isinstance(self, Executable)
--> 483     return connection._execute_clauseelement(
    484         self, distilled_params, execution_options
    485     )
    486 else:
    487     raise exc.ObjectNotExecutableError(self)

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1629, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options)
   1621 schema_translate_map = execution_options.get(
   1622     "schema_translate_map", None
   1623 )
   1625 compiled_cache: Optional[CompiledCacheType] = execution_options.get(
   1626     "compiled_cache", self.engine._compiled_cache
   1627 )
-> 1629 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
   1630     dialect=dialect,
   1631     compiled_cache=compiled_cache,
   1632     column_keys=keys,
   1633     for_executemany=for_executemany,
   1634     schema_translate_map=schema_translate_map,
   1635     linting=self.dialect.compiler_linting | compiler.WARN_LINTING,
   1636 )
   1637 ret = self._execute_context(
   1638     dialect,
   1639     dialect.execution_ctx_cls._init_compiled,
   (...)
   1647     cache_hit=cache_hit,
   1648 )
   1649 if has_events:

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:684, in ClauseElement._compile_w_cache(self, dialect, compiled_cache, column_keys, for_executemany, schema_translate_map, **kw)
    682 else:
    683     extracted_params = None
--> 684     compiled_sql = self._compiler(
    685         dialect,
    686         cache_key=elem_cache_key,
    687         column_keys=column_keys,
    688         for_executemany=for_executemany,
    689         schema_translate_map=schema_translate_map,
    690         **kw,
    691     )
    693     if not dialect._supports_statement_cache:
    694         cache_hit = dialect.NO_DIALECT_SUPPORT

File /opt/conda/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:288, in CompilerElement._compiler(self, dialect, **kw)
    286 if TYPE_CHECKING:
    287     assert isinstance(self, ClauseElement)
--> 288 return dialect.statement_compiler(dialect, self, **kw)

File ~/.local/lib/python3.10/site-packages/pybigquery/sqlalchemy_bigquery.py:137, in BigQueryCompiler.__init__(self, dialect, statement, column_keys, inline, **kwargs)
    135 if isinstance(statement, Column):
    136     kwargs['compile_kwargs'] = util.immutabledict({'include_table': False})
--> 137 super(BigQueryCompiler, self).__init__(dialect, statement, column_keys, inline, **kwargs)

TypeError: SQLCompiler.__init__() got multiple values for argument 'cache_key'

I use a VertexAI model (MODEL_TEXT_BISON_001) as the LLM.

Some essential library versions: langchain == 0.0.206 SQLAlchemy == 2.0.11 ipython == 8.12.1 python == 3.10.10 google-cloud-bigquery == 3.10.0 google-cloud-bigquery-storage == 2.16.2

Who can help?

@hwchase17

Information

Related Components

Reproduction

db = SQLDatabase.from_uri(f"bigquery://{project_id}/{dataset}") toolkit = SQLDatabaseToolkit(llm=llm, db=db) db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True) db_chain.run("What is the sport that generates highest revenue")`

Expected behavior

When running the db_chain, I expect to get an answer from the bigquery database

dosubot[bot] commented 1 year ago

Answer generated by a 🤖

Answer

The error you're encountering is due to a conflict between the versions of SQLAlchemy and google-cloud-bigquery libraries you're using. The error message TypeError: SQLCompiler.__init__() got multiple values for argument 'cache_key' indicates that the cache_key argument is being passed multiple times when initializing the SQLCompiler, which is not allowed. This is likely due to changes in the SQLAlchemy library's API in version 2.0.11 that are not compatible with the version of google-cloud-bigquery you're using.

To resolve this issue, you can try downgrading your SQLAlchemy library to a version that is compatible with google-cloud-bigquery. For example, you can try downgrading to SQLAlchemy version 1.4.22. You can do this by running the following command in your terminal:

pip install SQLAlchemy==1.4.22

After downgrading SQLAlchemy, try running your script again. If the problem persists, you may need to adjust the versions of your other libraries to ensure compatibility. Please note that the version 1.4.22 is just an example, you might need to find the exact version that is compatible with your google-cloud-bigquery version. You can find this information in the google-cloud-bigquery's documentation or you can try different versions until you find one that works. [docs/snippets/modules/chains/popular/sqlite.mdx, docs/extras/guides/evaluation/sql_qa_benchmarking_chinook.ipynb, tests/integration_tests/chains/test_sql_database.py, langchain/chains/sql_database/base.py, docs/extras/ecosystem/integrations/databricks/databricks.ipynb]

Sources


This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

zainansarii commented 1 year ago

Hi, did you manage to resolve this issue? I'm having a similar problem

mitbal commented 1 year ago

Same here, I encountered similar error and cannot find any solution

abdulmateen59 commented 1 year ago

Downgrading "sqlalchemy-bigquery" to 1.7.0 gets it to work, but with the latest version (1.8.0) the same problem occurs.

pip install sqlalchemy-bigquery==1.7.0
dosubot[bot] commented 10 months ago

Hi, @sofgun1! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you encountered a TypeError when trying to run an example from the SQLDatabaseChain module. It seems that there was a conflict between the versions of SQLAlchemy and google-cloud-bigquery libraries being used. @abdulmateen59 suggested downgrading "sqlalchemy-bigquery" to version 1.7.0 as a workaround for this issue.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project!

Krithika-Devi commented 10 months ago

I'm facing the same problem. Using 1.7.0 version for "sqlalchemy-bigquery" is not working and still same persists.

Any alternate solution will be helpful.

AmineDjeghri commented 9 months ago

i'm facing the same problem

AmineDjeghri commented 9 months ago

@Krithika-Devi did you solve this ?

AmineDjeghri commented 9 months ago

Solved with :

   # replace "x" with your project info
    creds_file = "creds.json"
    project = "x"
    dataset= "x"
    sqlalchemy_url = f"bigquery://{project}/{dataset}?credentials_path={creds_file}"
    db = SQLDatabase.from_uri(sqlalchemy_url)

    # your code

    toolkit = SQLDatabaseToolkit(db=db, llm=langchain_model.llm)
        input = f"""
      I want to know the 5 top results of NUMCLI :
      - table name : {table_name}

    """
    agent_executor = create_sql_agent(
        llm=llm,
        toolkit=toolkit,
        verbose=True,
        top_k=1000,
        handle_parsing_errors=True,
    )
    agent_executor.run(input)

dependencies :

mfernandezsidn commented 8 months ago

Any solution to this problem?

AmineDjeghri commented 8 months ago

Any solution to this problem?

@mfernandezsidn You can try to change the version of your dependencies :

Sqlalchemy 1.4.11 Sqalchemy-bigquery 1.9.0 Sqlitedict 2.1.0 Langchain 0.0.352 Google cloud bigquery 3.14.1 Google cloud bigquery storage 2.24.0

I also had a problem because of two sqlachemy libraries . Langchain was using the wrong one, so I had to delete it. Make sure that langchain is calling the right library

mfernandezsidn commented 8 months ago

@AmineDjeghri is it important the langchain version? I need that version because of compatibilities with another libraries.

My LLM is being correctly executed (TextGenerationModel.from_pretrained("text-bison-32k"))

` agent_executor = create_sql_agent( llm=llm, toolkit=toolkit, verbose=True, top_k=1000, handle_parsing_errors=True, )

output = agent_executor.run(final_prompt)`

When the last line finishes executing, it returns the error of the cache. Any idea?

AmineDjeghri commented 8 months ago

Can you post your error ?

mfernandezsidn commented 8 months ago

@AmineDjeghri this is the error trace:

Entering new AgentExecutor chain... Se esta ejecutando el call 1 Se esta ejecutando el call 2 Question: What are the top 5 pages with the most impressions on the website in the last 30 days? Thought: To answer this question, I need to find the pages with the most impressions in the last 30 days. I can do this by querying the searchconsole.searchdata_site_impression table and filtering for the appropriate date range. I will also order the results by the number of impressions in descending order and limit the results to the top 5. Action: sql_db_query Action Input: sql SELECT page, SUM(impressions) AS total_impressions FROM `searchconsole.searchdata_site_impression` WHERE DATE BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND CURRENT_DATE() GROUP BY page ORDER BY total_impressions DESC LIMIT 5; C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_community\utilities\sql_database.py:413: SAWarning: Dialect bigquery:bigquery will not make use of SQL compilation caching as it does not set the 'supports_statement_cache' attribute to True. This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions. Dialect maintainers should seek to set this attribute to True after appropriate development and testing for SQLAlchemy 1.4 caching support. Alternatively, this attribute may be set to False which will disable this warning. (Background on this warning at: https://sqlalche.me/e/20/cprf) cursor = connection.execute(text(command)) Traceback (most recent call last): File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\gradio\routes.py", line 534, in predict output = await route_utils.call_process_api( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\gradio\route_utils.py", line 226, in call_process_api output = await app.get_blocks().process_api( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\gradio\blocks.py", line 1550, in process_api result = await self.call_function( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\gradio\blocks.py", line 1185, in call_function prediction = await anyio.to_thread.run_sync( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync return await get_async_backend().run_sync_in_worker_thread( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\anyio_backends_asyncio.py", line 2134, in run_sync_in_worker_thread
return await future ^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\anyio_backends_asyncio.py", line 851, in run result = context.run(func, args) ^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\gradio\utils.py", line 661, in wrapper response = f(args, kwargs) ^^^^^^^^^^^^^^^^^^ File "c:\Users\my.user\Downloads\datalitica_dash\app\services\langchain\test.py", line 87, in bq_qna output = agent_executor.run(final_prompt) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_core_api\deprecation.py", line 145, in warning_emitting_wrapper return wrapped(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain\chains\base.py", line 538, in run return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_core_api\deprecation.py", line 145, in warning_emitting_wrapper return wrapped(args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain\chains\base.py", line 363, in call return self.invoke( ^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain\chains\base.py", line 162, in invoke raise e File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain\chains\base.py", line 156, in invoke self._call(inputs, run_manager=run_manager) File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain\agents\agent.py", line 1371, in _call next_step_output = self._take_next_step( ^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain\agents\agent.py", line 1097, in _take_next_step [ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain\agents\agent.py", line 1097, in [ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain\agents\agent.py", line 1193, in _iter_next_step observation = tool.run( ^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_core\tools.py", line 374, in run raise e File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_core\tools.py", line 346, in run self._run(*tool_args, run_manager=run_manager, tool_kwargs) File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_community\tools\sql_database\tool.py", line 43, in _run
return self.db.run_no_throw(query) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_community\utilities\sql_database.py", line 484, in run_no_throw return self.run(command, fetch, include_columns) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_community\utilities\sql_database.py", line 436, in run
result = self._execute(command, fetch) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\langchain_community\utilities\sql_database.py", line 413, in _execute
cursor = connection.execute(text(command)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\sqlalchemy\engine\base.py", line 1416, in execute return meth( ^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\sqlalchemy\sql\elements.py", line 517, in _execute_on_connection
return connection._execute_clauseelement( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\sqlalchemy\engine\base.py", line 1631, in _execute_clauseelement
compiled_sql, extracted_params, cache_hit = elem._compile_w_cache( ^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\sqlalchemy\sql\elements.py", line 718, in _compile_w_cache compiled_sql = self._compiler( ^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\sqlalchemy\sql\elements.py", line 317, in _compiler return dialect.statement_compiler(dialect, self,
kw) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\my.user\AppData\Local\anaconda3\envs\datalitica\Lib\site-packages\pybigquery\sqlalchemy_bigquery.py", line 137, in init super(BigQueryCompiler, self).init(dialect, statement, column_keys, inline, **kwargs) TypeError: SQLCompiler.init() got multiple values for argument 'cache_key'

These are the lines of code:

` llm = VertexLLM( model_name='text-bison-32k', max_output_tokens=256, temperature=0, top_p=0.8, top_k=40, verbose=True, )

table_names_list = list(table_names_options)

db = SQLDatabase.from_uri(f"bigquery://{project_id}") # Establish the database connection
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

vertexai.init(project=project_id, location="us-central1")

_googlesql_prompt = """Any Questions Asked needs to be first Translated to English and then believe that You are a GoogleSQL expert. Given an input question, first create a syntactically correct GoogleSQL query to run, then look at the results of the query and return the answer to the input question.
Unless the user specifies in the question a specific number of examples to obtain, query for at most {top_k} results using the LIMIT clause as per GoogleSQL. You can order the results to return the most informative data in the database.
Never query for all columns from a table. You must query only the columns that are needed to answer the question. Wrap each column name in backticks (`) to denote them as delimited identifiers.
Pay attention to use only the column names you can see in the tables below. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
Use the following format:
Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"
Only use the following tables:
{table_info}

If someone asks for aggregation on a STRING data type column, then CAST column as NUMERIC before you do the aggregation.

If someone asks for a specific month, use ActivityDate between the current month's start date and the current month's end date

If someone asks for column names in the table, use the following format:
SELECT column_name
FROM `{project_id}.{dataset_id}`.INFORMATION_SCHEMA.COLUMNS
WHERE table_name in {table_info}

Question: {input}"""

GOOGLESQL_PROMPT = PromptTemplate(
    input_variables=["input", "table_info", "top_k", "project_id", "dataset_id"],
    template=_googlesql_prompt,
)

final_prompt = GOOGLESQL_PROMPT.format(
    input=question, 
    project_id=project_id, 
    dataset_id=dataset_id, 
    table_info=table_names_list, 
    top_k=10000
)

agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True,
    top_k=1000,
    handle_parsing_errors=True,
)

output = agent_executor.run(final_prompt)

`

VertexLLM is a custom class:

` class VertexLLM(VertexCommon, LLM):
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str: print("Se esta ejecutando el call 1")

    """
        This function is to make predictions using the specified model.
    """
    return self._predict(prompt, stop)

`

And VertexCommon:

` class CustomMetaClass(ModelMetaclass): def new(mcs, name, bases, attrs): cls = type.new(mcs, name, bases, attrs) return cls

class VertexCommon(BaseModel, metaclass=CustomMetaClass): """ This class contains common properties and methods for Vertex AI models. """ client = TextGenerationModel.from_pretrained("text-bison-32k") model_name: str = "text-bison-32k" temperature: float = 0.2 top_p: int = 0.8 top_k: int = 40 max_output_tokens: int = 1024

@property
def _default_params(self):
    return {
        "temperature": self.temperature,
        "top_p": self.top_p,
        "top_k": self.top_k,
        "max_output_tokens": self.max_output_tokens
    }

def _predict(self, prompt: str, stop: Optional[List[str]]) -> str:
    """
        This function is to make predictions based on a given prompt.
    """
    res = self.client.predict(prompt, **self._default_params)
    print("Se esta ejecutando el call 2")
    return self._enforce_stop_words(res.text, stop)

def _enforce_stop_words(self, text: str, stop: Optional[List[str]]) -> str:
    """
        This function is to enforce stop words in the generated text.
    """
    if stop: 
        return enforce_stop_tokens(text, stop)
    return text

@property
def _llm_type(self) -> str:
    return "vertex_ai"

`

The code is basically from: https://medium.com/@rakeshmohandas/effortless-multilingual-data-querying-a-generative-ai-system-for-bigquery-nlqs-8be06ab18f5b

mfernandezsidn commented 8 months ago

The problem seems to be with the pybigquery but I don't see you are using it in the list of dependencies you gave me. I have removed the library but I do receive the error:

File "C:\Users\my.user\AppData\Local\anaconda3\envs\ddd\Lib\site-packages\sqlalchemy\util\langhelpers.py", line 374, in load raise exc.NoSuchModuleError( sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:bigquery

mfernandezsidn commented 8 months ago

That was the exact problem!! The pybigquery was messing everything!

AmineDjeghri commented 8 months ago

It's the same library that caused me a problem and i needed to remove. Glad you found the source of the problem

dosubot[bot] commented 8 months ago

🤖

It seems like you're encountering a TypeError when trying to run a query using the SQLDatabaseChain with a BigQuery database. The error message indicates that there's an issue with the SQLCompiler.__init__() method receiving multiple values for the argument 'cache_key'.

This error might be due to a compatibility issue between the versions of SQLAlchemy and google-cloud-bigquery you're using. The SQLAlchemy version you're using is 2.0.11, which is quite recent, and there might be some changes in this version that are not yet fully compatible with google-cloud-bigquery.

As a first step, I would recommend trying to downgrade your SQLAlchemy version to 1.4.x and see if the issue persists. You can do this by running the following command in your terminal:

pip install SQLAlchemy==1.4.22

If downgrading SQLAlchemy doesn't resolve the issue, you might also want to try downgrading google-cloud-bigquery to a version that is known to work well with SQLAlchemy 1.4.x, such as 2.6.2. You can do this by running:

pip install google-cloud-bigquery==2.6.2

Please try these steps and let me know if the issue persists. If it does, it would be helpful if you could provide more information about your environment, such as the operating system you're using and the version of Python. This will help us better understand the issue and provide a more accurate solution.

For more information about the compatibility between SQLAlchemy and google-cloud-bigquery, you can refer to the SQLAlchemy Changelog and the google-cloud-bigquery Changelog.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jonahatathenic commented 2 months ago

The problem seems to be with the pybigquery but I don't see you are using it in the list of dependencies you gave me. I have removed the library but I do receive the error:

File "C:\Users\my.user\AppData\Local\anaconda3\envs\ddd\Lib\site-packages\sqlalchemy\util\langhelpers.py", line 374, in load raise exc.NoSuchModuleError( sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:bigquery

I am getting the same error when I remove the pybigquery library did you find a solution to this?

go4xtremeanalytics commented 1 month ago

I also got the same issue and is resolved by lower python version. Below are my dependencies

Python: 3.8.19 pandas==2.0.3 numpy==1.24.4 streamlit==1.36.0 openai==1.33.0 langchain==0.1.20 langchain-openai==0.1.16 faiss-cpu==1.8.0 langchain_community==0.0.38 pymysql==1.1.1 sqlalchemy==2.0.30 pybigquery google-cloud-bigquery sqlalchemy.orm