patterns-ai-core / langchainrb

Build LLM-powered applications in Ruby
https://rubydoc.info/gems/langchainrb
MIT License

Tool Langchain::Tool::Database: Spaces in function call "database__describe_tables" arguments generate error #679

Open fguillen opened 3 days ago

fguillen commented 3 days ago

Describe the bug

When using Langchain::Tool::Database with the Postgres adapter, the model sometimes generates invalid arguments for the function "database__describe_tables".

For example:

        "tool_calls": [
          {
            "id": "call_pgyoQ7Dzz4MeSJ4AUEnuQpux",
            "type": "function",
            "function": {
              "name": "database__describe_tables",
              "arguments": "{\"tables\":\"dashboard_customernode, dashboard_invoiceline\"}"
            }
          }
        ]

Generates this error:

request_id: 5bc7b95e-5d4a-4230-b1f3-3ac6c7644f96} Rails -- Exception: Sequel::DatabaseError: PG::UndefinedTable: ERROR:  relation " dashboard_invoiceline" does not exist
LINE 1: ..."."attnum" > 0) AND ("pg_class"."oid" = CAST(CAST('" dashboa...
                                                             ^
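
The leading space in the relation name is consistent with the "tables" string being split on commas without trimming. The sketch below only illustrates that behaviour in plain Ruby. That the tool splits the string this way is an assumption inferred from the error message, not something confirmed from the gem's source here.

```ruby
# The "tables" argument exactly as the model returned it in the failing call.
tables = "dashboard_customernode, dashboard_invoiceline"

# Assumption: the tool splits on commas and does not trim whitespace.
names = tables.split(",")

# The second element keeps its leading space, which matches the
# relation name reported by PG::UndefinedTable.
p names # => ["dashboard_customernode", " dashboard_invoiceline"]
```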

If I add this system instruction:

When use "database__describe_tables" function name don't put space between the "tables" arguments

It is fixed:

       "tool_calls": [
          {
            "id": "call_ncdQLaeEGaVdOJ0d9fArk7Hb",
            "type": "function",
            "function": {
              "name": "database__describe_tables",
              "arguments": "{\"tables\":\"dashboard_customernode,dashboard_invoiceline\"}"
            }
          }
        ]

To Reproduce

As with everything related to LLMs, this is difficult to reproduce reliably.

Expected behavior

The Database Tool should produce valid function calls, which in this case means no spaces in the arguments list.

Additional context

Here is the full stack trace:

2024-06-26 07:36:40.872599 D [57454:puma srv tp 001] {request_id: 5bc7b95e-5d4a-4230-b1f3-3ac6c7644f96} Rails --   HTTP POST (1239.44ms)   https://api.openai.com:443/v1/chat/completions
2024-06-26 07:36:40.872662 D [57454:puma srv tp 001] {request_id: 5bc7b95e-5d4a-4230-b1f3-3ac6c7644f96} Rails --   Request body   {"messages":[{"role":"system","content":"Your name is Dashboard Assistant, and you have access to the dashboard database to answer user questions about products and sales.\n\nMake sure to answer any user questions related to sales and products using information from the database tables.\n\nYou are an SQL and PostgreSQL expert.\n\n## Constraints:\n\n- Understand the user's intention based on their question, and use the given table structures defined in the database to create a grammatically correct SQL query. If SQL is not required, answer the user's question directly.\n- Always limit the query to a maximum of 50 results unless the user specifies a different number of rows they wish to obtain.\n- Be careful not to mistake the relationship between tables and columns when generating SQL queries.\n- Check the correctness of the SQL and ensure that the query performance is optimized.\n- Use only tables and fields that you find in the database schema.\n- Use only SQL commands and functions valid for a PostgreSQL DB.\n- When building queries, ensure to prevent possible errors of \"division by zero.\"\n- Use only pure SQL commands, not Rails integration.\n- The field \"dashboard_invoiceline.margin\" is in euros, not percentage.\n- When showing a list of clients, always include the ID and the Name of the client, in addition to other relevant fields relative to the user's question.\n- When showing results, sort them in the best way relevant to the user's question."},{"role":"system","content":"Use only pure SQL commands, not Rails integration"},{"role":"user","content":"Dime clientes que tengan un porcentage de margen medio en el último mes inferior al 25%"},{"role":"assistant","content":"","tool_calls":[{"id":"call_CpAxJvNUamtbsfRvgDDGl9va","type":"function","function":{"name":"database__list_tables","arguments":"{}"}}]},{"role":"tool","content":"[:dashboard_categorynode, :dashboard_deliverynoteline, :dashboard_businessnode, :dashboard_customernode, :dashboard_customerbudget, :dashboard_invoiceline, :dashboard_latestproductcostuploadedfile, :dashboard_product, :dashboard_salespersonnode, :dashboard_familynode, :dashboard_invoicedivision, :dashboard_latestproductcost, :dashboard_lead, :dashboard_predictiveanalytic, :dashboard_supplier, :dashboard_saleoffer, :dashboard_ecomuser, :dashboard_salespersonbudget, :dashboard_specialprice, :dashboard_supplierbudget]","tool_call_id":"call_CpAxJvNUamtbsfRvgDDGl9va"}],"tools":[{"type":"function","function":{"name":"database__describe_tables","description":"Database Tool: Returns the schema for a list of tables","parameters":{"type":"object","properties":{"tables":{"type":"string","description":"The tables to describe."}},"required":["tables"]}}},{"type":"function","function":{"name":"database__list_tables","description":"Database Tool: Returns a list of tables in the database","parameters":{"type":"object","properties":{},"required":[]}}},{"type":"function","function":{"name":"database__execute","description":"Database Tool: Executes a SQL query and returns the results","parameters":{"type":"object","properties":{"input":{"type":"string","description":"SQL query to be executed"}},"required":["input"]}}}],"tool_choice":"auto","model":"gpt-4o","temperature":0.0,"n":1}
2024-06-26 07:36:40.872683 D [57454:puma srv tp 001] {request_id: 5bc7b95e-5d4a-4230-b1f3-3ac6c7644f96} Rails --   Response status   Net::HTTPOK (200)
2024-06-26 07:36:40.872703 D [57454:puma srv tp 001] {request_id: 5bc7b95e-5d4a-4230-b1f3-3ac6c7644f96} Rails --   Response body   {
  "id": "chatcmpl-9eFceNUqNNShPQM5ZIfCXKyTwvvVa",
  "object": "chat.completion",
  "created": 1719380200,
  "model": "gpt-4o-2024-05-13",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_pgyoQ7Dzz4MeSJ4AUEnuQpux",
            "type": "function",
            "function": {
              "name": "database__describe_tables",
              "arguments": "{\"tables\":\"dashboard_customernode, dashboard_invoiceline\"}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 558,
    "completion_tokens": 25,
    "total_tokens": 583
  },
  "system_fingerprint": "fp_d576307f90"
}

2024-06-26 07:36:40.901688 I [57454:puma srv tp 001] {request_id: 5bc7b95e-5d4a-4230-b1f3-3ac6c7644f96} (2.088s) Front::MessagesController -- Completed #create -- { :controller => "Front::MessagesController", :action => "create", :params => { "authenticity_token" => "[FILTERED]", "message" => { "content" => "Dime clientes que tengan un porcentage de margen medio en el último mes inferior al 25%", "role" => "user" }, "button" => "", "conversation_id" => "3MvAYhi23oG0kTys2LGJPB" }, :format => "TURBO_STREAM", :method => "POST", :path => "/front/conversations/3MvAYhi23oG0kTys2LGJPB/messages", :status => 500, :view_runtime => 0.0, :db_runtime => 3.12, :exception_object => #<Sequel::DatabaseError:"PG::UndefinedTable: ERROR:  relation \" dashboard_invoiceline\" does not exist\nLINE 1: ...\".\"attnum\" > 0) AND (\"pg_class\".\"oid\" = CAST(CAST('\" dashboa...\n                                                             ^\n">, :allocations => 13392, :status_message => "Internal Server Error" }
2024-06-26 07:36:40.902157 F [57454:puma srv tp 001 deprecators.rb:86] {request_id: 5bc7b95e-5d4a-4230-b1f3-3ac6c7644f96} Rails -- Exception: Sequel::DatabaseError: PG::UndefinedTable: ERROR:  relation " dashboard_invoiceline" does not exist
LINE 1: ..."."attnum" > 0) AND ("pg_class"."oid" = CAST(CAST('" dashboa...
                                                             ^

/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:171:in `exec'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:171:in `block in execute_query'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/database/logging.rb:38:in `log_connection_yield'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:171:in `execute_query'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:159:in `block in execute'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:136:in `check_disconnect_errors'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:159:in `execute'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:532:in `_execute'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:348:in `block (2 levels) in execute'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:555:in `check_database_errors'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:348:in `block in execute'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/connection_pool/threaded.rb:92:in `hold'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/database/connecting.rb:293:in `synchronize'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:348:in `execute'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/dataset/actions.rb:1189:in `execute'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/postgres.rb:651:in `fetch_rows'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/dataset/actions.rb:164:in `each'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/dataset/actions.rb:51:in `block in all'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/dataset/actions.rb:1097:in `_all'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/dataset/actions.rb:51:in `all'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/dataset/actions.rb:977:in `where_all'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/adapters/shared/postgres.rb:1704:in `schema_parse_table'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/sequel-5.81.0/lib/sequel/database/query.rb:159:in `schema'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/langchainrb-0.13.4/lib/langchain/tool/database/database.rb:64:in `describe_table'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/langchainrb-0.13.4/lib/langchain/tool/database/database.rb:45:in `block in describe_tables'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/langchainrb-0.13.4/lib/langchain/tool/database/database.rb:44:in `each'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/langchainrb-0.13.4/lib/langchain/tool/database/database.rb:44:in `describe_tables'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/langchainrb-0.13.4/lib/langchain/assistants/assistant.rb:231:in `block in run_tools'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/langchainrb-0.13.4/lib/langchain/assistants/assistant.rb:218:in `each'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/langchainrb-0.13.4/lib/langchain/assistants/assistant.rb:218:in `run_tools'
/Users/me/.rbenv/versions/3.2.4/lib/ruby/gems/3.2.0/gems/langchainrb-0.13.4/lib/langchain/assistants/assistant.rb:94:in `run'
/Users/me/Development/DashboardChatbot/app/services/conversation/process_user_message_service.rb:11:in `perform'
/Users/me/Development/DashboardChatbot/app/services/service.rb:4:in `perform'

fguillen commented 1 day ago

Sadly, the model randomly ignores the instruction:

When use "database__describe_tables" function name don't put space between the "tables" arguments

so the error appears and disappears. I cannot work with this right now. Checking for workarounds.
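
One possible workaround, sketched here under assumptions, is to normalize the arguments string before it reaches the tool rather than relying on the model to follow the instruction. The helper name `normalize_tables_argument` is hypothetical and not part of langchainrb, and where such a step would hook into the assistant is not shown in this thread.

```ruby
require "json"

# Hypothetical helper (not part of langchainrb): takes the raw JSON
# "arguments" string from a tool call and removes whitespace around
# each comma-separated table name.
def normalize_tables_argument(arguments_json)
  args = JSON.parse(arguments_json)
  if args["tables"].is_a?(String)
    args["tables"] = args["tables"]
      .split(",")
      .map(&:strip)       # drop leading/trailing whitespace per name
      .reject(&:empty?)   # ignore empty entries such as a trailing comma
      .join(",")
  end
  JSON.generate(args)
end

# The arguments string from the failing call in this issue.
raw = "{\"tables\":\"dashboard_customernode, dashboard_invoiceline\"}"
puts normalize_tables_argument(raw)
# prints {"tables":"dashboard_customernode,dashboard_invoiceline"}
```

This makes the outcome independent of whether the model happens to emit a space, which the system instruction alone could not guarantee.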

andreibondarev commented 1 day ago

@fguillen I actually think that we can improve the LLM outputs by changing this parameter description to something like: "Comma-separated list of tables to describe. No spaces."

Could you please try that?
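
For reference, this is roughly what the tool definition sent to the API would look like with the suggested description. It is a sketch that rebuilds the "database__describe_tables" entry from the logged request body as a plain Ruby hash. How langchainrb stores or loads this schema internally is not shown in this thread, so the variable name and structure here are illustrative only.

```ruby
require "json"

# Rebuilt from the "tools" entry in the logged request body; only the
# "tables" description differs from what was logged.
describe_tables_schema = {
  type: "function",
  function: {
    name: "database__describe_tables",
    description: "Database Tool: Returns the schema for a list of tables",
    parameters: {
      type: "object",
      properties: {
        tables: {
          type: "string",
          # Suggested wording from the comment above.
          description: "Comma-separated list of tables to describe. No spaces."
        }
      },
      required: ["tables"]
    }
  }
}

puts JSON.pretty_generate(describe_tables_schema)
```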