Dataherald / dataherald

Interact with your SQL database, Natural Language to SQL using LLMs
https://dataherald.readthedocs.io/en/latest/
Apache License 2.0

bson.errors.InvalidDocument: cannot encode object: Decimal #130

Closed rmaroun closed 1 year ago

rmaroun commented 1 year ago

When I run a natural-language question against my dataset, the agent returns the following result:

Action: sql_db_query
2023-08-28 17:55:37 Observation: [(Decimal('4867.6282051282051282'),)]
2023-08-28 17:55:38 Thought:The highest cost per kilogram is 4867.63.
2023-08-28 17:55:38 Final Answer: 4867.63

I am then getting the following error:

bson.errors.InvalidDocument: cannot encode object: Decimal('4867.6282051282051282'), of type: <class 'decimal.Decimal'>

I looked in simple_evaluator and can see that run_result is equal to [(Decimal('4867.6282051282051282'),)]. That value seems to be sent to MongoDB for storage, which is when the error shows up.

@MohammadrezaPourreza any idea on how to fix this?

MohammadrezaPourreza commented 1 year ago

Hello @rmaroun, thank you for flagging this issue. Could you specify which endpoint is triggering this error?

Based on the information you've provided, it seems that the exception occurs when generated_answer is inserted into the NLQueryResponseRepository. The BSON encoding library used by MongoDB doesn't support direct encoding of decimal.Decimal objects. One possible solution is to cast each Decimal value to a float before inserting it into MongoDB.
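To illustrate the cast suggested above, here is a minimal sketch that walks a nested structure and replaces every decimal.Decimal with a float before the document is handed to MongoDB. The helper name decimals_to_floats and the sample document are hypothetical and are not part of the Dataherald codebase; the sketch uses only the standard library.

```python
from decimal import Decimal
from typing import Any


def decimals_to_floats(value: Any) -> Any:
    """Return a copy of `value` with every Decimal replaced by a float."""
    if isinstance(value, Decimal):
        return float(value)
    if isinstance(value, dict):
        return {key: decimals_to_floats(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples come back as lists; BSON stores both as arrays anyway.
        return [decimals_to_floats(item) for item in value]
    return value


# Hypothetical document shaped like the query result in this issue.
document = {"rows": [{"cost_per_kg": Decimal("4867.6282051282051282")}]}
cleaned = decimals_to_floats(document)
```

Note that a float keeps only about 15 to 17 significant digits, so the trailing digits of a value like the one above are lost. If exact decimals matter, wrapping the value in bson.decimal128.Decimal128 from PyMongo is an alternative worth considering.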

rmaroun commented 1 year ago

Hello @MohammadrezaPourreza, you are right: the error occurred when generated_answer was being persisted in MongoDB.

The following code solved it:

from decimal import Decimal

# Walk every cell of the SQL result and convert Decimal values in place
for col in generated_answer.sql_query_result.columns:
    for row in generated_answer.sql_query_result.rows:
        if isinstance(row[col], Decimal):
            # Explicitly convert to float to ensure serialization
            row[col] = float(row[col])

Should I submit a PR for it?

MohammadrezaPourreza commented 1 year ago

@rmaroun Thank you for identifying and resolving this issue. Please consider submitting a PR. Your contribution to our codebase is greatly valued and appreciated.

rmaroun commented 1 year ago

@MohammadrezaPourreza PR submitted: https://github.com/Dataherald/dataherald/pull/133