mindsdb / mindsdb

The platform for building AI from enterprise data
https://mindsdb.com

[Bug]: Mongo API fails running predictor #3297

Closed akhildevelops closed 2 years ago

akhildevelops commented 2 years ago

Current Behavior

The Mongo API method `db.predictors.insert` fails when creating a predictor on a dataset.

Below is the error:

MongoBulkWriteError: (sqlite3.InterfaceError) Error binding parameter 11 - probably unsupported type.

Expected Behavior

The predictor should be created and run without errors.

Steps To Reproduce

- Spin up a local or cloud MindsDB instance
- Run a MongoDB instance (via Docker or MongoDB Atlas) and load the data from https://www.kaggle.com/datasets/arashnic/learn-time-series-forecasting-from-gold-price into the `gold_data` collection of the `public` database
- Using the mongo shell, connect to MindsDB: `mongosh --host <host_name> -u <user_name> -p <password>`
- Switch to the `mindsdb` database: `use mindsdb`
- Create a database entry that holds the connection details of the MongoDB instance: `db.databases.insertOne({name:"mongo_data",engine:"mongodb",connection_args:{"port":27017,"host":"mongodb://localhost","database":"public"}})`
- Create the predictor: `db.predictors.insert({name:"predict_material",predict:"Value",connection:"mongo_data",select_data_query:{"collection":"gold_data","call":[{"method":"find","args":[]}]}})`

Anything else?

No response

akhildevelops commented 2 years ago

I'll work on this

akhildevelops commented 2 years ago

The error occurs because the `select_data_query` field sent to `db.predictors.insert` arrives as an ordered dict (`OrderedDict`), which SQLite cannot bind as a query parameter.

Refer to: https://docs.mindsdb.com/mongo/insert/
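
For illustration, the failure can be reproduced outside MindsDB with a minimal sketch using only the Python standard library. The one-column `predictor` table and the `try_insert` helper are hypothetical stand-ins, not MindsDB code; the `OrderedDict` mirrors the value shown for `fetch_data_query` in the error's parameter list. Serializing the value to JSON text is one possible way to make it bindable, not necessarily the fix applied in MindsDB.

```python
import json
import sqlite3
from collections import OrderedDict

# Same shape as the select_data_query value sent through the Mongo API.
query = OrderedDict([
    ("collection", "gold_data"),
    ("call", [OrderedDict([("method", "find"), ("args", [])])]),
])

conn = sqlite3.connect(":memory:")
# Hypothetical one-column stand-in for the real predictor table.
conn.execute("CREATE TABLE predictor (fetch_data_query TEXT)")


def try_insert(value):
    """Return None on success, or the sqlite3 error raised while binding."""
    try:
        conn.execute(
            "INSERT INTO predictor (fetch_data_query) VALUES (?)", (value,)
        )
        return None
    except sqlite3.Error as exc:
        # InterfaceError or ProgrammingError, depending on the Python version.
        return exc


raw_error = try_insert(query)               # binding the mapping itself fails
json_error = try_insert(json.dumps(query))  # binding its JSON text succeeds

stored = conn.execute("SELECT fetch_data_query FROM predictor").fetchone()[0]
print(type(raw_error).__name__, raw_error)
print(stored)
```

SQLite parameters can only be `None`, integers, floats, strings, or bytes unless an adapter is registered, which is why the mapping is rejected while its JSON string is accepted.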

akhildevelops commented 2 years ago

Hi @ZoranPandovski, can you please label this issue as part of Hacktoberfest?

ea-rus commented 2 years ago

@Enforcer007, could you show the Mongo query that raises the error described in this issue?

akhildevelops commented 2 years ago

This is the error @ea-rus :

mindsdb> db.predictors.insert({name:"gold_data_predict_v8",predict:["Value"],connection:"mongo_local_v3",select_data_query:{"collection":"gold_data","call":[{"method":"find","args":[{},{"_id":0}]}]}})
Uncaught:
MongoBulkWriteError: (sqlite3.InterfaceError) Error binding parameter 11 - probably unsupported type.
[SQL: INSERT INTO predictor (updated_at, created_at, deleted_at, name, data, to_predict, company_id, mindsdb_version, native_version, integration_id, data_integration_id, fetch_data_query, is_custom, learn_args, update_status, status, active, training_data_columns_count, training_data_rows_count, training_start_at, training_stop_at, code, lightwood_version, dtype_dict) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: ('2022-10-11 22:31:21.798031', '2022-10-11 22:31:21.798035', None, 'gold_data_predict_v8', '{"name": "gold_data_predict_v8"}', 'Value', None, '22.9.5.4', None, 1, 9, OrderedDict([('collection', 'gold_data'), ('call', [OrderedDict([('method', 'find'), ('args', [OrderedDict(), OrderedDict([('_id', 0)])])])])]), None, '{"target": "Value", "pct_invalid": 2, "unbias_target": true, "seconds_per_mixer": null, "seconds_per_encoder": null, "expected_additional_time": null ... (333 characters truncated) ... eriods": []}, "anomaly_detection": false, "use_default_analysis": true, "ignore_features": [], "fit_on_all": true, "strict_mode": true, "seed_nr": 1}', 'up_to_date', 'generating', 1, 2, 10787, '2022-10-11 22:31:21.797777', None, None, '22.9.1.0', None)]
(Background on this error at: https://sqlalche.me/e/14/rvf5)
Result: BulkWriteResult {
  result: {
    ok: 1,
    writeErrors: [
      WriteError {
        err: {
          index: 0,
          code: 0,
          errmsg: '(sqlite3.InterfaceError) Error binding parameter 11 - probably unsupported type.\n' +
            '[SQL: INSERT INTO predictor (updated_at, created_at, deleted_at, name, data, to_predict, company_id, mindsdb_version, native_version, integration_id, data_integration_id, fetch_data_query, is_custom, learn_args, update_status, status, active, training_data_columns_count, training_data_rows_count, training_start_at, training_stop_at, code, lightwood_version, dtype_dict) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]\n' +
            `[parameters: ('2022-10-11 22:31:21.798031', '2022-10-11 22:31:21.798035', None, 'gold_data_predict_v8', '{"name": "gold_data_predict_v8"}', 'Value', None, '22.9.5.4', None, 1, 9, OrderedDict([('collection', 'gold_data'), ('call', [OrderedDict([('method', 'find'), ('args', [OrderedDict(), OrderedDict([('_id', 0)])])])])]), None, '{"target": "Value", "pct_invalid": 2, "unbias_target": true, "seconds_per_mixer": null, "seconds_per_encoder": null, "expected_additional_time": null ... (333 characters truncated) ... eriods": []}, "anomaly_detection": false, "use_default_analysis": true, "ignore_features": [], "fit_on_all": true, "strict_mode": true, "seed_nr": 1}', 'up_to_date', 'generating', 1, 2, 10787, '2022-10-11 22:31:21.797777', None, None, '22.9.1.0', None)]\n` +
            '(Background on this error at: https://sqlalche.me/e/14/rvf5)',
          errInfo: undefined,
          op: {
            name: 'gold_data_predict_v8',
            predict: [ 'Value' ],
            connection: 'mongo_local_v3',
            select_data_query: { collection: 'gold_data', call: [Array] },
            _id: ObjectId("6345a161510bf2ba2548799b")
          }
        }
      }
    ],
    writeConcernErrors: [],
    insertedIds: [ { index: 0, _id: ObjectId("6345a161510bf2ba2548799b") } ],
    nInserted: 0,
    nUpserted: 0,
    nMatched: 0,
    nModified: 0,
    nRemoved: 0,
    upserted: []
  }
}

ea-rus commented 2 years ago

@ZoranPandovski As far as I remember, I removed this bulky way of passing select_data_query a long time ago:

    {
        "collection": "house_sales",
        "call": [{
            "method": "find",
            "args": []
        }]
    }

But it is still in the docs: https://docs.mindsdb.com/mongo/insert/

The right way now is to pass just a raw query string for the destination database, for example `db.house_sales.find({})`, as described in the Mongo API README: https://github.com/mindsdb/mindsdb/blob/staging/mindsdb/api/mongo/README.md#creating-predictor
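
To make the difference between the two forms concrete, here is a small sketch that renders the legacy `collection`/`call` structure as a mongo-shell style string. The `to_raw_query` helper is hypothetical and only illustrates the mapping; the exact string format MindsDB accepts is an assumption here, so the linked README remains the reference.

```python
import json


def to_raw_query(select_data_query):
    """Render the legacy {"collection", "call"} form as a shell-style string."""
    parts = ["db", select_data_query["collection"]]
    for call in select_data_query["call"]:
        # Each argument is written as JSON text, e.g. {} or {"_id": 0}.
        args = ", ".join(json.dumps(arg) for arg in call.get("args", []))
        parts.append(f"{call['method']}({args})")
    return ".".join(parts)


legacy = {
    "collection": "house_sales",
    "call": [{"method": "find", "args": [{}]}],
}
raw = to_raw_query(legacy)
print(raw)
```

The resulting string would then be passed as the value of `select_data_query` in `db.predictors.insert` instead of the nested document.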