INWTlab / dbrequests

python package built for easy use of raw SQL within python and pandas projects
MIT License
5 stars 2 forks source link

json and decimal data types #30

Closed Cattes closed 4 years ago

Cattes commented 4 years ago

During development using the mysql.database Database class we noticed, that writing to columns of type json and type decimal is not working correctly with missing values.

I tried to create a reproduceable example

In conftest.py I added:

    db.bulk_query("""
        CREATE TABLE `instruments` 
            (
                `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
                `membership` json DEFAULT NULL,
                `average` decimal(8,2) DEFAULT NULL,
                PRIMARY KEY (`id`)
            ) ENGINE=InnoDB AUTO_INCREMENT=37974 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
    """)
    db.bulk_query("""
            INSERT INTO instruments ( id, membership, average) VALUES
            ( 1, '{"key1": "val1", "2": "val2"}', 1.03 ),
            ( 2, '{"key3":"val3", 4: "val4"}', 0.03 ),
            ( 3, NULL, NULL );
        """)

I also added a test

    def test_update_json_and_decimal(self, db):
        df_update = pd.DataFrame({'id': [2, 3, 4, 5],
                                  'membership': [
                                      '{"BookClub": 1, "SportsClub": 1, "ClubClub": 1,}',
                                      '{"BookClub": 0, "SportsClub": 0.5, "ClubClub": 0}',
                                      '{"BookClub": null, "SportsClub": 1, "ClubClub": 2}',
                                      None],
                                  'average': [34.49, 34.51, 43.86, None]})

        db.send_data(df_update, 'instruments', mode='update')
        df_in_db = db.send_query("SELECT * FROM instruments ")

This returns a malspecified df_in_db

df_in_db
Out[35]: 
   id                                         membership average
0   1                      {"key1": "val1", "2": "val2"}    1.03
1   2   {"BookClub": 1, "SportsClub": 1, "ClubClub": 1,}   34.49
2   3  {"BookClub": 0, "SportsClub": 0.5, "ClubClub": 0}   34.51
3   4  {"BookClub": null, "SportsClub": 1, "ClubClub"...   43.86
4   5                                                       0.00

The problem is, that the default value and missing values are not associated correctly.

When running this code on the real database with restriced priviliges, the writing did not work at all. 1st because '' is not a valid input for the decim column. And the json column was not receiving any data, giving an error of the form 'Invalid JSON text: "Missing a name for object member." at position 1 In this example the writing did work, but the None values are not processed correctly.

For the decim column, the missing value should be \N (see https://stackoverflow.com/a/2675493 and the linked mysql docu) I think that may also be the solution for the json '' problem