rqlite / sqlalchemy-rqlite

sqlalchemy dialect for rqlite, the lightweight, distributed database built on SQLite.
MIT License
42 stars 8 forks source link

SQLAlchemy ORM. Table creation command sent twice. #2

Closed guillemborrell closed 7 years ago

guillemborrell commented 7 years ago

I am using this SQLAlchemy dialect with the ORM mapper. The application crashes because the command for table creation is sent twice.

This is the error I get

2017-03-22 19:25:30,853 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
INFO:sqlalchemy.engine.base.Engine:SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2017-03-22 19:25:30,854 INFO sqlalchemy.engine.base.Engine ()
INFO:sqlalchemy.engine.base.Engine:()
2017-03-22 19:25:30,857 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
INFO:sqlalchemy.engine.base.Engine:SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2017-03-22 19:25:30,857 INFO sqlalchemy.engine.base.Engine ()
INFO:sqlalchemy.engine.base.Engine:()
2017-03-22 19:25:30,860 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("contracts")
INFO:sqlalchemy.engine.base.Engine:PRAGMA table_info("contracts")
2017-03-22 19:25:30,860 INFO sqlalchemy.engine.base.Engine ()
INFO:sqlalchemy.engine.base.Engine:()
2017-03-22 19:25:30,909 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("status")
INFO:sqlalchemy.engine.base.Engine:PRAGMA table_info("status")
2017-03-22 19:25:30,910 INFO sqlalchemy.engine.base.Engine ()
INFO:sqlalchemy.engine.base.Engine:()
2017-03-22 19:25:30,954 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("users")
INFO:sqlalchemy.engine.base.Engine:PRAGMA table_info("users")
2017-03-22 19:25:30,954 INFO sqlalchemy.engine.base.Engine ()
INFO:sqlalchemy.engine.base.Engine:()
2017-03-22 19:25:30,998 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE users (
        id INTEGER NOT NULL, 
        name VARCHAR, 
        "when" DATETIME, 
        info BLOB, 
        "key" VARCHAR, 
        PRIMARY KEY (id), 
        UNIQUE (name)
)

INFO:sqlalchemy.engine.base.Engine:
CREATE TABLE users (
        id INTEGER NOT NULL, 
        name VARCHAR, 
        "when" DATETIME, 
        info BLOB, 
        "key" VARCHAR, 
        PRIMARY KEY (id), 
        UNIQUE (name)
)

2017-03-22 19:25:30,998 INFO sqlalchemy.engine.base.Engine ()
INFO:sqlalchemy.engine.base.Engine:()
ERROR:root:{"error": "table users already exists"}
2017-03-22 19:25:31,077 INFO sqlalchemy.engine.base.Engine ROLLBACK
INFO:sqlalchemy.engine.base.Engine:ROLLBACK
Traceback (most recent call last):
  File "/home/guillem/projects/pyledger/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
    context)
  File "/home/guillem/projects/pyledger/env/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
    cursor.execute(statement, parameters)
  File "/home/guillem/projects/pyrqlite/src/pyrqlite/cursors.py", line 117, in execute
    raise Error(json.dumps(item))
sqlite3.Error: {"error": "table users already exists"}

While the correct commands sent with vanilla SQLite are the following

2017-03-22 19:27:03,735 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2017-03-22 19:27:03,735 INFO sqlalchemy.engine.base.Engine ()
2017-03-22 19:27:03,736 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2017-03-22 19:27:03,736 INFO sqlalchemy.engine.base.Engine ()
2017-03-22 19:27:03,737 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("contracts")
2017-03-22 19:27:03,737 INFO sqlalchemy.engine.base.Engine ()
2017-03-22 19:27:03,737 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("status")
2017-03-22 19:27:03,737 INFO sqlalchemy.engine.base.Engine ()
2017-03-22 19:27:03,738 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("users")
2017-03-22 19:27:03,738 INFO sqlalchemy.engine.base.Engine ()
2017-03-22 19:27:03,738 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE users (
        id INTEGER NOT NULL, 
        name VARCHAR, 
        "when" DATETIME, 
        info BLOB, 
        "key" VARCHAR, 
        PRIMARY KEY (id), 
        UNIQUE (name)
)

2017-03-22 19:27:03,739 INFO sqlalchemy.engine.base.Engine ()
2017-03-22 19:27:03,739 INFO sqlalchemy.engine.base.Engine COMMIT
2017-03-22 19:27:03,739 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE contracts (
        id INTEGER NOT NULL, 
        name VARCHAR, 
        description VARCHAR, 
        created DATETIME, 
        methods BLOB, 
        api BLOB, 
        signatures BLOB, 
        user_id INTEGER, 
        PRIMARY KEY (id), 
        FOREIGN KEY(user_id) REFERENCES users (id)
)

2017-03-22 19:27:03,739 INFO sqlalchemy.engine.base.Engine ()
2017-03-22 19:27:03,740 INFO sqlalchemy.engine.base.Engine COMMIT
2017-03-22 19:27:03,740 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE status (
        id INTEGER NOT NULL, 
        contract_id INTEGER, 
        attributes BLOB, 
        "key" BLOB, 
        "when" DATETIME, 
        owner VARCHAR, 
        PRIMARY KEY (id), 
        FOREIGN KEY(contract_id) REFERENCES contracts (id), 
        UNIQUE ("key")
)

2017-03-22 19:27:03,740 INFO sqlalchemy.engine.base.Engine ()
2017-03-22 19:27:03,740 INFO sqlalchemy.engine.base.Engine COMMIT
Warning: Syncing database at sqlite://
Warning: Admin key is 5ab7ecec-f3e9-4c11-aa9a-f8d8508dd1e9
2017-03-22 19:27:03,747 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2017-03-22 19:27:03,748 INFO sqlalchemy.engine.base.Engine INSERT INTO users (name, "when", info, "key") VALUES (?, ?, ?, ?)
2017-03-22 19:27:03,748 INFO sqlalchemy.engine.base.Engine ('admin', '2017-03-22 19:27:03.746599', None, '5ab7ecec-f3e9-4c11-aa9a-f8d8508dd1e9')
2017-03-22 19:27:03,748 INFO sqlalchemy.engine.base.Engine COMMIT
2017-03-22 19:27:03,753 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2017-03-22 19:27:03,754 INFO sqlalchemy.engine.base.Engine INSERT INTO contracts (name, description, created, methods, api, signatures, user_id) VALUES (?, ?, ?, ?, ?, ?, ?)
2017-03-22 19:27:03,754 INFO sqlalchemy.engine.base.Engine ('DigitalCurrency', '', '2017-03-22 19:27:03.749540', <memory at 0x7f5a4cc7af48>, <memory at 0x7f5a4c3ab048>, <memory at 0x7f5a4c3ab108>, None)
2017-03-22 19:27:03,756 INFO sqlalchemy.engine.base.Engine INSERT INTO status (contract_id, attributes, "key", "when", owner) VALUES (?, ?, ?, ?, ?)
2017-03-22 19:27:03,756 INFO sqlalchemy.engine.base.Engine (1, <memory at 0x7f5a4cc7af48>, <memory at 0x7f5a4c3af048>, '2017-03-22 19:27:03.752994', None)
2017-03-22 19:27:03,756 INFO sqlalchemy.engine.base.Engine COMMIT

It is possible that the issue is due to some fixes to make pyrqlite with Python 3

index fe5a2ba..75fd7dc 100644
--- a/src/pyrqlite/extensions.py
+++ b/src/pyrqlite/extensions.py
@@ -55,7 +55,7 @@ adapters = {
     float: lambda x: x,
     int: lambda x: x,
     bool: int,
-    unicode: lambda x: x.encode('utf-8'),
+    str: lambda x: x.encode('utf-8'),
     type(None): lambda x: None,
     datetime.date: _adapt_date,
     datetime.datetime: _adapt_datetime,
@@ -127,7 +127,7 @@ def _convert_to_python(column_name, type_, value, parse_decltypes=False, parse_c
         if type_upper not in _native_converters:
             value = value.decode('base64')
         value = converter(value)
-    elif isinstance(value, basestring):
+    elif isinstance(value, bytes):
         value = value.decode('base64')

     return value
@@ -137,7 +137,7 @@ def _adapt_from_python(value):
     if not isinstance(value, basestring):
         try:
             adapted = adapters[(type(value), sqlite3.PrepareProtocol)](value)
-        except KeyError, e:
+        except KeyError as e:
             # No adapter registered. Let the object adapt itself via PEP-246.
             # It has been rejected by the BDFL, but is still implemented
             # on stdlib sqlite3 module even on Python 3 !!

What else you need for a useful bug report?

zmedico commented 7 years ago

For testing purposes, I used python2.7 (a1aeeb61a1cb89f31f5fd54518f940f400d7e170 makes sqlalchemy-rqlite compatible).

The problem is triggered since https://github.com/rqlite/pyrqlite/commit/d1889dc441eb5e3a712e5ca96ac1fb5dd3ccd194:

@@ -114,6 +114,7 @@ def execute(self, operation, parameters=None):
             for item in results:
                 if 'error' in item:
                     logging.error(json.dumps(item))
+                    raise Error(json.dumps(item))
                 try:
                     rows_affected += item['rows_affected']
                 except KeyError:

The stricter pyrqlite behavior is apparently correct, because python's own sqlite3 module raises sqlite3.OperationalError in this case.

The real problem is that rqlite does not return any result rows for queries of the form PRAGMA table_info("existing_table"), which causes sqlalchemy to assume that the table does not exist.

For example, here's a test case to show how python's own sqlite3 module responds to PRAGMA tableinfo:

import sys, sqlite3
c = sqlite3.connect(':memory:')
c.execute('CREATE TABLE existing_table (textcolumn text)')
c.commit()
cursor = c.execute('PRAGMA table_info(\"hello_world\")')
sys.stdout.write(repr(cursor.fetchall()) + '\n')
c.close()

The output is as follows:

[(0, 'textcolumn', 'text', 0, None, 0)]

@otoolep, does rqlite have a way to return rows for PRAGMA operations, and if not then how difficult would it be to fit this into the HTTP API?

zmedico commented 7 years ago

In sqlalchemy, the pragma statements are executed via the SQLiteDialect._get_table_pragma method. These pragma statements are used:

otoolep commented 7 years ago

It does seem to work at a low-level: https://github.com/rqlite/rqlite/pull/298

Digging in, to see what can be done.

otoolep commented 7 years ago

OK, it looks fine as long as you send the PRAGMA command to the query endpoint. Is this an issue?

$ ./rqlite 
127.0.0.1:4001> CREATE TABLE foo (id INTEGER NOT NULL PRIMARY KEY, name TEXT)
0 row affected (0.000363 sec)
127.0.0.1:4001> 
EOF (CTRL+D)
$ curl -G 'localhost:4001/db/query?pretty&timings' --data-urlencode 'q=PRAGMA table_info("foo")'
{
    "results": [
        {
            "columns": [
                "cid",
                "name",
                "type",
                "notnull",
                "dflt_value",
                "pk"
            ],
            "types": [
                "",
                "",
                "",
                "",
                "",
                ""
            ],
            "values": [
                [
                    0,
                    "id",
                    "INTEGER",
                    1,
                    null,
                    1
                ],
                [
                    1,
                    "name",
                    "TEXT",
                    0,                                                                                                                                                                    
                    null,                                                                                                                                                                 
                    0                                                                                                                                                                     
                ]                                                                                                                                                                         
            ],                                                                                                                                                                            
            "time": 0.000293474                                                                                                                                                           
        }                                                                                                                                                                                 
    ],                                                                                                                                                                                    
    "time": 0.000331021

$ curl -XPOST 'localhost:4001/db/query?pretty' -H "Content-Type: application/json" -d '[                        
>     "PRAGMA table_info(\"foo\")"                                                                                                                                                        
> ]'                                                                                                                                                                                      
{                                                                                                                                                                                         
    "results": [                                                                                                                                                                          
        {                                                                                                                                                                                 
            "columns": [                                                                                                                                                                  
                "cid",                                                                                                                                                                    
                "name",                                                                                                                                                                   
                "type",                                                                                                                                                                   
                "notnull",                                                                                                                                                                
                "dflt_value",                                                                                                                                                             
                "pk"
            ],
            "types": [
                "",
                "",
                "",
                "",
                "",
                ""
            ],
            "values": [
                [
                    0,
                    "id",
                    "INTEGER",
                    1,
                    null,
                    1
                ],
                [
                    1,
                    "name",
                    "TEXT",
                    0,
                    null,
                    0
                ]
            ]
        }
    ]
}
otoolep commented 7 years ago

Looping in @alanjds since he touched pyrqlite most recently.

@alanjds -- any comment?

zmedico commented 7 years ago

OK, it looks fine as long as you send the PRAGMA command to the query endpoint. Is this an issue?

I got it working with python2.7, using these changes to pyrqlite:

diff --git a/src/pyrqlite/cursors.py b/src/pyrqlite/cursors.py
index 78fa18a..593b995 100644
--- a/src/pyrqlite/cursors.py
+++ b/src/pyrqlite/cursors.py
@@ -95,7 +95,7 @@ class Cursor(object):
             operation = self._substitute_params(operation, parameters)

         command = self._get_sql_command(operation)
-        if command == 'SELECT':
+        if command in ('SELECT', 'PRAGMA'):
             payload = self._request("GET",
                                     "/db/query?" + urlencode({'q': operation}))
         else:
diff --git a/src/pyrqlite/extensions.py b/src/pyrqlite/extensions.py
index fe5a2ba..b119f91 100644
--- a/src/pyrqlite/extensions.py
+++ b/src/pyrqlite/extensions.py
@@ -66,6 +66,7 @@ adapters = {(type_, sqlite3.PrepareProtocol): val for type_, val in adapters.ite
 converters = {
     'TEXT': str,
     'VARCHAR': lambda x: x.encode('utf-8'),
+    'NVARCHAR': lambda x: x.encode('utf-8'),
     'UNICODE': lambda x: x.decode('utf-8'),
     'INTEGER': int,
     'BOOL': bool,
@@ -75,11 +76,12 @@ converters = {
     'BLOB': lambda x: x,
     'DATE': _convert_date,
     'TIMESTAMP': _convert_timestamp,
-    '': lambda x: x.encode('utf-8'),
+    # Allow types such as int to pass through for PRAGMA results
+    '': lambda x: x.encode('utf-8') if hasattr(x, 'encode') else x,
 }

 # Non-native converters will be decoded from base64 before fed into converter
-_native_converters = ('BOOL', 'FLOAT', 'INTEGER', 'REAL', 'NUMBER', 'TEXT', 'VARCHAR', 'NULL', 'DATE', 'TIMESTAMP', '')
+_native_converters = ('BOOL', 'FLOAT', 'INTEGER', 'REAL', 'NUMBER', 'NVARCHAR', 'TEXT', 'VARCHAR', 'NULL', 'DATE', 'TIMESTAMP', '')

 def register_converter(type_string, function):
alanjds commented 7 years ago

Is NVARCHAR a native type for SQLite? If so, should be included in the converters and listed in the _native_converters, as rqlite server will return it without being base64 encoded. Converters transforms RQLite answers to Python native types.

See at line 68:

    'VARCHAR': lambda x: x.encode('utf-8'),

@zmedico You can include the NVARCHAR into the _native_converters list, if it makes sense. Types having a converter and whitelisted there are not converted from base64

# Non-native converters will be decoded from base64 before fed into converter
_native_converters = ('BOOL', 'FLOAT', 'INTEGER', 'REAL', 'NUMBER', 'TEXT', 'VARCHAR', 'NULL', 'DATE', 'TIMESTAMP', '')
zmedico commented 7 years ago

@alanjds, thanks for the info. We should fix it at least to handle the TEXT types in the Affinity Name Examples table on this page:

https://www.sqlite.org/datatype3.html

alanjds commented 7 years ago

Oh. I was not aware of this affinity stuff. Maybe mimic the logic to search for parts of the name :(

I do not like it, but if is what _sqlite.so does...

otoolep commented 7 years ago

@zmedico -- should we patch the code, as you suggested? It's not quite clear to me.

zmedico commented 7 years ago
Oh. I was not aware of this affinity stuff. Maybe mimic the logic to search for parts
of the name :(

I do not like it, but if is what _sqlite.so does...

I'll see about writing a patch for this. I see that rqlite has some related logic here:

// isTextType returns whether the given type has a SQLite text affinity.
// http://www.sqlite.org/datatype3.html
func isTextType(t string) bool {
    return t == "text" ||
        t == "" ||
        strings.HasPrefix(t, "varchar") ||
        strings.HasPrefix(t, "varying character") ||
        strings.HasPrefix(t, "nchar") ||
        strings.HasPrefix(t, "native character") ||
        strings.HasPrefix(t, "nvarchar") ||
        strings.HasPrefix(t, "clob")
}
@zmedico -- should we patch the code, as you suggested? It's not quite clear to me.

https://github.com/rqlite/pyrqlite/pull/11 adds pragma support. The NVARCHAR thing will be addressed separately.

otoolep commented 7 years ago

I'll see about writing a patch for this. I see that rqlite has some related logic here:

I should admit that I don't have a full understanding of the implications of that code (isTextType) in the rqlite source. I may not be doing the right thing -- I just noticed that what I understood to be strings were coming back from as []byte and I just cast them to string so they were readable.

guillemborrell commented 7 years ago

Thanks!

At this moment I am having issues putting binary data in the database with Python 3. I'll try to solve these issues myself, but any piece of advice is welcome.

zmedico commented 7 years ago

At this moment I am having issues putting binary data in the database with Python 3. I'll try to solve these issues myself, but any piece of advice is welcome.

I plan to tackle the Python 3 issues after I've resolved the text affinity issues.

Oh. I was not aware of this affinity stuff. Maybe mimic the logic to search for parts of the name :(

I have an implementation in https://github.com/rqlite/pyrqlite/pull/13 and I think it's not so bad.