crate / crate-python

Python DB API client library for CrateDB, using HTTP.
https://cratedb.com/docs/python/
Apache License 2.0
79 stars 30 forks source link

`commit()` seems not working #590

Closed YuanchengJiang closed 11 months ago

YuanchengJiang commented 11 months ago

After inserting data into the table, it requires some time to sync up even with the commit() function. What shall I do if I would like a linear output from the following script?

from crate import client
connection = client.connect("localhost:4200", username="crate")

def write_query(query):
    cursor = connection.cursor()
    cursor.execute(query)
    cursor.close()
    connection.commit()
    return None

def query(query):
    cursor = connection.cursor()
    cursor.execute(query)
    results = cursor.fetchall()
    for i in range(len(results)):
        results[i] = tuple(results[i])
    cursor.close()
    return results

write_query("CREATE TABLE test (c_0 int)")
while True:
    write_query("INSERT INTO test values (0)")
    print(query("SELECT count(*) FROM test"))

output:

[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
[(0,)]
...
[(180,)]
...
YuanchengJiang commented 11 months ago

Adding a time.sleep(2) in write_query can have linear output

seut commented 11 months ago

As CrateDB does not support transaction so commit() won't do anything. CrateDB is eventual consistent, such writes are not immediately visible to select queries. One exception is primary-key lookup queries like SELECT ... WHERE pk_col = <val>, these queries will always return the most-recent values. One can force writes to be visible immediately by issuing a REFRESH TABLE <table_name>;

As this isn't a cheap operation, it will slow down write throughput a lot when issued after every write. By default CrateDB will refresh in the background every 1sec, see also https://cratedb.com/docs/crate/reference/en/5.4/sql/statements/create-table.html#refresh-interval.