WebReflection / dblite

sqlite for node.js without gyp problems
MIT License
209 stars 34 forks source link

error caused by ABORT renders db unresponsive forever #23

Open dronus opened 10 years ago

dronus commented 10 years ago

Encountered with node.js v0.10.25:

If a column has an UNIQUE constraint with the ON CONFLICT ABORT rule, an attempt to insert a non-unique value is correctly answered by an error:

"Error: near line 1: column email is not unique"

However, db is fucked up afterwards, and no callbacks will ever be called by db.query again, tearing down all db-dependent querys to node.js. I will investigate this further.

dronus commented 10 years ago

The error message in this case also is leaky: If no handler is attached to the UNIQUE-violating INSERT query, the error fires up on some other query, eg. SELECT, which of course can not return this error by itself.

WebReflection commented 10 years ago

this looks pretty bad … in order to help ASAP I wonder if you could:

Thanks a lot!

dronus commented 10 years ago

Using SQLite version 3.7.9 .

dronus commented 10 years ago

This is a timing issue it seams: The 'INSERT INTO' db.query call invokes the callback before the query has failed. If the callback just prints 'err' to the console, one can see 'err' is null, while another error message pops up on the console shortly after, if not captured by an subsequent abitrary db.query fast enough it seems. Found that with more simple test code.

However, under certain conditions not really clear to me (production code), the callback actually gets the error message right, but then db.query would not call an callback anymore forever.

dronus commented 10 years ago

Clearly a synchronisation issue, subsequent runs of the same code in a fresh node showing different outcomes, sometimes capturing the err, sometimes not.

dronus commented 10 years ago

simple example that produces different, buggy outputs on repeated runs:

test.sqlite dump:

CREATE TABLE users (id INTEGER PRIMARY KEY ON CONFLICT REPLACE, email TEXT UNIQUE ON CONFLICT ABORT);
INSERT INTO "users" VALUES(1,'foo@example.com');
INSERT INTO "users" VALUES(2,'bar@example.com');

test.js:

var dblite  = require('dblite' );

var db=dblite('test.sqlite');

db.query('INSERT INTO users (id,email) VALUES(?,?)',['1','bar@example.com'],function(err,data){
    console.log('E:'+err+' D:'+data);
});

db.query('SELECT * FROM users WHERE id=1',function(err,data){
    console.log('E:'+err+' D:'+data);
});
dronus commented 10 years ago

As my sqlite3 was quite old, I managed to install 3.8.3.1. The error message changed, the erratic behaviour persists. OS is Ubuntu 12.04 x64.

WebReflection commented 10 years ago

I've quickly done some test and it looks to me the callback is consistently fired as expected when the error occur, and such error is reported.

I hav updated the test file and when an error occur it is passed through the callback indeed.

I wonder if you are expecting queries to keep running even if there are errors in the database 'cause it looks to me that if you don't care about errors you should not create tables able to abort or you should use transactions.

A way to solve your issue with such structure would be:

db.query(
  'INSERT INTO users (id,email) VALUES(?,?)', 
  [1, 'bar@example.com'],
  function(err,data){
    if (err) {
      return console.error(err);
    }
    db.query('SELECT * FROM users WHERE id=1',function(err,data){
      console.log(data);
    });
  }
);

Or you could use a queue system and when errors occur you decide what to do with a .next() call in such queue.

Does this answer/help?

WebReflection commented 10 years ago

another example would be:

db
  .on('error', function(err) {
    console.log('something terrible happened, rebooting', err);
    process.exit();
  })
  .query('INSERT INTO users (id,email) VALUES(?,?)', [1, 'bar@example.com']);
  .query('SELECT * FROM users WHERE id=?', [1], function(err, data){
    console.log(data);
  });

in this case you can care about errors in a single point, the on("error") callback … long story short: you need to nest callbacks if there is a risk of a failure from the DB or you have a database in an inconsistent state with possibly corrupted data.

I'd rather prefer a silent pause than a eerie of queries that keeps running believing everything happened before was OK, you know what I mean?

Please do not hesitate to ask me more, if necessary. Thanks

WebReflection commented 10 years ago

FYI .. it looks like Travis CI is having similar issue with node 0.8, but green lights on node 0.10

dronus commented 10 years ago

I don't think the problem should be intercepted by error handlers, as it is not caused by sqlite3 errors itself, nor the content or errors of the database. It triggers even with my small example database stated above, which is NOT modified in the test case, so keeps the same every try. The problem however, is triggered again and again, if the test case script is invoked repeatedly.

If the problem occurs, the complete output of my test code is:

> node test.js
E:Error: Error: near line 1: UNIQUE constraint failed: users.email
 D:null

where we completely miss the SELECT query output, because db.query is frozen like I first reported. It can never be used again, no callback will be called.

But sometimes I encounter:

> node test.js
E:null D:undefined
E:Error: Error: near line 1: UNIQUE constraint failed: users.email D:null

Here we see the INSERT query complete, but don't report the ABORT error. The error is then caught by the finished SELECT query, rendering data to null and reporting the error in the wrong place. db.query is not frozen and will do further queries in this case.

Question is if you can reproduce this behaviour, and if not, if this is tied to the used OS / libc or whatever?

dronus commented 10 years ago

a small thought: If we talk about the query callback err, we do not talk about an error in sense of some code problem, but about a condition that can be used for reporting ON CONFLICT constraint clauses like ABORT. It just may indicate some legit condition, such as something is already there and do not need to be stored again.

dronus commented 10 years ago

This is a hard one! Actually, your new test works fine for me. However, the following self-contained code does not:

var dblite  = require('dblite' );

var db=dblite('/tmp/test.sqlite');

db
    .query('DROP TABLE users;')
    .query('CREATE TABLE users (id INTEGER PRIMARY KEY ON CONFLICT REPLACE, email TEXT UNIQUE ON CONFLICT ABORT)')
    .query('INSERT INTO users VALUES(?, ?)', [1, 'foo@example.com'])
    .query('INSERT INTO users VALUES(?, ?)', [2, 'bar@example.com'])
    .query('INSERT INTO users VALUES(?,?)',[1,'bar@example.com'],
        function(err,data){
            console.log('E:'+err+' D:'+data);
        }
    )
    .query('SELECT * FROM users WHERE id=?',[1],
        function(err,data){
            console.log('E:'+err+' D:'+data);
        }
    );

I would expect two console outputs like 'E: .. D: ..' from it. However, only one is returned on all machines I tested on. Can you please try this too?

dronus commented 10 years ago

The problem was introduced by commit 939ec0c57e930b8ce1ab745230a527b1e413cea5 . Prior version is working as expected!

dronus commented 10 years ago

What may going on: stderr and stdout of the sqlite process are not synchronized by any mechanism. If sqlite purges both pipes without much delay inbetween, it is upon kernel buffering and scheduling which fd blocking will release first, and thus you cannot rely on the exact relative order of stdout and stderr output.

If two queries are issued at once, the stdout and stderr callbacks may fire intertweened for the both queries. Query 1's stderr may be delayed after Query 1's stdout. Thus query one would miss the error message, and query two would receive wasError, discarding it's output and killing it's callback. For query two we now would neither fire an error nor data callback.

WebReflection commented 10 years ago

OK, first of all it will take some time to find out a good solution here since it's inconsistently reproducible in different envs … but I want to find the best compromise.

The word compromise is the key, what you don't consider an error is an error to me.

The sqlite is writing out there was an error, and while you think it was not relevant because of some data, I insist it does not matter if the whole raw was inserted already … maybe in some other case one single field, the password one, was the one different but everything else duplicated so sqlite decided to reject, and the user, forever, logged out.

The current version of SQLite is queueing callbacks … I need to triple check the order might be casual but AFAIK every single query is executed once the previous query has been finished.

The spawned process is asynchronous though, and it might decide to trigger the data event too late, hence the problem I believe you are facing.

Once I'll be sure that everything is working as meant or once I've found where the logic is weak, I'll update this and fix the library or ask you to do extra tests.

Right now, with less fancy table structures and more appropriate error checks, dblite has done a very good job but if you want to be hooked 1 to 1 with sqlite, then maybe, you should also consider the gyp based node-sqlite module.

Thanks in any case for your report, appreciated.

dronus commented 10 years ago

"The word compromise is the key, what you don't consider an error is an error to me.", well it also is named error by dblite and send to 'stderr' by sqlite3. On the other hand, UNIQUE indicies are of course of legit use to prevent double data on everyday use. So my example prevents two users entereing the same email address for identification. Of course, we can check the exitence before, but as this is not race condition safe, it is obviuosly a good solution to use the ON CONFLICT ABORT clause to keep the data correct and inform the host application the user entered an already used email. Maybe we cannot use this feature with dblite because the sqlite3 cmd line interface was not meant for such programmatical use, and the two-streams-problem is a show stopper for the ON CONFLICT usage. On the other hand, it would be cool if this could be solved, as dblite is by concept the sqlite wrapper I like most.

WebReflection commented 10 years ago

unfortunately reading special SQL syntax is not an option .. I really don't want to put a whole SQLite parser in here 'cause that would not follow the dblite KISS approach so no way I can trust any REGEXP against a generic 'ON CONFLICT' string, unless this has the strongest rule EVER I can trust via basic RegExp and flag dblite as "expected error, keep going" through such check.

Once again, I will find a solution or, at least, I will make the output consistent no matter how slow is one env compared to another.

dronus commented 10 years ago

Maybe you could merge the two output streams while invoking the sqlite3 process. They are then less prone to go out of sync, and arrive at one node.js callback just bound to stdout. If an apropriate sqlite3 output format is chosen, it should be possible to discern result rows from error messages, and handle them both the right way. Have you been able to reproduce the problem I encountered?

dronus commented 10 years ago

While the race condition itself was and is still there, it seems to be not exploited by the older version I mentioned above.

After rewinding before the commit mentioned, I experienced no issues since, while the current version would crash my application predictable at some code locations.

What problems where exactly adressed by commit 939ec0c57e930b8ce1ab745230a527b1e413cea5 ? Would it be ok using a previous version, or may I encounter another critical issue, that was resolved by this commit?

WebReflection commented 10 years ago

I don't have race conditions and I am unable to reproduce them everywhere I've tried :-/

Anyway, the older version is not compatible with callbacks with two arguments, so it won't notify or receive an error unless you set the error handler.

If you think this is what you want, silently ignore errors then you might use that version but there are other things going on there I would suggest to either help with a patch or wait until I find a good solution.

Best Regards

dronus commented 10 years ago

The version just before the commit mentioned above, I just use an function(err,data) callback, and err actually is set if ON CONFLICT triggers. That let me decide if the row inserted violated the ON CONFLICT ABORT clause. So it works as intended!

The version I use is a921a71aaf451caa19942aa268254491dc8e29cf , which already has the two parameter callback support, but does not have the 'avoided double query call on handled errors' the next commit provides.

I do not encounter issues of 'double query call's as far as it seems. What may happen? How may I trigger this problem?

dronus commented 10 years ago

Hi again. Can you please give me an example what kind of problem was adressed by commit 939ec0c57e930b8ce1ab745230a527b1e413cea5 ? I've used prior versions a lot now, and as it is not prone to the bug discussed in this issue, I would like to keep this variant of the code. Of course, I don't like surprises and like to understand what was fixed by the commit mentioned?

WebReflection commented 10 years ago

What's fixed there is that the err you receive in the callback will be eventually an Error with its message property telling you what kind of error it was, instead of a stream object without a .message property.

dronus commented 7 years ago

Hi, can you please give some comment on closing this bug? I've just run my last test code as stated above, and the problem still persits for dblite as of today on an Ubuntu 16.04 system.

My production system is still tied to a modified version before https://github.com/WebReflection/dblite/commit/939ec0c57e930b8ce1ab745230a527b1e413cea5 .

WebReflection commented 7 years ago

sorry @dronus , I thought it was resolved since 2014 and closed it during some gardening.

I'll keep it open and try to find out time to fix this.

dronus commented 7 years ago

Maybe it is o.k. to close it as won't-fix as it seem a corner case no one trips on but me. However, the docs could hint to this in some way, eg. stating that leveraging ON CONFLICT .. ABORT by using an error handler should not be used. However, the underlying synchronisation issue gives space for a large amount of subtle errors and glitches that may affect many other operations too.

WebReflection commented 7 years ago

naaah, I rather give it another try whenever I have time. I don't like to cause unresponsive-forever situations 😄

dronus commented 7 years ago

I think the comment https://github.com/WebReflection/dblite/issues/23#issuecomment-34946397 still gives a precise description of the bug's cause.

WebReflection commented 7 years ago

I just need to find time to investigate but thanks for pointing me to the right direction.

WebReflection commented 6 years ago

please let me know if v0.10 is working for this case.

cheng1999 commented 5 years ago

Seems still not working for this case. I am facing similar problem here.

WebReflection commented 4 years ago

not sure avoid detaching the process fixed this, so if anyone would be so kind to confirm it, maybe I can close this issue.