piskvorky / sqlitedict

Persistent dict, backed by sqlite3 and pickle, multithread-safe.
Apache License 2.0

program without close does not stop #46

Closed grizlupo closed 8 years ago

grizlupo commented 8 years ago

Python 3.5.1 Windows 10 sqlitedict 1.4.0

t.py:

from sqlitedict import SqliteDict
db = SqliteDict('test.db', autocommit=True)
#db.close()
print('end')

The relevant code in sqlitedict:

    def __del__(self):
        # like close(), but assume globals are gone by now (do not log!)
        self.close(do_log=False)  # <- never returns

I guess the internal thread is a daemon. By the time __del__ runs, that thread has already stopped silently, so it never processes the close event waiting in the queue.

tmylk commented 8 years ago

Thanks for reporting this. How do you suggest it can be reproduced?

tmylk commented 8 years ago

Ah I see - thanks for providing t.py. Do you run t.py in a daemon mode?

grizlupo commented 8 years ago

The thread I mean is SqliteMultithread (it calls self.setDaemon(True)  # python2.5-compatible).

grizlupo commented 8 years ago

My suggestion:

80c80
<     from queue import Queue
---
>     from queue import Queue, Empty
82c82
<     from Queue import Queue
---
>     from Queue import Queue, Empty
260,265c260,265
<             if self.conn.autocommit:
<                 # typically calls to commit are non-blocking when autocommit is
<                 # used.  However, we need to block on close() to ensure any
<                 # awaiting exceptions are handled and that all data is
<                 # persisted to disk before returning.
<                 self.conn.commit(blocking=True)
---
>             #if self.conn.autocommit:
>             #    # typically calls to commit are non-blocking when autocommit is
>             #    # used.  However, we need to block on close() to ensure any
>             #    # awaiting exceptions are handled and that all data is
>             #    # persisted to disk before returning.
>             #    self.conn.commit(blocking=True)
316,318d315
<         self.start()
<
<     def run(self):
326a324,325
>         self.conn = conn
>         self.start()
328c327,329
<         res = None
---
>     def run(self, block=True):
>         cursor = self.conn.cursor()
>         self.handled = False
330c331,335
<             req, arg, res, outer_stack = self.reqs.get()
---
>             try:
>                 req, arg, res, outer_stack = self.reqs.get(block)
>             except Empty:
>                 break
>             self.handled = True
332d336
<                 assert res, ('--close-- without return queue', res)
335c339
<                 conn.commit()
---
>                 self.conn.commit()
374,378c378
<                     conn.commit()
<
<         self.log.debug('received: %s, send: --no more--', req)
<         conn.close()
<         res.put('--no more--')
---
>                     self.conn.commit()
465,466c465,472
<         self.select_one('--close--')
<         self.join()
---
>         self.handled = False
>         self.execute('--close--')
>         self.join(0.1)
>         if self.handled and self.is_alive():
>             self.join()
>         if self.reqs.qsize() > 0:
>             self.run(block=False)  # execute rest request
>         self.conn.close()
tmylk commented 8 years ago

@grizlupo I am having trouble reproducing the issue. Could you please help?

I run t.py and it just prints 'end'. Is it the right way?

$ python t.py
end

grizlupo commented 8 years ago

I noticed that sqlitedict internally creates a single thread (SqliteMultithread) for thread safety, and all requests are routed to that thread through a queue. Because this internal thread is a daemon, it is killed when the main thread stops. Here is the problem: if the program ends without an explicit close(), the internal thread is already dead by the time the main thread reaches its cleanup stage. I expected the main thread to finish first and only then the daemon to die, but __del__ runs the close procedure after the daemon is already gone. Since the internal thread no longer exists, there is no response to the close request. So my suggestion above adds a check, during close, for whether the internal thread is dead or alive; if it has already stopped, the main thread handles the remaining requests itself and exits.
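The failure mode described above can be sketched without sqlitedict at all. This is a minimal, self-contained illustration (the names `worker`, `reqs`, and the `'--close--'` sentinel are hypothetical stand-ins for sqlitedict's internals, not its actual API): a daemon worker serves requests from a queue, and once it has exited, a blocking wait for its reply can never complete.

```python
import queue
import threading

def worker(reqs):
    # Serve requests until a '--close--' sentinel arrives; each request
    # carries its own reply queue, mirroring the pattern described above.
    while True:
        req, res = reqs.get()
        if req == '--close--':
            res.put('--no more--')
            return
        res.put('ok')

reqs = queue.Queue()
t = threading.Thread(target=worker, args=(reqs,), daemon=True)
t.start()

# Normal operation: the worker replies.
res = queue.Queue()
reqs.put(('SELECT 1', res))
first_reply = res.get()
print(first_reply)  # -> ok

# Stop the worker, then simulate __del__ running afterwards: a second
# '--close--' gets no reply, so an unbounded get() would hang forever.
res = queue.Queue()
reqs.put(('--close--', res))
res.get()           # worker acknowledges and exits
t.join(timeout=1.0)

res = queue.Queue()
reqs.put(('--close--', res))
try:
    res.get(timeout=0.5)   # stands in for the blocking wait in close()
    hung_reply_arrived = True
except queue.Empty:
    hung_reply_arrived = False
print('reply arrived:', hung_reply_arrived)  # -> reply arrived: False
```

The bounded `get(timeout=0.5)` is only there to make the sketch terminate; in the real bug the wait has no timeout, which is exactly why the program hangs.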

tmylk commented 8 years ago

Do I understand it correctly that the program t.py just hangs?

ecederstrand commented 8 years ago

I'm having the same issue:

$ cat t.py
from sqlitedict import SqliteDict
db = SqliteDict('test.db', autocommit=True)
#db.close()
print('end')
$ python --version
Python 3.4.3
$ python t.py
end
^C
$

I need to hit Ctrl-C after 'end' is printed to actually stop the program. It makes no difference whether autocommit is True or False. But if I add del db.conn at the end, the program stops.

So, apparently the reference that SqliteDict has to the SqliteMultithread keeps the thread from being garbage collected at program end, even though the thread is daemonic.

tmylk commented 8 years ago

Thanks for reporting this. I could reproduce it in 3.4 but not in 2.7. When do you suggest is the best time to clear that reference?

ecederstrand commented 8 years ago

I seem to have tracked this down to self.select_one('--close--') never returning, in SqliteMultithread.close(). The result Queue never gets populated. Does that ring a bell? I'll keep looking.

ecederstrand commented 8 years ago

Ah. self.select_one('--close--') never returns because run() is no longer running (the thread is being SIGKILL'ed). You can't try-catch a SIGKILL in a daemon thread AFAIK, so close() can't know if the main loop is still running because run() can't do what it does in a normal shutdown. Hmmm.

tmylk commented 8 years ago

I am really puzzled by different behaviour in python 2 vs 3

ecederstrand commented 8 years ago

Changing self.select_one('--close--') to self.reqs.put(('--close--', None, None, None)) fixes the issue for me, since it doesn't depend on run() still looping, but does stop the loop if it is running. Would that suffice? I've only done very light testing, not with long queues of long-running queries.
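The key property of that fix is that enqueuing the sentinel is fire-and-forget: it never waits on a reply queue. A rough sketch of the idea (hypothetical names, not sqlitedict's real close() implementation):

```python
import queue
import threading

def worker(reqs):
    # Minimal stand-in for the worker loop: drain the request queue
    # until the '--close--' sentinel arrives, then exit.
    while True:
        req, _arg, _res, _stack = reqs.get()
        if req == '--close--':
            return

reqs = queue.Queue()
t = threading.Thread(target=worker, args=(reqs,), daemon=True)
t.start()

def close():
    # Fire-and-forget: enqueue the sentinel without blocking on a
    # reply. If the worker loop is still alive it stops cleanly; if it
    # was already killed at interpreter exit, nothing here can hang.
    reqs.put(('--close--', None, None, None))
    t.join(timeout=1.0)

close()
print('worker alive after close:', t.is_alive())
```

The bounded join means close() returns even when the worker is already dead, which is the situation __del__ finds itself in at interpreter shutdown.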

tmylk commented 8 years ago

@ecederstrand Do you think it is worth reporting this bug to Python as it only appears in Python 3?

ecederstrand commented 8 years ago

I don't think this is a Python bug, rather a subtle change in the way threads are garbage-collected or in which tasks are allowed to block at program exit. In my understanding, a daemon thread should not have many expectations about what is still functioning when it is being killed. Either way, the inner workings of Python threading are over my head, and a bugfix is not realistically going to land on any of my boxes for at least a year.

I have updated my PR (https://github.com/RaRe-Technologies/sqlitedict/pull/55) to fix the issue on program exit while keeping the original behaviour in all other cases. What do you think?

ecederstrand commented 8 years ago

PR #55 doesn't fix all issues. I still have problems with indefinite hangs on process exit if sqlitedict is used within a multiprocessing.pool.ThreadPool. Unfortunately, I don't have time to investigate further ATM.

piskvorky commented 8 years ago

@ecederstrand @tmylk When would we want force=False or force=True? What are the benefits of one or the other?

Needs better docs.

ecederstrand commented 8 years ago

The force thing was only ever meant to be used internally. force=True is only used in __del__, which can be called explicitly by the user (via del my_dict) or implicitly by garbage collection. In both situations, there should be low expectations about what happens to any ongoing or queued queries, so it's OK to force quit.

Proving that a program exits is one of the grand problems of CS so I'm unsure how to write a test for this bug 😀 Plus, there's still the issue with threadpools mentioned above.