fritzy / SleekXMPP

Python 2.6+/3.1+ XMPP Library
http://groups.google.com/group/sleekxmpp-discussion
Other
1.1k stars 299 forks source link

handler threads don't always exit #153

Open coffeeowl opened 12 years ago

coffeeowl commented 12 years ago

If connection with XMPP server was lost (even if it was clean server shutdown) event handling thread doesn't exit if reconnect attempt was made right after disconnect event. The problem is in while loop https://github.com/fritzy/SleekXMPP/blob/master/sleekxmpp/xmlstream/xmlstream.py#L1311

    try:
        while not self.stop.is_set():
            try:
                wait = self.wait_timeout
                event = self.event_queue.get(True, timeout=wait)
            except queue.Empty:
                event = None
            if event is None:
                continue

due to timeout, self.stop can be cleared before event_queue times out and when while makes new iteration self.stop is already cleared again.

DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events ERROR:sleekxmpp.xmlstream.xmlstream:Error reading from XML stream. ERROR:sleekxmpp.basexmpp:no element found: line 1, column 734 Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/sleekxmpp/xmlstream/xmlstream.py", line 1159, in _process if not self.__read_xml(): File "/usr/local/lib/python2.7/dist-packages/sleekxmpp/xmlstream/xmlstream.py", line 1195, in __read_xml for event, xml in ET.iterparse(self.filesocket, (b'end', b'start')): File "", line 107, in next ParseError: no element found: line 1, column 734 DEBUG:sleekxmpp.xmlstream.xmlstream:SEND (IMMED): /stream:stream DEBUG:sleekxmpp.xmlstream.xmlstream:Waiting for /stream:stream from server DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! auto_reconnect: False DEBUG:sleekxmpp.xmlstream.xmlstream:!!! stop set DEBUG:sleekxmpp.xmlstream.xmlstream:!!! stop - True DEBUG:sleekxmpp.thirdparty.statemachine: ==== TRANSITION connected -> disconnected DEBUG:sleekxmpp.xmlstream.xmlstream:!!!!!!! stop cleared !!!!!!!! DEBUG:sleekxmpp.xmlstream.xmlstream:Trying to connect to 127.0.0.1:5222 DEBUG:sleekxmpp.xmlstream.xmlstream:Waiting 2.01526779831 seconds before connecting. DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! waiting for events DEBUG:sleekxmpp.xmlstream.xmlstream:!!! no events

Also this looks strange https://github.com/fritzy/SleekXMPP/blob/master/sleekxmpp/xmlstream/xmlstream.py#L1373

        while not self.stop.is_set():
            while not self.stop.is_set and \
                  not self.session_started_event.is_set():

is_set as a property?

legastero commented 12 years ago

As for the strangeness with is_set, that has already been fixed in the develop branch.

legastero commented 12 years ago

Would you mind sharing what your main app loop looks like with how you're handling connections/reconnects, etc.

It seems from your recent issues that it is just different enough from the way we write Sleek-based apps here that our testing process isn't catching these kinds of bugs.

coffeeowl commented 12 years ago

It is nothing special, simple loop with connect -> process:

while node.run:
    try:
        if node.connect(reattempt=False):
            node.process(threaded=False)
        else:
            raise Exception("connect() failed")
    except Exception, e:
        time.sleep(15)

remove_pid()
sys.exit(0)

Everything else before "while" is code for daemonization, privilege dropping, settings and signal handling and so on.

Maybe I should decrease upper bound for timeout in SleekXMPP and let it handle reconnects... or at least wait a second before calling connect() after process() exits. But anyway, I guess my code is legal, the functions are public and I call them in right order.

legastero commented 12 years ago

That is certainly legal code that ought to work. As a temporary measure, adding in a two second delay between connection attempts should be plenty to ensure the threads detect the stop signal.

I'm currently investigating keeping a thread count and not letting disconnect() return until all threads have exited, which effectively would add that two second delay to the disconnect() call.

duja commented 12 years ago

Is there any progress on this issue?

legastero commented 12 years ago

Develop branch now tracks the main threads and waits for them to exit before returning from disconnect().

ghost commented 7 years ago

We have the problem that XMPPClient.disconnect() does not return to True as expected and other threads are still running in the background. Is there a possibility to forcefully stop these blocking threads?