xmppo / Jabber-Net

A modern fork of Jabber-Net
https://github.com/xmppo/Jabber-Net
37 stars 13 forks source link

Deadlock #95

Open dsparkplug opened 6 years ago

dsparkplug commented 6 years ago

I've found an circular locking issue with the IStanzaEventListener methods in the XmppStream class .

Basically the IStanzaEventListener.StanzaReceived which acquires a [MethodImpl(MethodImplOptions.Synchronized)] lock can call IStanzaEventListener.BytesWritten which also needs to acquire a [MethodImpl(MethodImplOptions.Synchronized)] lock and therefore deadlocks.

The chain of calls is as follows:

XmppStream.StanzaReceived acquires a MethodImpl(MethodImplOptions.Synchronized) lock --> calls XmppStream.OnElement if (State is not ServerFeaturesState.Instance or SASLState.Instance or StartTLSState.Instance or CompressionState.Instance or SASLAuthedState.Instance) --> invokes XmppStream.OnProtocol event In IQTracker OnProtocol event is set to OnIQ --> calls IQTracker.Call which calls the 'cb' callback which is set to JabberClient.GotSession in JabberClient.GotResource --> sets JabberClient.IsAuthenticated --> calls JabberClient.Presence --> calls JabberClient.Write --> calls SocketStanzaStream.JabberClient.Write --> calls AsyncSocket.Write --> calls Stream.BeginWrite with callback WroteData --> WroteData calls SocketStanzaStream.OnWrite --> calls XmppStream.BytesWritten which waits for a MethodImpl(MethodImplOptions.Synchronized) lock

Any ideas how to go about fixing this? The way this code uses callbacks is so convoluted its hard to know where to start without breaking something else.

The chain of events and subsequent deadlock is reproducible when sending a large number of XMPP messages.

ForNeVeR commented 6 years ago

Alright, that's a fair question.

Do you have any repro scenario? It would be easier to check (or even to write a unit test) when we have a MCVE.

I still didn't understood what thread exactly deadlock here, so I'm not sure how to proceed.

I have an idea to make more things async / threadsafe by default in our code, but first I need to better understand this particular case.

I agree that the code base isn't in the best state now. Hopefully I'll finish some big things in my life in a couple of months and will be able to come back to this library.

KotM commented 6 years ago

In our project we have encountered with similar issues on simultaneous XmppStream.Write() attempts. At the least application waits a response from XmppStream.Write() from several threads. It seems to be a deadlock.

Currently I'll try to update our code by adding simply lock clause around this command, hope it will help.

KotM commented 6 years ago

Unfortunately, locking doesn't help. Currently we're captured XmppStream.Write() hanging again, with lock state. Find below a stack trace (this thread was stopped on XmppStream.Write(), another one on lock state):

    [In a sleep, wait, or join] 
    JabberNet.dll!JabberNet.jabber.connection.XmppStream.JabberNet.jabber.connection.IStanzaEventListener.BytesWritten(byte[] buf = {byte[1437]}, int offset = 0, int count = 1437) + 0x3f bytes    
    JabberNet.dll!JabberNet.bedrock.net.AsyncSocket.WroteData(System.IAsyncResult ar) + 0xbf bytes  
    System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes    
    System.dll!System.Net.ContextAwareResult.CaptureOrComplete(ref System.Threading.ExecutionContext cachedContext, bool returnContext) + 0xfb bytes    
    System.dll!System.Net.ContextAwareResult.FinishPostingAsyncOp(ref System.Net.CallbackClosure closure = {System.Net.CallbackClosure}) + 0x77 bytes   
    System.dll!System.Net.Sockets.Socket.BeginSend(byte[] buffer, int offset, int size, System.Net.Sockets.SocketFlags socketFlags, out System.Net.Sockets.SocketError errorCode, System.AsyncCallback callback, object state) + 0x11c bytes    
    System.dll!System.Net.Sockets.Socket.BeginSend(byte[] buffer, int offset, int size, System.Net.Sockets.SocketFlags socketFlags, System.AsyncCallback callback, object state) + 0x41 bytes   
    System.dll!System.Net.Sockets.NetworkStream.BeginWrite(byte[] buffer, int offset, int size, System.AsyncCallback callback, object state) + 0xa4 bytes   
    JabberNet.dll!JabberNet.bedrock.net.AsyncSocket.Write(byte[] buf, int offset, int len) + 0x162 bytes    
    JabberNet.dll!JabberNet.jabber.connection.SocketStanzaStream.Write(string str) + 0x92 bytes 

[our code].Utilities.Jabber.PracticeAreaChannel.RespondToQuery(JabberNet.jabber.protocol.client.IQ response = Cannot evaluate expression because the current thread is in a sleep, wait, or join) Line 914 + 0x22 bytes C#

[our stack here, removed by security reasons]

    System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes    
    mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x15e bytes 
    mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x17 bytes  
    mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) + 0x52 bytes    
    System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken) + 0x98 bytes 
    System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xcc bytes  
    System.dll!System.Net.HttpWebRequest.ProcessResponse() + 0x13c bytes    
    System.dll!System.Net.HttpWebRequest.SetResponse(System.Net.CoreResponseData coreResponseData) + 0x1b2 bytes    
    System.dll!System.Net.ConnectionReturnResult.SetResponses(System.Net.ConnectionReturnResult returnResult = {System.Net.ConnectionReturnResult}) + 0xbb bytes    
    System.dll!System.Net.Connection.ReadComplete(int bytesRead, System.Net.WebExceptionStatus errorStatus) + 0x450 bytes   
    System.dll!System.Net.Connection.PostReceive() + 0x109 bytes    
    System.dll!System.Net.Connection.ReadComplete(int bytesRead, System.Net.WebExceptionStatus errorStatus) + 0x2d1 bytes   
    System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes    
    System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xcc bytes  
    System.dll!System.Net.Security._SslStream.ProcessFrameBody(int readBytes, byte[] buffer, int offset, int count, System.Net.AsyncProtocolRequest asyncRequest) + 0x14b bytes 
    System.dll!System.Net.Security._SslStream.ReadFrameCallback(System.Net.AsyncProtocolRequest asyncRequest = {System.Net.AsyncProtocolRequest}) + 0xab bytes  
    System.dll!System.Net.AsyncProtocolRequest.CompleteRequest(int result) + 0x3d bytes 
    System.dll!System.Net.FixedSizeReader.CheckCompletionBeforeNextRead(int bytes) + 0x29 bytes 
    System.dll!System.Net.FixedSizeReader.ReadCallback(System.IAsyncResult transportResult) + 0x78 bytes    
    System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes    
    mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x15e bytes 
    mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x17 bytes  
    mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) + 0x52 bytes    
    System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken) + 0x98 bytes 
    System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xcc bytes  
    System.dll!System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped) + 0x13a bytes    
    mscorlib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* pOVERLAP) + 0x7d bytes    
    [Native to Managed Transition]
ForNeVeR commented 6 years ago

Alright, this is one thread. Could you please show stack for other thread (the one holding the lock)?

KotM commented 6 years ago

I was not aware that second stack trace being useful taking into account that it was stopped on our lock, but I forgot about preceding code... So, please find it below, too:

    [In a sleep, wait, or join] 

    [our code].Utilities.Jabber.PracticeAreaChannel.RespondToQuery(JabberNet.jabber.protocol.client.IQ response = Cannot evaluate expression because the current thread is in a sleep, wait, or join) Line 912 + 0x19 bytes C#

    [our stack here, removed by security reasons]

    JabberNet.dll!JabberNet.jabber.client.JabberClient.FireOnIQ(object sender, JabberNet.jabber.protocol.client.IQ iq = Cannot evaluate expression because the current thread is in a sleep, wait, or join) + 0x2f bytes    
    JabberNet.dll!JabberNet.jabber.connection.XmppStream.JabberNet.jabber.connection.IStanzaEventListener.StanzaReceived(System.Xml.XmlElement elem) + 0x48 bytes   
    JabberNet.dll!JabberNet.jabber.protocol.ElementStream.FireOnElement(System.Xml.XmlElement elem) + 0x2b bytes    
    JabberNet.dll!JabberNet.jabber.protocol.AsynchElementStream.EndTag(byte[] buf, int offset, JabberNet.xpnet.ContentToken ct, JabberNet.xpnet.TOK tok) + 0x134 bytes  
    JabberNet.dll!JabberNet.jabber.protocol.AsynchElementStream.Push(byte[] buf = {byte[4096]}, int offset = 0, int length = 1275) + 0x19c bytes    
    JabberNet.dll!JabberNet.jabber.connection.SocketStanzaStream.JabberNet.bedrock.net.ISocketEventListener.OnRead(JabberNet.bedrock.net.BaseSocket sock = {JabberNet.bedrock.net.AsyncSocket}, byte[] buf, int offset, int length) + 0xbc bytes    
    JabberNet.dll!JabberNet.bedrock.net.AsyncSocket.GotData(System.IAsyncResult ar) + 0x9e bytes    
    System.dll!System.Net.LazyAsyncResult.Complete(System.IntPtr userToken) + 0x52 bytes    
    mscorlib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x15e bytes 
    mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state, bool preserveSyncCtx) + 0x17 bytes  
    mscorlib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) + 0x52 bytes    
    System.dll!System.Net.ContextAwareResult.Complete(System.IntPtr userToken) + 0x98 bytes 
    System.dll!System.Net.LazyAsyncResult.ProtectedInvokeCallback(object result, System.IntPtr userToken) + 0xcc bytes  
    System.dll!System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped) + 0x13a bytes    
    mscorlib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* pOVERLAP) + 0x7d bytes    
    [Native to Managed Transition]  
dsparkplug commented 6 years ago

Sorry for the delay - other priorities etc.

Here's a MCVE which sends 10000 test messages. The deadlock will generally occur after a few thousand messages with several threads stuck at AsyncSocket.Write and one stuck at XmppStream.BytesWritten.

Note that this is only appears to happen with AutoPresence = true . Also, I wouldn't normally create a new client and authenticate for each message as in this example, but the authentication seems to involved in causing the deadlock.

XMPP server is OpenFire 4.2.3


using JabberNet.bedrock;
using JabberNet.jabber;
using JabberNet.jabber.client;
using JabberNet.jabber.connection;
using JabberNet.jabber.protocol;
using System;
using System.Threading;
using System.Xml;

namespace JabberTest
{
    public class Program
    {

        public static void Main(string[] args)
        {

            string server = "servername";
            string xmppID = "sparky";
            int port = 5222;
            string password = "#######";

            XmppSender xmppSender = new XmppSender();
            int sent = 0;
            for (var i = 0; i < 10000; i++)
            {

                if (!xmppSender.SendMessage(server, xmppID, port, password))
                {
                    Console.WriteLine($"Failed - sent {sent} messages");
                    return;
                }
                sent++;
            }
            Console.WriteLine($"Success - sent {sent} messages"); ;
        }

    }
    public class XmppSender
    {
        private ManualResetEvent _authenticatedEvent;

        public XmppSender()
        {
            _authenticatedEvent = new ManualResetEvent(false);

        }

        private void client_OnAuthenticate(object sender)
        {
            Console.WriteLine("Authenticated");
            _authenticatedEvent.Set();
        }

        private void client_OnConnect(object sender, StanzaStream stream)
        {
            Console.WriteLine("Connected");
        }
        private void client_OnDisconnect(object sender)
        {
            Console.WriteLine("Disconnected");
            _authenticatedEvent.Reset();
        }

        private void client_OnError(object sender, Exception ex)
        {
            Console.WriteLine($"The following error occurred while publishing an XMPP notification:\n{ex}");
            _authenticatedEvent.Set();
        }

        private void client_OnStreamError(object sender, XmlElement rp)
        {
            string info = rp == null ? "No further details" : rp.OuterXml;
            Console.WriteLine($"The following stream error occurred while publishing an XMPP notification:\n{info}");
            _authenticatedEvent.Set();
        }

        private void client_OnAuthError(object sender, XmlElement elem)
        {
            string info = elem == null ? "No further details" : elem.OuterXml;
            Console.WriteLine($"The following authentication error occurred while publishing an XMPP notification:\n{info}");
            _authenticatedEvent.Set();
        }

        public bool SendMessage(string server, string xmppID, int port, string password)
        {

            var jId = JID.Escape(xmppID, server, "Test");

            JabberClient client = new JabberClient();

            client.AutoReconnect = 3f;
            client.User = jId.User;
            client.Server = jId.Server;
            client.NetworkHost = jId.Server;
            client.Port = port;
            client.Resource = jId.Resource;
            client.Password = password;

            client.AutoPresence = true; //note that a deadlock can occur if this is set to true
            client.AutoRoster = false;
            client.SSL = false;
            client.AutoReconnect = 3;
            client.PlaintextAuth = true;
            client.AutoStartCompression = true;
            client.AutoStartTLS = false;
            client.AutoLogin = true;

            client.OnError += new ExceptionHandler(client_OnError);
            client.OnStreamError += new ProtocolHandler(client_OnStreamError);
            client.OnAuthenticate += new ObjectHandler(client_OnAuthenticate);
            client.OnDisconnect += new ObjectHandler(client_OnDisconnect);
            client.OnAuthError += new ProtocolHandler(client_OnAuthError);
            client.OnConnect += new StanzaStreamHandler(client_OnConnect);

            try
            {
                if (!client.IsAuthenticated)
                {
                    _authenticatedEvent.Reset();

                    if (!_authenticatedEvent.WaitOne(0))
                        client.Connect();

                    if (!_authenticatedEvent.WaitOne(10000))
                        return false;
                }
                if (!client.IsAuthenticated)
                    return false;

                jId.Resource = client.Resource;

                Console.WriteLine($"Sending message"); ;
                client.Message(jId.BareJID, "test message");
            }
            finally
            {
                client.Close();

                client.OnError -= client_OnError;
                client.OnStreamError -= client_OnStreamError;
                client.OnAuthenticate -= client_OnAuthenticate;
                client.OnConnect -= client_OnConnect;
                client.OnDisconnect -= client_OnDisconnect;
                client.OnAuthError -= client_OnAuthError;

            }

            return true;
        }
    }
}
KotM commented 6 years ago

Any progress here?

And how dangerous to use AutoPresence = false?

ForNeVeR commented 6 years ago

Sorry, I was preoccupied with other great deeds, so wasn't able to participate in this case. Honestly I think that you, guys, have better opportunity to fix the issue if you have a repro case.

Although, I will now do my best to research the case myself and fix it. I have some spare time and I'm on it right now.

ForNeVeR commented 6 years ago

@KotM, AutoPresence = false is pretty safe (at least if you don't need this functionality actually), and it seems like it reduces the amount of deadlocks in the code.

I was able to reproduce the issue, although, sometimes it happens even with AutoPresence = false. Looks like to terminate the issue completely we'll need pretty heavy rearchitecture of the JabberClient part of the library :(

ForNeVeR commented 6 years ago

For now, I'm thinking about replacing the JabberClient internals with some actor system such as Akka.NET that should solve the problems (partially, at least). Event dispatching should be asynchronous and without manual locking; that way we'll probably defeat the deadlocks.