Bad Performance - Githubissues

JanEggers commented 7 years ago

Hi there im currently evaluating your library.

I try to send as much packages as possible but the server seems to be only capable of handling 500 msgs/ sec. cpu is at 100% and most time is wasted in mqttpacketreader according to redgate perf profiler.

is this expected perf? i first thought i need a better json serializer library but json.net is just at less than 1 % of cpu.

rydergillen-compacSort commented 7 years ago

Is the test being performed on a single machine? If so, the complete lack of latency is likely making the Async code perform poorly. Many of the methods marked as Async are likely returning near instantly; meaning the cost of context tracking/switching outweighs the benefit.

As an example look at the following method that reads 1 byte. The overhead of tracking/switching thread contexts likely outweighs the performance gain. However this is somewhat subjective, high latency can be mitigated by Async

private async Task<byte> ReadStreamByteAsync() { var buffer = new byte[1]; await ReadFromSourceAsync(buffer); return buffer[0]; }

UPDATE:

I ran 5 quick tests of a 10000 message loop. The original Async code averages ~25 seconds; while Async-less (removed all async/task from MqttPacketReader.cs) code ~15 seconds.

JanEggers commented 7 years ago

thx for the update. ill do some testing in what is required to improve perf but i need some serious gain. i need to process at least 50k msg per second. ill start by looking at kestrel. if the asp.net stuff can handle 2M requests in plaintext benchmark it should be possible to improve with this protocol.

overuse of async is one problem for shure, but another problem is bufferhandling...

1iveowl commented 7 years ago

Not sure how much it'll change, but it should be safe to use ConfigureAwait(false).

For details see Async/Await - Best Practices in Asynchronous Programming

I've added some in PR #32

JanEggers commented 7 years ago

i would remove most async awaits as @rydergillen-compacSort suggested. reading data from the socket async makes tokal sense. but afterwards reading single values async from memory stream is wasted time.

JanEggers commented 7 years ago

i removed async and found some more bugs.

should i create a pr for each bug or is it ok if i include bugfixes in my perf boost pr?

as my next step is a litle more intrusive i first want to check on your oppinions.

i need to change interfaces so they return observables. ICommunicationChannel.ReceivePacketAsync should return iobservable<byte[]> and IMqttCommunicationAdapter should return iobservable<MqttBasePacket>.

with that interface change i can reuse the packetreader. currently a new reader and more important a new buffer is created for each message which can be improved.

JanEggers commented 7 years ago

and is it ok to bring in rx for implementation or should i use handmade observables

JanEggers commented 7 years ago

i would love to use System.IO.Pipelines that is what kestrel will be using but that not released yet. mehhh

chkr1011 commented 7 years ago

@JanEggers Thank you for reporting this behavior. Do you have maybe more details from the performance profiler?

@JanEggers I don't want to add depending libraries to this project. One goal is to be independent without having the overhead of other libraries (not performance related, more the organizational overhead). But I assume we can optimize the performance with several other tweaks.

Please keep in mind that several classes are used at the server too. So the packet reader is designed to switch the protocol version. This means after the connect packet the protocol is choosen and the packet reader takes care to use the proper deserializion code. If you want to optimize in those areas we might end up splitting the code for client and server. I want to avoid that.

@1iveowl Thanks for your pull request. I will have a look and check where I can avoid async/await.

@JanEggers Please try to split your PRs. At least performance related changes and real bugs.

Best regards Christian

JanEggers commented 7 years ago

details: i use the buildin profiler of vs not really showing that much other than time is spend in read and write methods.

im currently playing around and got message transmission from 2700 to 8000 msg/sec by buffering on receiving end. i will also have a look at the sending side. i will disect that branch into multiple prs if i feel confident about it.

JanEggers commented 7 years ago

what are your feelings about Microsoft.Extensions.Logging? i thought about integrating it to be able to fine tune what log messages should be shown. and also disable string formatting of messages that are not shown. currently string.format climbed up to 10% cpu usage

chkr1011 commented 7 years ago

I hear from it but I have no experience. But in general I understand the point. In my opinion it would be enough to move the string.format to the MqttTrace class. If no one is listening the format is useless. Also the string interpolation can be converted to regular string.format versions. This would improve it as well.

What do you think about it? I can implement this change to logging.

chkr1011 commented 7 years ago

Please have a look at the develop branch. I was able to remove lots of useless async/await and I changed all trace messages to avoid useless context creation for string interpolation and useless string formats. Tomorrow I will try to identify more areas where async/await can be removed safely.

@1iveowl I had to remove lots of your ConfigureAwait because at most places the async/await makes sense by putting back the thread to the pool while waiting for data from a transport connectione etc. But there are still places where it makes sense.

JanEggers commented 7 years ago

concerning logging: i would have switched the loggiing api to take FuncOfString. that whay you still could still use stringinterpolation (if logging is enabled writing the content to any output is far more perf than formatting the entry). another idea is to just capture the MqttTraceMessagePublishedEventArgs in publish but do the event invoke in a task.run that way consuming the log eg console/file write does not block the internal processing

chkr1011 commented 7 years ago

Thats a suitable idea but for now I will leave it in the way it is. But for a next big update I concider changing the logging stuff. But for now I assume that the performane will not be affected by logging anymore. On the other hand the logging is always on in my other applications but I am only processing info/warning/error messages. If I revert to string interpolation etc. the overhead is there again.

I have doubts starting a new task per every trace message because depending on client count lots of those messages are generated and written. I am afraid that this will drop the performance. In my opinion the user of the library should do this in a proper way. It is possible to start a thread from the event or writing to another collection and a single worker thread processes those messages. I think this considerations should be done by the user.

JanEggers commented 7 years ago

@chkr1011 fyi i found various places with the pattern:

            try
            {
                return DoStuffAsync();
            }
            catch (Exception exception)
            {
                throw new MqttCommunicationException(exception);
            }

that code is broken, if you return the task without awaiting it the catch will never catch anything as the exception will be thrown where the returned task is awaited

chkr1011 commented 7 years ago

A yes you are right. I will merge your performance pull request and then I will review those places...

chkr1011 commented 7 years ago

@JanEggers I have a problem migrating your changes for UWP. There is no "Peek" method available for the socket (Available property). So I cannot peek correctly.

I saw you used the following code for the WebSocketChannel for NetStandard:

        public int Peek()
        {
            return WebSocketBufferSize;
        }

How can this work and how can I adapt it for the UWP implementation?

JanEggers commented 7 years ago

@chkr1011 sry about that i didnt actually test websocket implementation. i will test my next buffering pr. that works with Streams all the way and will be much easier to port

chkr1011 commented 7 years ago

OK thanks. I will have a look. I also tested the performance app and I refactored it a little bit. Now I can see that 1000 messages are processed in about 100 ms. The calculated duration per single message is now at about 0.2 ms. So your requirement of 50.000 messages per second should be fulfilled :)

chkr1011 commented 7 years ago

I just merged your changes. Good job! And it is much faster now. I am going through your changes and refactoring a little bit. Mostly my Resharper style settils will reformat things a little bit. But here are some findings:

In MqttPacketDispatcher there is the Reset method but _packetByResponseType is not clear. I assume it should. I don't want to change it before yo ucan confirm that my assumption is correct.
```
public void Reset()
{
    lock (_syncRoot)
    {
        _receivedPackets.Clear();
    }

    _packetByIdentifier.Clear();
}
```
BufferConstants In my opinion this can be moved to client and server options. Because then the user can choose proper values for this. What do you think?
In TaskExtensions I have a warning from Resharper because the task.ContinueWith is not awaited. As far as I understand this is OK but then it should be disabled (the warning) or moved to a task? If this should be stay in this way (not with a task) I can ignore that via a comment.
Did you test the client with a different server than the one from this project? I have no one but I want to know if it works with a different one. The same for the WebSocket implementation. I don't know a server based on WebSockets.
UnitTest "MqttServer_Unsubscribe" is failing. Do you maybe know why? I don't have time yet to investigate in detail. In two days I can also have a look.

But I must say. Very good work! Performance is awesome now! I am now trying to migrate the new interfaces for the UWP implementations...

JanEggers commented 7 years ago

In MqttPacketDispatcher there is the Reset method but _packetByResponseType is not clear. I assume it should.

you are right i forgot that.

BufferConstants

i agree too, i just wanted to move it away from the platform specific libraries.

in TaskExtensions I have a warning from Resharper because the task.ContinueWith is not awaited.

yup that may not be awaited otherwise it will deadlock so disabling the warning is ok

Did you test the client with a different server

no i didnt but i guess azure iot hub would fit as alternative. i didnt use that one either https://github.com/mqtt/mqtt.github.io/wiki/servers make shure you get listed there as well

UnitTest "MqttServer_Unsubscribe" is failing.

i dont know yet but i will have a look.

Resharper

i use resharper too can you commit your settings as TeamShared. i tried to use your formatting but maybe i missed some options / i didnt want to change my default settings as they are applied to all the other work

Performance is awesome now!

thx i will continue too optimize but changes should be more isolated now.

JanEggers commented 7 years ago

im thinking of making a library on top of this one that can be used as signalr transport so i can use signalr programming model and scaleout to handle more messages / failover loadbalancer etc... but that will have to wait until signalr leaves alpha.

JanEggers commented 7 years ago

pr incoming with fixed test problem is that client and server both generated packageid 2, package identifer should be a guid to prevent that but i dont know where to put a change request for the protocol so i use a dictionaryof dictionary for lookup

chkr1011 commented 7 years ago

If you build your signalR library please let me know. I want to add it do the readme then (if you agree).

After adding this lib to the page you mentioned my account was "flagged". But now it works again. :-)

I will review the PR and merge it later today. Also the changes for the questions I mentioned above. The packet identifier is fixed to a short value. This is by protocol design and it is allowed that the ID is used multiple times (but it must be always the same for several chains of packets). So we have to deal with it.

I have no idea how to share the ReSharper settings. I have no special settings there. Just the default ones. So I have no config file!?

JanEggers commented 7 years ago

i put some more perf in it. when using a single client im at 150K msg/sec, that is ok for now. but there is some issue with multiple clients if i crank the testapp to more then 4 clients i get task cancellation exceptions. didnt figure out why yet.

chkr1011 commented 7 years ago

Where do you have the exception? At client or server? Maybe the errors is a server side!?

I have some problems with the "raw" stream. I cannot create it for UWP apps because there is either a stream for read or one for write. Also the BufferedStream is not available there.

I assume that I can assign the ReadStream as the RawStream because it is only used for reading if a timeout is set. Is that assumption correct? And I would like to rename it to "RawReceiveStream" because it makes more sense for me.

JanEggers commented 7 years ago

yes you can assign it directly, thats what i did in .net standard and for the websockets. For the renaming, I'm fine with that but i would add yet another stream property RawSendStream just for parity. My prefernece would be that the ICommunicationChannels just have the RawStreamProperties and buffering is done in the platform independent core library. but then we have to use a handmade implementation as the .net framework buffered stream is not in .net standard.

JanEggers commented 7 years ago

i did some research and BufferedStream is included in .net Standard 1.5 so we could improve if we bump up to .net standard 1.5, but i dont know if that is ok / what are the consequences for UWP

chkr1011 commented 7 years ago

I refactored some minor things and pushed them. Please pull develop branch.

I removed the ReceiveBuffer because in the end a new memory stream is created always.

OK lets rename both but we can call them "SendStream" and "ReceiveStream" and add buffering on top of hat in the core library. "Raw" has no meaning then.

I am also thinking about updating to .NET Core 2.0 because there is WebSocketsSecure available @nowakpiotr mentioned that. But it is fairly new...

Before upgrading lets fix one problem. I use the UWP test app and the server in this project. It works fine but after several seconds I get a strange error. I assume it is related to the new dispatcher. The problem seems to be that the cancellation exception is thrown even if the package arrives. You can see it here:

You can see that Ping/Pong does work a couple of times. Then the response message arrives but it is not dispatched (I added this new exception).

I checked the code but to me it seems to be OK. Do you habe any idea? Beside from that it looks good. I can also connect to Azure IoTHub with a secure channel.

chkr1011 commented 7 years ago

@JanEggers I found this article (https://blogs.msdn.microsoft.com/dotnet/2017/08/25/uwp-net-standard-2-0-preview/) saying that UWP is .netstandard 1.6 but there is a preview. I would like to wait until it is released and published for a certain time. Otherwise the library is not usable for old installations.

But we can stay at the two kinds of Streams for now. Then we can take advantage of buffering (but not on UWP) and keep support for older systems (It is not "old" because the 2.0 is only a preview).

Then we must also wait for WebSocketSecure protocol...

rw-nowakpiotr commented 7 years ago

I tested WSS with .net core 2.0 on raspberry pi and such a solution works well.

chkr1011 commented 7 years ago

But you run Linux or Windows IoT Core?

rw-nowakpiotr commented 7 years ago

Linux

chkr1011 commented 7 years ago

OK then it works. For UWP 2.0 is not available yet.

chkr1011 commented 7 years ago

@JanEggers I have some more problems with the TimeoutAfter method. I am awaiting it but the code continues without waiting. This breaks the communication. Maybe this is related to my mentioned error above!?

JanEggers commented 7 years ago

hi there i fixed the bug. timeout after is fine. i didnt commit changes in mqttclient.

in this place https://github.com/chkr1011/MQTTnet/blob/eda8bff10be341578736fc9405617350f7a92813/MQTTnet.Core/Client/MqttClient.cs#L339

it is crucial to begin to wait first and then send the packet otherwise the server may have send the response before the awaiter is set up.

JanEggers commented 7 years ago

I am awaiting it but the code continues without waiting.

it should work if your findings are true the unittests i added for the timeoutafter must fail. if you can repro add another test and ill make it green.

JanEggers commented 7 years ago

i downloaded uwp sdk but im not able to start the app :( i just get a popup: the app wasnt started.

JanEggers commented 7 years ago

this was what i was referring to btw:

https://github.com/aspnet/KestrelHttpServer/issues/1980

chkr1011 commented 7 years ago

@JanEggers Yes the code is awaiting.. I am not able to create a unit test which fails but I am not able to connect with the WebSocket test server from mosquitto (ws://test.mosquitto.org/mqtt).

Maybe you can try it. More info about the server: https://test.mosquitto.org/

The problem is that the code stays as "ConnectAsync" from the WebSocket channel but goes ahead and thus the stream is null. When I remove the "TimeoutAfter" at the "ConnectAsync" then the Connect stays forever. I don't know if the server address is wrong or if it is related to a UWP app with a UI thread etc!?

I will still try to find the issue...

chkr1011 commented 7 years ago

@JanEggers I tested the WebSocket channel and it works without problems.

But there is one issue left. Maybe you have some ideas: If you use a random value for the server which does not exist like "abc" the code throws a nullref execption instead of a socket exception. This is really bizzare.

Can you have a look as well? Maybe you have more luck than me :-)

But please pull develop branch before.

chkr1011 commented 7 years ago

@JanEggers I was finally able to locate the issue in TimeoutAfter. I also added a new Unit Test for that case but I ended up implementing the method in a different way. Your original implementation is still there but commented out. Do you want to have a look?

I also tested the library with both TCP and WebSocket connections and everything works fine. If you have no additions I will release it soon. Please let me know.

Best regards Christian

JanEggers commented 7 years ago

Hi Christian,

the new implementation looks like the original I tried to replaced. So it will use much memory for pending delay tasks. (they are not canceled in case the real work is done). il try to fix my implementation.

chkr1011 commented 7 years ago

I just saw that I broke it again. Please pull develop to get the working version.

Isn't it enough to add the cancellation toklen like in your implementation to the delay task and cancel it if the other task has finished first?

JanEggers commented 7 years ago

i created a pr, implemented like u adviced, that code is simpler than what i did. i also added a test and resharper settings. if you want to update them to match yours you can do it in resharper options. just save them as team shared

JanEggers commented 7 years ago

regarding test regression: maybe you should look into ci build, there should be free options in github marketplace.

chkr1011 commented 7 years ago

I played around with travis for this project but I was not able to build it yet. So I disabled that integration. But I will give it a next try soon. Maybe after the next release. Or are you familiar with the CI system? I already added a travis.yml but It does not work.

chkr1011 commented 7 years ago

@JanEggers I completed all my tests with client and server and I am ready to release a new version. If you agree and there is nothing open from your side please close this ticket. Then I know that there is nothing more.

Thank you for your big contribution. It is a massive improvement for this project. I would appreciate if I see more PRs from you in the future.

Best regards Christian

JanEggers commented 7 years ago

@chkr1011 : go ahaid. if there is more ill just create a new issue / pr

dotnet / MQTTnet

Bad Performance #31