softwaremill / mqperf

https://softwaremill.com/mqperf/
Apache License 2.0

ArtemisMQ #39

Closed michaelandrepearce closed 7 years ago

michaelandrepearce commented 7 years ago

JMS only needs the last message to be acked. There is no need to ack each one individually, since all messages up to that point will be acked.

michaelandrepearce commented 7 years ago

@adamw after going through the code, I noticed that in the JMS ack you ack each message. You only need to ack the last one; as per the JMS spec, this means all messages up to that one will be acked. I have changed the JMS code to do this, which should improve things a little.
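To illustrate the point about acking only the last message, here is a minimal sketch. It does not use the real `javax.jms` API or the mqperf code: `Session` and `Msg` are hypothetical stand-ins that only model the client-acknowledge rule (acknowledging one message acknowledges everything the session has consumed so far), so that the example runs without a broker.

```java
// Hypothetical in-memory model of a client-acknowledge session.
// NOT the javax.jms API; names and fields are assumptions for illustration.
class AckSketch {

    static final class Session {
        private int delivered = 0; // messages handed to the consumer so far
        private int ackedUpTo = 0; // messages acknowledged so far
        private int ackCalls = 0;  // number of acknowledge calls (round trips)

        Msg receive() {
            delivered++;
            return new Msg(this, delivered);
        }

        int unacked() {
            return delivered - ackedUpTo;
        }

        int ackCalls() {
            return ackCalls;
        }
    }

    static final class Msg {
        private final Session session;
        final int seq;

        Msg(Session session, int seq) {
            this.session = session;
            this.seq = seq;
        }

        // Acknowledging any message acknowledges all messages
        // the session has delivered so far.
        void acknowledge() {
            session.ackedUpTo = session.delivered;
            session.ackCalls++;
        }
    }

    // Original approach: one acknowledge call per received message.
    static int ackEach(Session s, int batchSize) {
        for (int i = 0; i < batchSize; i++) {
            s.receive().acknowledge();
        }
        return s.ackCalls();
    }

    // Patched approach: receive the whole batch, acknowledge only the last message.
    static int ackLast(Session s, int batchSize) {
        Msg last = null;
        for (int i = 0; i < batchSize; i++) {
            last = s.receive();
        }
        if (last != null) {
            last.acknowledge();
        }
        return s.ackCalls();
    }

    public static void main(String[] args) {
        Session a = new Session();
        Session b = new Session();
        System.out.println("ack each: calls=" + ackEach(a, 10) + " unacked=" + a.unacked());
        System.out.println("ack last: calls=" + ackLast(b, 10) + " unacked=" + b.unacked());
    }
}
```

Both approaches leave no unacknowledged messages in this model, but the second issues a single acknowledge call per batch instead of one per message.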

softwaremill-ci commented 7 years ago

Can one of the admins verify this patch?

adamw commented 7 years ago

So no need to set the non-blocking acks?

I won't manage to run the tests today, hopefully early next week. Thanks for the patch!

michaelandrepearce commented 7 years ago

@adamw just this patch for now. I don't want to make too many changes in one go, as obviously it's very difficult to understand the effect without having your environment to reproduce the results (I don't have an AWS account or Ansible, so I can't just spin it up myself).

The core issue is that the consumer is not consuming fast enough. Ideally, the real solution here would be to turn on auto acknowledge (with dups ok) and set a message listener. The auto ack occurs after onMessage completes, which gives you the semantics you want but with better performance, since the unit of work is done inside the message listener and delivery is truly async. This would be a bit of a rewrite of your code, though.
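The listener-based flow described above can be sketched roughly as follows. This is not JMS code: `deliver` is a hypothetical helper that models a provider dispatching messages to a listener on its own thread and counting an automatic ack only after the listener returns normally. A `Consumer<String>` stands in for a message listener's onMessage.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical model of listener-based delivery with automatic acks.
// NOT the javax.jms API; all names are assumptions for illustration.
class ListenerSketch {

    // Dispatches each message to the listener on a dispatcher thread and
    // returns how many messages were auto-acked.
    static int deliver(List<String> messages, Consumer<String> listener) {
        AtomicInteger acked = new AtomicInteger();
        ExecutorService dispatcher = Executors.newSingleThreadExecutor();
        for (String m : messages) {
            dispatcher.execute(() -> {
                try {
                    listener.accept(m);      // the unit of work runs inside the listener
                    acked.incrementAndGet(); // auto-ack only after the listener returns
                } catch (RuntimeException e) {
                    // Listener failed: no ack in this model.
                    // A real provider would typically redeliver the message.
                }
            });
        }
        dispatcher.shutdown();
        try {
            dispatcher.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acked.get();
    }

    public static void main(String[] args) {
        int acked = deliver(Arrays.asList("a", "b", "c"),
                m -> System.out.println("processed " + m));
        System.out.println("auto-acked: " + acked);
    }
}
```

The calling code never acks explicitly; the ack is tied to the successful completion of the listener, which is the property the comment relies on.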

adamw commented 7 years ago

Ah, this helped a lot :) Now with 4 nodes, 5 threads I'm getting about 51k msgs/second both sent & received (https://snapshot.raintank.io/dashboard/snapshot/loA90B4aYPe5KcagVMea5wzltNP0lDoV). That's with journal-datasync=false, memory mapped, transactional sends.

Using non-transactional sends brings the performance down to 15k msgs/s. As I understand, the commit only returns once the data has been stored and replicated on both nodes?

Interestingly, with journal-datasync=true, I get the same result. I'll post a correction for the article tomorrow, thanks for the input!

michaelandrepearce commented 7 years ago

Yes, when commit returns, everything has met your required replication and disk-write guarantees.

michaelandrepearce commented 7 years ago

Yes, because we have datasync turned off at three different points.

michaelandrepearce commented 7 years ago

(But it's best to ensure all three are off)

clebertsuconic commented 7 years ago

@adamw: It would be nice to update with Artemis 2.2.0 (released today; updating the website now).

michaelandrepearce commented 7 years ago

@adamw just one other thing, now that the result is so good: I'm not sure what the outcome will be, but could you try a 6 and an 8 client node setup, and maybe also a 25-thread one, similar to the Kafka test?

michaelandrepearce commented 7 years ago

@adamw did you do an update?

adamw commented 7 years ago

Yes, pushed a couple of minutes ago. Was about to write you that they are up :)

adamw commented 7 years ago

And it's there: https://softwaremill.com/mqperf/ Hope you'll be satisfied :)

Thanks for the configuration improvements!

clebertsuconic commented 7 years ago

👍

clebertsuconic commented 7 years ago

did you have a chance to use Artemis 2.2.0?

adamw commented 7 years ago

yes, these are using 2.2.0

michaelandrepearce commented 7 years ago

@adamw great to see this is good. Let's leave it there. (Obviously we can always tweak more out of it if needed.)

One thing to mention: Artemis can do multi-master if you need to scale horizontally. I'm not sure whether you're just benchmarking or looking to use Artemis in your own real environments. If it's the latter and you need further details on this setup, do feel free to shout; I'm sure we can help. Very much agreed that this needs better documentation and an easier setup. I'll raise this on the Artemis community mailing lists as feedback.

michaelandrepearce commented 7 years ago

@adamw just reading through the summary texts, one bit looks like it may need a small rewrite:

Finally, the processing latency has a wider distribution across the brokers. Usually, it's below 150ms - with Kafka, Artemis, ActiveMQ and SQS faring worse under high load:

I think this should be updated, as it no longer holds for Artemis since the retest.

Likewise, I think this needs a bit of rephrasing:

When looking only at the throughput, Kafka is a clear winner (unless we include SQS with multiple nodes, but as mentioned, that would be unfair):

adamw commented 7 years ago

@michaelandrepearce already fixed, forgot to update that part :)

We're both doing a benchmark and also evaluating solutions for future use. Thanks for the offer - we might use it some day :). Definitely would be great to clarify the docs a bit, sometimes I get lost there :)