apache / jmeter

Apache JMeter is an open-source load testing tool for analyzing and measuring the performance of a variety of services
https://jmeter.apache.org/
Apache License 2.0

New listeners optimized to work with huge amounts of data and high throughput, plus an optimized DistributionGraph #2279

Closed. asfimport closed this issue 14 years ago

asfimport commented 15 years ago

Jakub (Bug 47865): The package contains two listeners with functionality similar to Aggregate Report, but designed to work well with huge amounts of data (millions of samples) and high throughput (hundreds of requests per second), plus an optimized DistributionGraph. The code requires at least Java 5.0.

Details:

Created attachment jmeterListeners.zip: Two new listeners (similar to AggregateReport but much faster) and an optimized version of DistributionGraph

Votes in Bugzilla: 1. OS: Windows XP

asfimport commented 15 years ago

Jakub (migrated from Bugzilla): JUnit tests for StatCalculator, UpgradedStatCalculator and BacketStatCalculator

Created attachment junit.zip: JUnit tests for the statistics calculators

asfimport commented 15 years ago

Sebb (migrated from Bugzilla): As far as I can tell, the BacketStatCalculator assumes that all Numbers are positive and no larger than Integer.MAX_VALUE - is that correct?

asfimport commented 15 years ago

Jakub (migrated from Bugzilla): Yes, that's correct. If this is a problem, a HashMap&lt;Number, Long&gt; could be used instead of an array as the internal data container, but performance would be lower; I chose an array because performance was my target. If this listener (or DistributionGraph) is meant for long (even multi-day) stress tests with high throughput, you can assume the response time is very low, probably less than 1 s, so the listener will work correctly and efficiently. BacketStatCalculator is optimized to serve as the data model for DistributionGraph and doesn't slow the test down, because it reports the number of occurrences of each sample (value) without any extra calculation.
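
For illustration, here is a minimal sketch of the bucket idea described above; the class and method names are hypothetical and not taken from the attached patch. The calculator keeps one counter per possible response-time value (in ms), so recording a sample is a single array increment and no per-sample object is allocated.

```java
// Hypothetical sketch of an array-backed bucket counter (not the attached code).
public class BucketCounter {
    private final long[] counts;   // index = response time in ms
    private long total;

    public BucketCounter(int maxResponseTimeMs) {
        counts = new long[maxResponseTimeMs + 1];
    }

    /** Record one sample; assumes 0 <= responseTimeMs < counts.length. */
    public void addSample(int responseTimeMs) {
        counts[responseTimeMs]++;
        total++;
    }

    /** Number of samples that took exactly responseTimeMs milliseconds. */
    public long occurrencesOf(int responseTimeMs) {
        return counts[responseTimeMs];
    }

    public long totalSamples() {
        return total;
    }
}
```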

BTW, I created all these listeners because, while testing my application, throughput started falling from 800 req/s to 300 req/s after about 2 minutes, once JMeter had collected roughly 100 000 samples. At first I thought it was a problem with my application, but then I realized that when I cleared the collected samples in JMeter, throughput went back to about 800 req/s (for another 2 minutes or so ;)). The Aggregate Report was the bottleneck, but the 90% line is very useful and I need it (I also need the 99% and 99.9% lines), so I created UpgradedAggregateReport, which calculates an approximation of the 90, 99 and 99.9% lines. Then I realized I could calculate the exact values of those lines, and I created BacketStatCalculator (its performance may be somewhat lower than UpgradedStatCalculator's).
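
As a sketch of how exact percentile lines can be read from such per-millisecond counters (continuing the hypothetical BucketCounter above, not the actual UpgradedAggregateReport or BacketStatCalculator code): walk the counters in order until the cumulative count reaches the requested fraction of the total.

```java
// Hypothetical method for the BucketCounter sketch above: returns the smallest
// response time t such that at least `percentile` (e.g. 0.90, 0.99, 0.999)
// of all recorded samples are <= t.
public int percentileLine(double percentile) {
    long threshold = (long) Math.ceil(percentile * total);
    long cumulative = 0;
    for (int t = 0; t < counts.length; t++) {
        cumulative += counts[t];
        if (cumulative >= threshold) {
            return t;
        }
    }
    return counts.length - 1; // every sample fell below the array's upper bound
}
```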

Just to summarize: before I made any optimizations, JMeter with 30 threads could generate about 2000 req/s against my application (and throughput dropped after a few minutes); after optimizing the listeners, the JavaScript integration, the collections used and the thread synchronization in the JMeter code, JMeter can generate about 7500 req/s. Nowadays I can run tests that last 48 h or more and collect 600 000 000 samples, and I save 3 servers (one machine can now generate the load that previously required 4 machines).

asfimport commented 15 years ago

Sebb (migrated from Bugzilla): Thanks. Given that these calculations are all done on response time (milliseconds), I agree that the limitation is not likely to cause a problem.

BTW, I just realised that one could save memory at the expense of some accuracy by using centiseconds instead of milliseconds (i.e. divide the response time by 10 before storing in the bucket). One could even scale the divisor to keep the rounding roughly proportional to the value, but there would be the extra cost of checking the value.

asfimport commented 15 years ago

Jakub (migrated from Bugzilla): I think memory is not a problem, and dividing the response time by 10 would reduce performance (better to use 8 or 16, since the division is then faster). In my case it is important to know whether the response time was 1 ms or 9 ms, 10 ms or 19 ms; with centiseconds, all responses from 0-9 ms would go into the first bucket and 10-19 ms into the second, so that detail would be lost, which is why I prefer 1 ms buckets. Regarding memory, let's assume the longest response time is 1 min (I don't know who would wait that long for a response, but maybe there are patient people ;)); the bucket array would then occupy 1 * 60 * 1000 * 8 = 480 000 bytes, about 468 KB, so less than 0.5 MB. One important thing to notice is that this array does not grow during the test no matter how many samples are collected, so it makes no difference whether 1 000 or 10 000 000 samples are collected, as long as all of them are below 1 min. Another important thing is that, because this is an array, access to each element is fast, so incrementing the bucket counters is fast.

But maybe it would be useful to have a GUI option for the bucket size: some would choose 1 ms (like me), others perhaps 8 or 16 ms.
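
A hedged sketch of what such a configurable bucket size could look like (hypothetical, not part of the attachment): the width acts as the divisor suggested above, and a power-of-two width such as 8 or 16 lets the JIT turn the division into a shift.

```java
// Hypothetical sketch: configurable bucket width, not part of the patch.
// widthMs = 1 keeps exact 1 ms buckets; widthMs = 10 gives centisecond
// buckets; widthMs = 8 or 16 allows the division to become a shift.
// With 1 ms buckets and a 60 s ceiling the counter array needs
// 60 000 longs = 480 000 bytes (under 0.5 MB), independent of sample count.
public static int bucketIndex(int responseTimeMs, int widthMs) {
    return responseTimeMs / widthMs;
}
```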

asfimport commented 14 years ago

Sebb (migrated from Bugzilla): See https://github.com/apache/jmeter/issues/2314 (Improve StatCalculator performance) and the changes in http://svn.apache.org/viewvc?rev=891076&view=rev

These have vastly improved the performance of Aggregate Report and many other Visualizers.

Having made those changes, I now realise that using TreeMap to store aggregated response times is a similar solution to the bucket approach. It has the advantage that the number of buckets does not have to be specified in advance.
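
For comparison, a minimal sketch of a sorted-map aggregation in that spirit (illustrative only, not the actual StatCalculator change from r891076): each distinct response time maps to its occurrence count, so no bucket range has to be fixed in advance, and percentile lines can still be read off in key order.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of TreeMap-based aggregation of response times.
public class TreeMapAggregator {
    private final TreeMap<Long, Long> counts = new TreeMap<Long, Long>();
    private long total;

    public void addSample(long responseTimeMs) {
        Long c = counts.get(responseTimeMs);          // null if unseen so far
        counts.put(responseTimeMs, c == null ? 1L : c + 1);
        total++;
    }

    /** Smallest response time t with at least `percentile` of samples <= t. */
    public long percentileLine(double percentile) {
        long threshold = (long) Math.ceil(percentile * total);
        long cumulative = 0;
        for (Map.Entry<Long, Long> e : counts.entrySet()) {
            cumulative += e.getValue();
            if (cumulative >= threshold) {
                return e.getKey();
            }
        }
        return counts.isEmpty() ? 0 : counts.lastKey();
    }
}
```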

So I think the changes proposed here are probably no longer necessary.