apache / jmeter

Apache JMeter open-source load testing tool for analyzing and measuring the performance of a variety of services
https://jmeter.apache.org/
Apache License 2.0

JDBC sampler : Add hashing of Data to avoid storing all output into memory when result is arbitrarily large #1896

Open asfimport opened 17 years ago

asfimport commented 17 years ago

Nathan Bryant (Bug 41921): JDBCSampler (and, I presume, other samplers) stores all the output received from its test action. Example:

Data data = getDataFromResultSet(rs);
res.setResponseData(data.toString().getBytes());

This is poor software design because the data could be arbitrarily large and fill memory. It is causing OutOfMemoryErrors for us, even with not very many threads. This is major or even critical because it prevents JMeter from being used to generate significant load. All samplers should be rewritten to just build an MD5 hash iteratively. The hash should be updated one buffer or row at a time instead of in bulk.
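A minimal sketch of the iterative hashing requested above, assuming the response arrives as a sequence of byte chunks. The class `IncrementalMd5` and its method `md5Hex` are hypothetical names for illustration and are not part of JMeter; only `java.security.MessageDigest` is a real API here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not JMeter code: hashes a response one chunk at a time.
final class IncrementalMd5 {

    /** Feeds each chunk to the digest as it arrives; no chunk is retained afterwards. */
    static String md5Hex(Iterable<byte[]> chunks) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        for (byte[] chunk : chunks) {
            md.update(chunk); // only the digest's fixed-size internal state grows
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

Feeding the chunks "a" and "bc" gives the same digest as hashing "abc" in one call, so memory use stays bounded by the largest single chunk rather than by the whole response.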

OS: All

asfimport commented 17 years ago

Sebb (migrated from Bugzilla): The full sample results are needed for some purposes - e.g. the Tree View Listener can display the results of an HTTP Sample, and Assertions need the response to be present - so it would not make sense to always throw away the response data.

And the data needs to be retrieved, otherwise the sample time will not be representative.

However sometimes it is not ideal to store all the response data.

As a workaround you could perhaps do one of the following:

As to how to fix this: there could be an option to limit the size of the stored data. That should be fairly easy to do.
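One way such a size limit might look, as a sketch only: every byte is still read (so the sample time stays representative), but only the first `maxStoredBytes` bytes are kept. `BoundedResponseBuffer` and its methods are hypothetical names, not an existing JMeter class or option.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not JMeter code: keeps at most maxStoredBytes of the response.
final class BoundedResponseBuffer {
    private final int maxStoredBytes;
    private final ByteArrayOutputStream stored = new ByteArrayOutputStream();
    private long totalBytes;

    BoundedResponseBuffer(int maxStoredBytes) {
        if (maxStoredBytes < 0) {
            throw new IllegalArgumentException("maxStoredBytes must be >= 0");
        }
        this.maxStoredBytes = maxStoredBytes;
    }

    /** Counts every byte seen, but stores only what still fits under the limit. */
    void append(byte[] chunk) {
        totalBytes += chunk.length;
        int room = maxStoredBytes - stored.size();
        if (room > 0) {
            stored.write(chunk, 0, Math.min(room, chunk.length));
        }
    }

    byte[] storedBytes() { return stored.toByteArray(); }

    long totalBytes() { return totalBytes; }

    boolean truncated() { return totalBytes > stored.size(); }
}
```

Keeping `totalBytes` separately means a listener could still report the real response size even when the stored data was cut short.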

asfimport commented 17 years ago

Nathan Bryant (migrated from Bugzilla): An MD5 or similar hash would be preferable over just storing part of the data, for people who are using the data for functional testing. Then they could compare everything for identity at least. I'm not doing functional testing so I don't care, but I would recommend adding a configuration checkbox for an MD5 mode.

asfimport commented 16 years ago

Sebb (migrated from Bugzilla): I've been looking into how to add hashing to the JDBC sampler.

It would be easy enough to collect all the response data and convert it to a hash just before storing it. Would that be enough for your tests? The disadvantage of this approach is that JMeter would need enough memory to store the whole response - but at least it would be only temporary.

A better solution would be to hash the data as it is retrieved. However this is not particularly easy to do, as the data is all fetched and then formatted into lines and columns.

Also, is it important that the hash is the same as the one that would be obtained by hashing the result data after download? Or does it just need to contain all the response data in a predictable order? This would be easier to do, as there would be no need for the second formatting stage.
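The second option (a hash that covers all the data in a predictable order, without the formatting stage) could be sketched as follows. `RowHasher` is a hypothetical name, and the separator bytes are an assumption made here so that column and row boundaries affect the hash; the result will not match a hash of the formatted output.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not JMeter code: hashes column values in retrieval order.
final class RowHasher {
    private static final byte COLUMN_SEPARATOR = 0x1f; // ASCII unit separator (assumed choice)
    private static final byte ROW_SEPARATOR = 0x1e;    // ASCII record separator (assumed choice)
    private final MessageDigest digest;

    RowHasher() {
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Adds one row; each value is hashed immediately and then dropped. */
    void addRow(Object... columns) {
        for (Object column : columns) {
            digest.update(String.valueOf(column).getBytes(StandardCharsets.UTF_8));
            digest.update(COLUMN_SEPARATOR);
        }
        digest.update(ROW_SEPARATOR);
    }

    /** Completes the hash; the hasher should not be reused afterwards. */
    byte[] finish() {
        return digest.digest();
    }
}
```

In a sampler, `addRow` would be called once per `ResultSet` row inside the fetch loop, so only one row needs to be held in memory at a time.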

Any other suggestions for how to process the JDBC data are welcome...

asfimport commented 15 years ago

Gregg (migrated from Bugzilla): For my own curiosity, what magnitude of data is being dealt with here? Are we talking hundreds of megabytes? Gigabytes? Tens or hundreds of gigabytes? The reason I ask is because my first thought was to simply have the user increase the maximum heap size of the JVM. What is the user currently using as the maximum heap size?

asfimport commented 12 years ago

@pmouawad (migrated from Bugzilla): Still missing in 2.5.1

asfimport commented 12 years ago

Evan M (migrated from Bugzilla): Gregg: Increasing the JVM memory does not help. The order of magnitude is gigabytes of data for me, but it doesn't really matter, because the application just ramps up memory until it runs out. I should be able to run a test for an arbitrarily long amount of time if I don't need to store the result data.

For my use case, I want to test the maximum throughput of a large select statement from my webserver to my database, but the application caps out its memory before I can get any useful data. If I don't have any listeners that need the response data, it should not be cached.

I am having this issue running 2.7 r1342410 on Windows Server 2008.

asfimport commented 5 years ago

Franz Schwab (migrated from Bugzilla): How about enhancing the JDBC sampler so that it can discard a certain number of rows?

I am thinking of enhancing it along the lines of this answer: https://stackoverflow.com/questions/43901408/jmeter-jdbc-sampler-fails-on-large-resultset

Any feedback on this before I start working on it?

asfimport commented 5 years ago

@pmouawad (migrated from Bugzilla): Hi Franz, thanks for contributing.

What is your use case? The SO answer seems to fetch only the first row, right?

Regards

asfimport commented 5 years ago

Franz Schwab (migrated from Bugzilla): Hi Philippe! Thank you for your quick response.

My use case is database load testing. In 99% of the cases where I use the JDBC sampler, I am only interested in the time it took the database to run the query, not in the time it took the client (JMeter in this case) to fetch the result set. A BI client, for example, might run a query with a big result set but fetch only the first 100 rows. Currently, there is no option in JMeter to do so. Even when you set the "Count Records" option in JMeter, the whole result set is fetched (in order to count the rows). There is no way to get the result set size without fetching it (this is standard JDBC behaviour). Adding a LIMIT clause at the end of the query is not an option, as databases might optimize for it. For the same reason, the JDBC method Statement.setMaxRows(int) is not an option either. I am only interested in knowing that the query has been processed successfully (i.e. it didn't throw an error).
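The client-side behaviour described here (run the full query, then stop reading after a fixed number of rows) could be sketched as below. `PartialFetch.fetchAtMost` is a hypothetical helper and not the code from the linked commit; it uses only `ResultSet.next()` from the standard JDBC API. How many rows the driver actually transfers also depends on its fetch size, which this sketch does not control.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not JMeter code: reads at most maxRows rows, then stops.
final class PartialFetch {

    /**
     * Advances the cursor up to maxRows times and returns how many rows were read.
     * The query itself is unchanged (no LIMIT, no setMaxRows); the remaining rows
     * are simply never requested by this loop.
     */
    static int fetchAtMost(ResultSet rs, int maxRows) throws SQLException {
        int fetched = 0;
        while (fetched < maxRows && rs.next()) {
            fetched++;
        }
        return fetched;
    }
}
```

The caller would still need to close the `ResultSet` and `Statement` afterwards so the database can release the cursor.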

Yes, the code provided in the S.O. link only fetches one row.

Best regards, Franz

asfimport commented 5 years ago

Franz Schwab (migrated from Bugzilla): current work status from my side: https://github.com/frschwab/jmeter/commit/ca96394e0e4913f7b2407a6bcf7f843d92959310

still need to update documentation.

thanks for any comments!

asfimport commented 5 years ago

@pmouawad (migrated from Bugzilla): (In reply to Franz Schwab from comment 10)

Thanks for the contribution. Would it be possible to also add JUnit test code?

Thanks

asfimport commented 5 years ago

Franz Schwab (migrated from Bugzilla): Yes, I can do that. Are all the tests written in Groovy? I have only had a quick look at the code.

Could you also have a look at this one, as nobody replied yet: https://github.com/apache/jmeter/issues/5059

Thanks for feedback!

Franz

asfimport commented 5 years ago

@pmouawad (migrated from Bugzilla): (In reply to Franz Schwab from comment 12)

I have reviewed it and left a comment

asfimport commented 5 years ago

@pmouawad (migrated from Bugzilla): (In reply to Philippe Mouawad from comment 13)

You can write tests using either the Spock Framework with Groovy or JUnit 4.

asfimport commented 5 years ago

@pmouawad (migrated from Bugzilla): Hello Franz,

Will you submit a PR? Thanks

asfimport commented 5 years ago

Franz Schwab (migrated from Bugzilla): Hi Philippe,

Yes, I still want to contribute a PR. I just haven't found the time yet to write some basic tests. By when would I need to submit it for the PR to make the next release?

By the way, I just realised that this issue is about hashing the data (on the client side), not about limiting the transfer of the result set from server to client (which is what I implemented). Should I create a new issue and continue there?

asfimport commented 5 years ago

@pmouawad (migrated from Bugzilla): (In reply to Franz Schwab from comment 16)

Thanks, yes please create another issue. Thanks

asfimport commented 5 years ago

Franz Schwab (migrated from Bugzilla): Ok, I created a new bug: https://github.com/apache/jmeter/issues/5118