apache / jmeter

Apache JMeter is an open-source load testing tool for analyzing and measuring the performance of a variety of services
https://jmeter.apache.org/
Apache License 2.0

Add 'Start from line', 'Step by' parameters to CSV Dataset Config, also enabling it to read random lines #3429

Open · asfimport opened 9 years ago

asfimport commented 9 years ago

Flavio Cysne (Bug 56961): These parameters can help give better control over test data. For distributed environments they will be even more valuable, since they can avoid the need to split the CSV data file across JMeter slaves.

'Start from line' defines the line in the CSV data file from which JMeter will start reading when the first thread requires it. Defaults to 1 (one). If set to 0 or below, reading starts from a random line.

'Step by' defines how many lines to advance between reads of the CSV file. For example, if 'Step by' is set to 5, the JMeter CSV reader will read lines 1, 6, 11 and so on. If this parameter is set to 0 (zero), no further lines will be read.

'Read random lines' defines whether JMeter reads lines from the CSV file sequentially (false, the default) or randomly (true). This parameter would be a combination of the two parameters above. If 'Start from line' is set to 0 or less, the JMeter CSV reader will read a random line from the CSV file for its first thread. If 'Step by' is set to -1 (or any negative number, to prevent errors), a random line will be read from the CSV file every time one is required.

OS: All
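
To make the proposed semantics concrete, here is a minimal sketch (not JMeter code; the class name, field names, and data.csv are assumptions) of reading a file with 'Start from line' and 'Step by' applied:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SteppedCsvReader {
    public static void main(String[] args) throws IOException {
        int startFromLine = 1; // proposed 'Start from line' (1-based)
        int stepBy = 5;        // proposed 'Step by'

        try (BufferedReader reader = Files.newBufferedReader(Path.of("data.csv"))) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                boolean isStart = lineNumber == startFromLine;
                // With 'Step by' of 0, only the starting line is read;
                // otherwise every stepBy-th line after it is read too.
                boolean isStep = stepBy > 0 && lineNumber > startFromLine
                        && (lineNumber - startFromLine) % stepBy == 0;
                if (isStart || isStep) {
                    System.out.println(line); // in JMeter this would populate thread variables
                }
            }
        }
    }
}
```

With startFromLine = 1 and stepBy = 5 this prints lines 1, 6, 11 and so on, matching the example above.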

asfimport commented 9 years ago

Sebb (migrated from Bugzilla): (In reply to Flavio Cysne from comment 0)

'Read random lines' defines whether JMeter reads lines from the CSV file sequentially (false, the default) or randomly (true). This parameter would be a combination of the two parameters above. If 'Start from line' is set to 0 or less, the JMeter CSV reader will read a random line from the CSV file for its first thread. If 'Step by' is set to -1 (or any negative number, to prevent errors), a random line will be read from the CSV file every time one is required.

This is going to be expensive to implement, as it's not possible to skip to the start of a random line within a file; in general you can only skip to a random byte. Far better to randomise the file before starting the test.
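
Randomising the file up front is cheap to sketch, assuming the data fits in memory (file names here are placeholders):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

public class ShuffleCsv {
    public static void main(String[] args) throws IOException {
        // One-off shuffle before the test starts, so there is no cost at runtime.
        // If the file has a header row, it would need to be set aside first.
        List<String> lines = Files.readAllLines(Path.of("data.csv"));
        Collections.shuffle(lines);
        Files.write(Path.of("data-shuffled.csv"), lines);
    }
}
```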

asfimport commented 9 years ago

Sebb (migrated from Bugzilla): (In reply to Flavio Cysne from comment 0)

These parameters can help give better control over test data. For distributed environments they will be even more valuable, since they can avoid the need to split the CSV data file across JMeter slaves.

However the slaves will still need to be configured with different starting lines. This means each slave will need a different config anyway.

asfimport commented 9 years ago

Flavio Cysne (migrated from Bugzilla): Yes, each slave will have to have its own '-J' parameter set to distinguish it from the others.
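
For example (a sketch; the property name startLine is an assumption), the proposed 'Start from line' field could reference a JMeter property via the __P function, and each slave could then be started with a different value:

```
In the test plan, e.g. in the 'Start from line' field:
    ${__P(startLine,1)}

Starting each slave with its own value:
    jmeter-server -JstartLine=1
    jmeter-server -JstartLine=2
```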

The 'Step by' parameter will rarely need to differ among JMeter slaves, but that will be the tester's choice.

I'm thinking about how to ease the process of distributing CSV files across the distributed environment without the need to split or randomise the main CSV file into as many files as there are slaves.

With this I could have many JMeter slaves on the same machine and, even though they read the same file, the lines read would be distinct. Even if the slaves were on different machines, I could have them reading the same file (pointing to the same file, or to copies with the same content) and each thread of each slave would get a different set of values from it.

The randomness is a condition that makes JMeter read a random number of lines before actually assigning the values to variables.

asfimport commented 9 years ago

Sebb (migrated from Bugzilla): (In reply to Flavio Cysne from comment 3)

The randomness is a condition that makes JMeter read a random number of lines before actually assigning the values to variables.

The user would have to provide the maximum random number, or the code would have to read the file once to determine how many lines there were. Unless you just mean it should skip a random number of lines from the current position, though there is still the issue of the maximum possible skip value.

Though I suppose it might be possible to seek to a random byte offset in the file and then find the next line boundary (not 100% sure that's always possible with multi-byte characters). If the records varied much in length, this would not give a very even distribution; it would favour the record that follows a long record.
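
A rough sketch of that byte-offset approach (using RandomAccessFile.readLine, which only handles single-byte encodings, so multi-byte characters are indeed a problem):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.concurrent.ThreadLocalRandom;

public class RandomLineSeek {
    static String randomLine(RandomAccessFile file) throws IOException {
        long offset = ThreadLocalRandom.current().nextLong(file.length());
        file.seek(offset);
        file.readLine();               // discard the (likely partial) line we landed in
        String line = file.readLine(); // the next full line; records after long records win more often
        if (line == null) {            // landed in the last line: wrap to the start
            file.seek(0);
            line = file.readLine();
        }
        return line;
    }

    public static void main(String[] args) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile("data.csv", "r")) {
            System.out.println(randomLine(file));
        }
    }
}
```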

I don't see a use case for this that cannot be solved much more simply by randomising the input before the test starts.