Read computes bandwidth using amount written, not read

We have the following statement / question with regard to fs_test. Can you clarify for us?

Our benchmarkers discovered what they believe is a bug while validating our benchmark results using fs_test. Can you let us know if this is a bug and how you would want to handle this situation?

We noticed this comment at the start of fs_test.c: TODO: read should be affected by time limit also pass min and max blocks when time limit used

Looking further, we noticed there is a “compute_amount_written” function, but no corresponding function to compute the amount read. It appears that all the bandwidth numbers printed in the read output section are actually based on the amount of data written.

This is only accurate if all the data that was written gets read. The command line options specified in the RFP cause fs_test to write and read for the same amount of time. For a variety of reasons, the read test may hit the time limit before all the data that was written gets read. In that case, the actual read rate would be slower than the write rate, but fs_test will report the Effective Bandwidth of the two tests to be essentially the same since the max write and max read times will be nearly identical.
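To make the discrepancy concrete, here is a minimal sketch of the arithmetic. The helper name `bandwidth_mib_s` and the numbers in the note below are made up for illustration; none of this is taken from fs_test.c.

```c
#include <assert.h>

/* Hypothetical helper (not from fs_test.c): bandwidth in MiB/s for a
 * given number of bytes moved in a given number of seconds. */
static double bandwidth_mib_s(double bytes, double seconds) {
    assert(seconds > 0.0);  /* a phase must take some time */
    return bytes / (1024.0 * 1024.0) / seconds;
}
```

With 1 MiB objects and a 300 second limit on both phases, suppose 60,000 objects are written but only 36,000 are read back before the limit. Feeding the written amount into the read calculation gives `bandwidth_mib_s(60000.0 * 1048576.0, 300.0)`, or 200 MiB/s, the same as the write figure. Using the amount actually read gives `bandwidth_mib_s(36000.0 * 1048576.0, 300.0)`, or 120 MiB/s.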

Is this the intended behavior, or should fs_test compute the actual amount of data read during a fixed time test in order to print the accurate read bandwidth?

Yeah. I just looked at the code. In time limit mode, there is a bug: if the reads exceed the time limit, they are cut off, but the bandwidth is still computed from the larger amount of data that was written. This is unlikely to occur in practice since writes are usually slower than reads, but it's a bug. Two ways I can think of to fix it:

1. Change is_exit_condition to only use the num_objs parameters during reads when a time limit was set. This will make sure that the bandwidth computed is correct but may allow fs_test to run for longer than the user wants if the reads are really slow.
2. Change state->objs_written to state->objs_traversed and then update it on reads as it is updated on writes.
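A rough sketch of what the second option could look like. Only the names `objs_traversed` and `is_exit_condition` come from this thread. The struct layout, the `sketch_` helpers, and the "0 means no time limit" convention are assumptions for illustration and do not reflect the actual fs_test.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fs_test state struct. */
struct sketch_state {
    size_t objs_traversed;  /* objects written OR read in the current phase */
    size_t obj_size;        /* bytes per object */
};

/* Reset the counter at the start of each write or read phase. */
static void sketch_begin_phase(struct sketch_state *s) {
    s->objs_traversed = 0;
}

/* Count every completed object, for reads as well as writes. */
static void sketch_object_done(struct sketch_state *s) {
    s->objs_traversed++;
}

/* Stop when the object budget is used up or the time limit has passed.
 * In this sketch a time_limit of 0 means "no time limit". */
static int sketch_is_exit_condition(const struct sketch_state *s,
                                    size_t num_objs,
                                    double elapsed, double time_limit) {
    if (s->objs_traversed >= num_objs) return 1;
    if (time_limit > 0.0 && elapsed >= time_limit) return 1;
    return 0;
}

/* Bandwidth in bytes/s from what this phase actually moved, so a read
 * phase cut short by the time limit no longer reports the written amount. */
static double sketch_bandwidth(const struct sketch_state *s, double elapsed) {
    assert(elapsed > 0.0);
    return (double)s->objs_traversed * (double)s->obj_size / elapsed;
}
```

The point of the design is that both phases feed the same counter, so one bandwidth routine serves writes and reads and the read result reflects only the objects that were actually read.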

Two other things:

1. Can whoever makes this change please remove the ugly empty else statement at fs_test.c:2207?
2. The link at http://institute.lanl.gov/data/software/ points to the old SourceForge page for fs_test, which now returns a 404.

Thanks,

John

On 7/18/13, Brett Kettering notifications@github.com wrote:

Reply to this email directly or view it on GitHub: https://github.com/fs-test/fs_test/issues/3
