Open KelvonLi opened 3 years ago
Hi @KelvonLi,
The example md5_mb_over_4GB_test.c is not meant as a performance test; in fact, the multi-buffer part does a lot more work than the single-buffer check. It processes TEST_BUFS times as much data as the single-buffer part by running multiple jobs. At the end, you may notice that it checks each of the final digests created in the multi-buffer part against the one single-buffer result. I suggest you start with one of the included performance tests instead.
Hi @gbtucker ,
Thanks a lot for your reply. I'm testing with a single buffer and also with multiple buffers. I have a few simple questions:
1. Are the md5_ctx_mgr_flush/md5_ctx_mgr_submit APIs thread safe?
2. To get the final MD5 value of multiple buffers treated as one logical buffer, do I have to use a single ctx and submit the buffers one by one? I didn't find a way to use multiple ctxs (lanes) to compute in parallel and produce one final MD5 value.
My current understanding is that each ctx (lane) can compute only one MD5 at a time and should NOT be reused until that job has completed, while multiple ctxs (lanes) can run in parallel, each on a different MD5 calculation. Please correct me if I'm wrong. Thanks a lot.
Hi @KelvonLi,
For 1, all the functions are thread safe and reentrant. I would suggest one ctx per thread; take a look at the examples in examples/saturation_test for how to do this.
For 2, the lanes must have independent hash jobs to run in parallel. Because these are cryptographic hashes, there is no way to break up one hash job and run pieces concurrently beyond the fundamental block size.
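To illustrate the point in answer 2, here is a minimal sketch of why the pieces of one job cannot be hashed independently. It uses a toy chained hash (64-bit FNV-1a) as a stand-in, not MD5 and not the isa-l_crypto API; all names (`toy_update`, `toy_chained`, `toy_second_piece_alone`) are hypothetical. The only property it shares with MD5 is the one that matters here: each piece starts from the state left by the previous piece.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy chained hash (64-bit FNV-1a). This is NOT MD5; it only models the
 * dependency between consecutive pieces of a single message. */
#define TOY_INIT  0xcbf29ce484222325ull
#define TOY_PRIME 0x100000001b3ull

/* Continue hashing from `state` over `len` bytes; return the new state. */
static uint64_t toy_update(uint64_t state, const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        state ^= data[i];
        state *= TOY_PRIME;
    }
    return state;
}

/* One job fed in two pieces: the second piece needs the first piece's state,
 * so the two calls must run one after the other. */
static uint64_t toy_chained(const unsigned char *msg, size_t len, size_t split)
{
    uint64_t state = toy_update(TOY_INIT, msg, split);
    return toy_update(state, msg + split, len - split);
}

/* The second piece hashed on its own, as a separate lane would have to do.
 * It starts from TOY_INIT instead of the first piece's state, so its result
 * is the hash of a different, shorter message. */
static uint64_t toy_second_piece_alone(const unsigned char *msg, size_t len, size_t split)
{
    return toy_update(TOY_INIT, msg + split, len - split);
}
```

Feeding the message in pieces through `toy_chained` gives the same value as hashing it in one call, while `toy_second_piece_alone` does not, because it never saw the state produced by the first piece. Independent messages have no such dependency, which is why separate jobs can occupy separate lanes.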
Hi @gbtucker, Thanks a ton for your replies and for sharing! I'll do some further study and testing.
Hi,
I'm testing md5_mb performance to see whether it also performs much better than OpenSSL when running with many buffers.
I changed the test code as shown below, built it, and ran it. The performance turned out to be worse than OpenSSL's on both of my test CPU platforms. Have you run a similar test, and is this expected? How can I improve its performance? Thanks a lot!
#Test result:
/workspace/isa-l_crypto/tests/extended # ./md5_mb_over_4GB_test
md5_large_test
md5_openssl: runtime = 22236247 usecs, bandwidth 8 MB in 22.2362 sec = 0.38 MB/s
Starting updates
md5_ctx_mgr: runtime = 52901056 usecs, bandwidth 8 MB in 52.9011 sec = 0.16 MB/s
# Test code change
/workspace/isa-l_crypto/tests/extended # git diff md5_mb_over_4GB_test.c
 #include "md5_mb.h"
 #include "endian_helper.h"
 #include <openssl/md5.h>
+#include "test.h"
+
 #define TEST_LEN (1024*1024ull) //1M
 #define TEST_BUFS MD5_MIN_LANES
+//#define TEST_BUFS MD5_MAX_LANES
 #define ROTATION_TIMES 10000 //total length processing = TEST_LEN * ROTATION_TIMES
 #define UPDATE_SIZE (13*MD5_BLOCK_SIZE)
 #define LEN_TOTAL (TEST_LEN * ROTATION_TIMES)
@@ -54,6 +57,7 @@ int main(void)
 uint32_t i, j, k, fail = 0;
 unsigned char *bufs[TEST_BUFS];
 struct user_data udata[TEST_BUFS];
+struct perf start, stop;
@@ -72,11 +76,17 @@ int main(void)
 }
+perf_print(stop, start, (long long)TEST_LEN * TEST_BUFS * 1);
@@ -86,6 +96,7 @@ int main(void)
 }
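The "8 MB" in both result lines comes from the size passed to perf_print, TEST_LEN * TEST_BUFS * 1, not from the amount of data actually hashed. The sketch below redoes the arithmetic under assumptions that the truncated diff cannot confirm: that each timed region covers the whole corresponding loop, that the OpenSSL part hashes LEN_TOTAL bytes once, that the multi-buffer part hashes LEN_TOTAL bytes in each of TEST_BUFS jobs, and that TEST_BUFS is 8 (inferred from the "8 MB" printed with TEST_LEN of 1 MB). The helper names are hypothetical.

```c
#include <stdint.h>

/* Values copied from the test source shown in the diff. */
#define TEST_LEN       (1024 * 1024ull)
#define ROTATION_TIMES 10000
#define LEN_TOTAL      (TEST_LEN * ROTATION_TIMES)

/* Assumption: TEST_BUFS (MD5_MIN_LANES) is 8, inferred from the "8 MB"
 * that perf_print reported for TEST_LEN * TEST_BUFS * 1. */
#define ASSUMED_TEST_BUFS 8

/* Bandwidth in MB/s for `bytes` processed in `usecs` microseconds. */
static double mb_per_sec(uint64_t bytes, uint64_t usecs)
{
    double mb = (double)bytes / (1024.0 * 1024.0);
    return mb / ((double)usecs / 1e6);
}

/* Assumption: the single-buffer (OpenSSL) check hashes LEN_TOTAL bytes once. */
static uint64_t single_buffer_bytes(void)
{
    return LEN_TOTAL;
}

/* Assumption: the multi-buffer part hashes LEN_TOTAL bytes per job,
 * with one job per buffer. */
static uint64_t multi_buffer_bytes(void)
{
    return ASSUMED_TEST_BUFS * LEN_TOTAL;
}
```

Calling `mb_per_sec(single_buffer_bytes(), 22236247)` and `mb_per_sec(multi_buffer_bytes(), 52901056)` with the runtimes from the output above gives on the order of 450 MB/s and 1500 MB/s respectively. If the assumptions hold, the multi-buffer run took longer only because it hashed about eight times as much data.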