cisco / libsrtp

Library for SRTP (Secure Realtime Transport Protocol)

Add x86 SIMD optimizations to crypto datatypes #507

Closed Lastique closed 1 year ago

Lastique commented 4 years ago

The SIMD code uses intrinsics, which are available in all modern compilers. For MSVC, config_in_cmake.h is modified to define gcc/clang-style SSE macros based on MSVC's predefined macros. All SSE versions are enabled when MSVC indicates that AVX is enabled. SSE2 is always enabled for x86-64, and for 32-bit x86 when SSE2 FP math is enabled.
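As a rough illustration of the macro mapping described above, here is a minimal sketch. It is not the actual contents of config_in_cmake.h: the exact set of macros defined there is an assumption, and the `sketch_have_*` helpers are hypothetical names added only so the result can be inspected. `_MSC_VER`, `_M_X64`, `_M_IX86_FP` and `__AVX__` are real MSVC predefined macros.

```c
/* Hypothetical sketch of mapping MSVC predefined macros to
 * gcc/clang-style SSE macros; not the real config_in_cmake.h. */
#if defined(_MSC_VER) && !defined(__clang__)
/* AVX enabled (/arch:AVX or higher): turn on every SSE level. */
#if defined(__AVX__)
#ifndef __SSE4_2__
#define __SSE4_2__ 1
#endif
#ifndef __SSE4_1__
#define __SSE4_1__ 1
#endif
#ifndef __SSSE3__
#define __SSSE3__ 1
#endif
#ifndef __SSE3__
#define __SSE3__ 1
#endif
#endif
/* SSE2 is baseline on x86-64; on 32-bit x86, _M_IX86_FP >= 2
 * means SSE2 is used for floating-point math. */
#if defined(__AVX__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#ifndef __SSE2__
#define __SSE2__ 1
#endif
#endif
#endif /* _MSC_VER && !__clang__ */

/* Hypothetical helpers that report what the macros resolved to. */
static int sketch_have_sse2(void)
{
#if defined(__SSE2__)
    return 1;
#else
    return 0;
#endif
}

static int sketch_have_sse42(void)
{
#if defined(__SSE4_2__)
    return 1;
#else
    return 0;
#endif
}
```

With gcc or clang the first block is skipped and the compiler's own macros are used, so the SIMD code paths can test `__SSE2__` and friends the same way on every compiler.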

pabuhler commented 4 years ago

Hi, @Lastique thanks for these.

I guess these changes are based on real world usage, do you have any benchmarks on the speed improvements for a given configuration & platform?

Lastique commented 4 years ago

I have attached a unit benchmark to https://github.com/cisco/libsrtp/pull/508, but I don't have one at hand for the changes in this PR. I did benchmark the change when I originally wrote this patch for our in-house use a few years back, but unfortunately I didn't save the results. I suppose I could write a new one if required.

Basically, these two PRs came out of our attempts to reduce libsrtp entries in the profiling reports for our WebRTC SFU software. The software maintains lots of SRTP connections (hundreds), and these changes allowed us to save a few percent of total CPU usage. The most CPU-consuming part covered by this PR that came up in our reports is bitvector_left_shift; the other places are mostly where I saw an easy opportunity to optimize.
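To give an idea of the kind of optimization involved, below is a minimal sketch of a bit-vector shift with an SSE2 path and a scalar fallback. It is not the code from this PR: the function names, the signature and the data layout (an array of 32-bit words, with the shift moving bits toward lower indices and zero-filling the top) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Scalar path (hypothetical): handles words [start, n) in place. */
static void sketch_shift_scalar(uint32_t *w, size_t n, size_t start,
                                size_t base, unsigned bits)
{
    size_t i = start;
    for (; i + base < n; i++) {
        uint32_t lo = w[i + base] >> bits;
        uint32_t hi = 0;
        /* Guard bits == 0: a 32-bit shift by 32 is undefined in C. */
        if (bits != 0 && i + base + 1 < n)
            hi = w[i + base + 1] << (32 - bits);
        w[i] = lo | hi;
    }
    for (; i < n; i++)
        w[i] = 0; /* vacated words are zero-filled */
}

/* Hypothetical SIMD-accelerated shift of an n-word bit vector. */
void sketch_bitvector_left_shift(uint32_t *w, size_t n, unsigned shift)
{
    size_t base = shift / 32;   /* whole words to skip */
    unsigned bits = shift % 32; /* remaining bit offset */
    size_t i = 0;

    if (base >= n) {
        memset(w, 0, n * sizeof(*w));
        return;
    }
#if defined(__SSE2__)
    /* SSE2 shifts yield zero for counts above 31, so bits == 0
     * (left count of 32) needs no special case here. */
    const __m128i rcount = _mm_cvtsi32_si128((int)bits);
    const __m128i lcount = _mm_cvtsi32_si128((int)(32 - bits));
    /* Four output words per iteration; needs five readable input words. */
    for (; i + base + 5 <= n; i += 4) {
        __m128i lo = _mm_loadu_si128((const __m128i *)(w + i + base));
        __m128i hi = _mm_loadu_si128((const __m128i *)(w + i + base + 1));
        lo = _mm_srl_epi32(lo, rcount);
        hi = _mm_sll_epi32(hi, lcount);
        _mm_storeu_si128((__m128i *)(w + i), _mm_or_si128(lo, hi));
    }
#endif
    sketch_shift_scalar(w, n, i, base, bits); /* tail, or everything */
}
```

Both source vectors are loaded before the store, and reads always come from indices at or above the write position, so the shift can be done in place. Any gain from such a change would depend on the vector length and the platform and would need to be measured.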

Lastique commented 2 years ago

Rebased on top of the current master.

pabuhler commented 1 year ago

@Lastique thanks for updating, will take a new look at this now

pabuhler commented 1 year ago

Thanks for the PR, @Lastique, and sorry it took so long to get in.

Lastique commented 1 year ago

Thanks.