Closed alexkalish closed 4 years ago
An overview of the is located in Confluence
The original test was conducted using JMeter. The pre-FIPS JMeter test is here: JMeter Script
When we discussed the performance testing needed for the OpenSSL change, there were two main things we were asked to focus on:
These are reflected in the tests specified on the DAP performance Confluence page.
In addition:
@rrefael: To be clear, we will only be testing Conjur OSS, so followers, UI and Synchronizer are all out of our current scope.
Note: These tests were run locally, so other applications running in the background may have interfered with the stats.
dap-intro tagged with 11.4
dap-intro tagged with 5.0-stable
@h-artzi What are `Set BaseDate` and `Set DateNow`?
@micahlee, they are both timestamps. When the time difference between the two becomes too large, the hosts reauthenticate with the DAP instance. I decided to keep this feature from the JMeter script attached to this ticket; however, it is most likely overkill for the current test.
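A minimal sketch of that re-authentication check, written in Python rather than JMeter for readability. The names `needs_reauth` and `MAX_TOKEN_AGE`, and the five-minute threshold, are assumptions for illustration; the actual threshold used by the JMeter script is not stated in this thread.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: the value used by the real JMeter script is not
# given in this thread, so five minutes is only a placeholder.
MAX_TOKEN_AGE = timedelta(minutes=5)


def needs_reauth(base_date, date_now, max_age=MAX_TOKEN_AGE):
    """Return True when the gap between the two timestamps exceeds max_age.

    base_date: when the host last authenticated (the "BaseDate" timestamp).
    date_now:  the current time (the "DateNow" timestamp).
    """
    return (date_now - base_date) > max_age


if __name__ == "__main__":
    last_auth = datetime(2020, 1, 1, 12, 0, 0)
    print(needs_reauth(last_auth, datetime(2020, 1, 1, 12, 2, 0)))
    print(needs_reauth(last_auth, datetime(2020, 1, 1, 12, 9, 0)))
```

For a short run like the current test, the check would rarely trigger, which matches the "overkill" remark above.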
Hi @alexkalish, as @rrefael was saying, the requirements are specified in the Confluence page and should be followed as we discussed a while ago. It doesn't matter whether we are testing Conjur or DAP; we need to confirm that performance is the same and that the load can be handled. Running the same tests will also help us with OpenSSL: if we find a gap in OpenSSL performance, we will easily be able to tell whether the degradation came from the Rails upgrade or from the OpenSSL changes.
In general, we should always aspire to run performance tests before release, especially after a big change like Rails.
@hilagross: Agreed that performance testing is needed! What I'm hearing is that you do not have a strong preference for testing OSS vs DAP, as long as we confirm no performance regressions. Additionally, I'm assuming that you have no objections/concerns with test details in the description. Are those statements correct? Thanks.
@alexkalish The rails upgrade is a feature that may hold performance degradations, which may occur in OSS and DAP. Since DAP enables more features and use cases than OSS, I think that the verifications in DAP should at least include verifying this delta. Not verifying DAP at all seems to me like a risk.
@rrefael, I understand your concern about introducing performance issues to DAP with the FIPS compliance work. When I took a close look at the Daily average build time for `cyberark/conjur`, I noticed a slowdown when the Rails 5 code was merged. Below is the stage level view:
I'm going to open an issue to address the slowdown (which appears to affect nearly all the tests). We saw a very small slowdown in early runs of the JMeter load test (workflow captured above), but nothing like the slowdown we're seeing above.
As a note, we used the DAP appliance (11.4 vs latest stable). We also run a multi-day load test as part of the release process to find issues that result from long tasks. We'll
We're planning to continue expanding our load testing scope, but have very limited capacity (just a single engineer focused on finishing the Rails 5 work).
As a short term plan, let's use the data from Jenkins, in addition to the various load tests, to see if we're introducing any major performance issues with the FIPS work.
@h-artzi, to wrap this up, can you please create a Sharepoint spreadsheet with the Rails 4 and Rails 5 test results? Please calculate the differences between the metrics generated in each run. We want to understand percentage change.
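For the spreadsheet, the percentage change per metric could be computed along these lines. This is a minimal sketch in Python; the function names, the metric keys, and the sample numbers are hypothetical placeholders, not results from this issue.

```python
def percent_change(before, after):
    """Percentage change from `before` to `after`.

    Returns None when `before` is zero, because the change is undefined.
    """
    if before == 0:
        return None
    return (after - before) * 100.0 / before


def compare_runs(rails4, rails5):
    """Compare two runs metric by metric.

    Both arguments map a metric name to its value; only metrics present
    in both runs are compared.
    """
    return {
        name: percent_change(rails4[name], rails5[name])
        for name in rails4
        if name in rails5
    }


if __name__ == "__main__":
    # Placeholder values for illustration only, not measurements.
    rails4 = {"average_ms": 100.0, "max_ms": 200.0}
    rails5 = {"average_ms": 103.0, "max_ms": 190.0}
    for metric, change in compare_runs(rails4, rails5).items():
        print(f"{metric}: {change:+.2f}%")
```

A positive value means the metric grew after the upgrade (slower, for latency metrics); a negative value means it shrank.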
@jvanderhoof @h-artzi: Was this just on Hadar's laptop? Do we think that is a controlled enough environment? Also, our customers will be running on Linux. Could that OS difference have any impact on the results?
@alexkalish it is currently being run on my laptop, and that may be introducing some error. For example, there is a significant jump while loading one of the policies.
An overview of our load test comparison:
Test case: Batch retrieval of four variables, executed 9200 times via JMeter. This was run on a developer laptop.
Results (comparison between pre-upgrade and post-upgrade DAP master):
| | Average | Min | Max | 90th pct | 96th pct | 99th pct |
|---|---|---|---|---|---|---|
| Batch Retrieve Secrets | 1.03% | 0.00% | -19.15% | 0.00% | -1.33% | 0.00% |
Conclusion: No noticeable performance change between Rails 4 and Rails 5.
There is some concern that upgrading Rails and disabling MD5 could have an adverse effect on Conjur performance. While I'm not terribly worried, verification is absolutely warranted. The IL team has created a set of performance tests that we should be able to easily leverage. Details to come soon.
The load test should execute the following scenario:

Setup
- Load the following policies (from `conjurdemos/dap-intro`):
  - `policy/users.yml` (into the `root` namespace, using replace)
  - `policy/policy.yml` (into the `root` namespace using append)
  - `policy/apps/myapp.yml` (into the `staging` namespace using append)
  - `policy/apps/myapp.yml` (into the `production` namespace using append)
  - `policy/application_grants.yml` (into the `root` namespace using append)
  - `policy/hosts.yml` (into the `root` namespace using append)
- Save the `test-host-1` API key for future use
- Set values for the following variables:
  - `production/myapp/database/username`
  - `production/myapp/database/password`
  - `production/myapp/database/url`
  - `production/myapp/database/port`

Load Test

Perform the following actions:
- Authenticate using the `test-host-1` API key and retrieve a token
- Retrieve the `production/myapp/database` credentials using batch retrieval

Please run the test against the `11.4` DAP appliance, then again on the latest DAP build, posting results back to this issue.

We'll run this test in the future, so please make sure the JMeter script is checked into a repository.
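
As a rough illustration of the two load-test actions, the sketch below builds the authenticate URL, the batch retrieval URL, and the token header in Python, without sending any requests. The endpoint shapes follow my understanding of the Conjur REST API and should be checked against the API documentation. The account name `demo`, the base URL, and the login `host/test-host-1` (which assumes the host is defined directly in the `root` policy) are assumptions, as are all helper names.

```python
import base64
from urllib.parse import quote

# Assumptions for illustration: account name, base URL, and host login.
ACCOUNT = "demo"
BASE_URL = "https://dap.example"
HOST_LOGIN = "host/test-host-1"

VARIABLE_PATHS = [
    "production/myapp/database/username",
    "production/myapp/database/password",
    "production/myapp/database/url",
    "production/myapp/database/port",
]


def authenticate_url(base_url, account, login):
    """URL to POST the API key to; the login is URL-encoded (slash -> %2F)."""
    return f"{base_url}/authn/{quote(account, safe='')}/{quote(login, safe='')}/authenticate"


def batch_retrieval_url(base_url, account, variable_paths):
    """URL for fetching several variables in one GET request."""
    ids = ",".join(
        quote(f"{account}:variable:{path}", safe=":") for path in variable_paths
    )
    return f"{base_url}/secrets?variable_ids={ids}"


def authorization_header(raw_token):
    """Header carrying the Base64-encoded access token (raw_token is bytes)."""
    encoded = base64.b64encode(raw_token).decode("ascii")
    return {"Authorization": f'Token token="{encoded}"'}


if __name__ == "__main__":
    print(authenticate_url(BASE_URL, ACCOUNT, HOST_LOGIN))
    print(batch_retrieval_url(BASE_URL, ACCOUNT, VARIABLE_PATHS))
```

The JMeter script would issue the equivalent HTTP requests; keeping URL construction in small functions like these makes it easy to confirm that both the `11.4` run and the latest-build run hit identical endpoints.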