Closed parasjain1 closed 1 year ago
[Triage] Hi @parasjain1, thank you for filing this issue. If you would be interested in opening a PR this may be the fastest way to correct the behavior you documented. Please reach out to @cwperks with any additional context.
This might not be a straightforward if-else fix as sendRequestDecorate
doesn't have direct access to determine whether request is being sent to same node. I'm going to look dive deep further and update my findings here.
Serializing:
Background: Transport service uses direct
channelType via localNodeConnection object upon determination that the current DiscoveryNode is a local node when trying to open a connection via openConnection method.
When the request flow reaches sendRequestDecorate
, it doesn't have access to channelType or isLocalNode to identify whether the request is going to be handled by same node. Need to take a deeper look into Connection.java to understand how this value can be made accessible to SecurityInterceptor.
Deserializing:
Since messageReceivedDecorate
has access to TransportChannel object and already has a check in place for direct
channelType we can just remove the call to deserialize userHeader and originalRemoteAddress
Update:
One possible solution to solve the serializing approach is to compare the ephemeralIds of localNode and connection Node. This is derived for DiscoveryNode.java's equals() method to compare two nodes.
I will raise a PR for this.
HeaderHelper.isDirectRequest() has the logic to determine if the current request is a direct (local node) request. It relies on a header OPENDISTRO_SECURITY_CHANNEL_TYPE
in ThreadContext. IMO it might solve the issue of identifying whether the request is direct.
@parasjain1
OPENDISTRO_SECURITY_CHANNEL_TYPE header is populated based on transport channel when message is received. Is the header populated while sending request as well?
While sending message, given the interceptors may run before the transport channel is created, we may not know the channel type when security interceptor is invoked. is it not the case?
Checked this, and this indeed is the flow. The method name HeaderHelper.isDirectRequest() lead me to believe that this would work anywhere in the security plugin flow. Thanks for pointing out @mgodwan.
@parasjain1 @mgodwan I have raised a PR for this fix. Please review: #2765
Post fix observations:
async-profiler
package: https://github.com/async-profiler/async-profilerSetup:
2.8
in OpenSearch and Security repos (this branch doesn’t contain #2765 related changes)Steps:
ps aux | grep ‘opensearch’
/profiler.sh -e wall -t -f without-changes.html -d 20 -i 50us -I 'Security.' -I 'java/' -X 'Unsafe.park' 60626
python3 buk-index.py
Results:
without-changes.html
sendMessageDecorate
takes .18% CPU (.16% is on the thread shown above)
messageReceivedDecorate
takes 1.12% of the CPU
Setup:
Steps:
ps aux | grep ‘opensearch’
/profiler.sh -e wall -t -f with-changes.html -d 20 -i 50us -I 'Security.' -I 'java/' -X 'Unsafe.park' 60626
python3 buk-index.py
Results:
with-changes.html
sendMessageDecorate
takes .04% CPU (.03% is on the thread shown above)
messageReceivedDecorate
consumes 1.02% CPU
We can infer that for smaller number of bulk requests (100k) we see a decrease of .15% of CPU usage. Although not highly visible this number would rise when number of requests increase significantly.
A CI bug was introduced in the original PR: #2765 which caused flaky CI. This fix was addressed via: https://github.com/opensearch-project/security/pull/3066
What is the issue? During a direct channel request i.e. a request which originated and is handled by the same node, the security plugin kicks in and deserializes
userHeader
andoriginalRemoteAddress
objects again hence consuming more time to process the request.How can one reproduce the bug? A simple bulk request on a multi node cluster should reproduce the issue.
What is the expected behavior? Do not deserialize the objects again for direct channel requests, and re-use the previously deserialized values.
What is your host/environment?
Do you have any additional context?
SecurityIntercepter.sendMessageDecorate
stashes the current thread context due to which the transient headers are cleared.SecurityRequestHandler.messageReceivedDecorate
deserializes the userHeader and originalRemoteAddress objects just to put them in thread context's transient headers.For a direct channel request (channelType == "direct"),
SecurityIntercepter.sendMessageDecorate
may retain these values in transient headers andSecurityRequestHandler.messageReceivedDecorate
will no longer need to deserialize these again.