opensearch-project / security

🔐 Secure your cluster with TLS, numerous authentication backends, data masking, audit logging as well as role-based access control on indices, documents, and fields
https://opensearch.org/docs/latest/security-plugin/index/
Apache License 2.0

[Performance Bug] Repeated deserialization of headers for a direct channel request #2724

Closed parasjain1 closed 1 year ago

parasjain1 commented 1 year ago

What is the issue? During a direct channel request, i.e. a request that originates on and is handled by the same node, the security plugin still kicks in and deserializes the userHeader and originalRemoteAddress objects again, consuming extra time to process the request.

How can one reproduce the bug? A simple bulk request on a multi-node cluster should reproduce the issue.

What is the expected behavior? Do not deserialize the objects again for direct channel requests, and re-use the previously deserialized values.

What is your host/environment?

Do you have any additional context? SecurityInterceptor.sendMessageDecorate stashes the current thread context, which clears the transient headers. SecurityRequestHandler.messageReceivedDecorate then deserializes the userHeader and originalRemoteAddress objects just to put them back into the thread context's transient headers.

For a direct channel request (channelType == "direct"), SecurityInterceptor.sendMessageDecorate could retain these values in the transient headers, and SecurityRequestHandler.messageReceivedDecorate would then no longer need to deserialize them.
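
To make the overhead concrete, here is a rough illustrative sketch of what the receive side ends up doing today, even for a same-node request. The ConfigConstants keys and Base64Helper calls mirror names used in the plugin, but this is a simplified fragment, not the actual SecurityRequestHandler code:

```java
// Illustrative fragment: even for a direct (same-node) request, the user and
// remote address are rebuilt from Base64-encoded string headers, although fully
// deserialized copies existed in the sender's thread context moments earlier.
final User user = (User) Base64Helper.deserializeObject(
        threadContext.getHeader(ConfigConstants.OPENDISTRO_SECURITY_USER_HEADER));
threadContext.putTransient(ConfigConstants.OPENDISTRO_SECURITY_USER, user);

final TransportAddress remoteAddress = (TransportAddress) Base64Helper.deserializeObject(
        threadContext.getHeader(ConfigConstants.OPENDISTRO_SECURITY_REMOTE_ADDRESS_HEADER));
threadContext.putTransient(ConfigConstants.OPENDISTRO_SECURITY_REMOTE_ADDRESS, remoteAddress);
```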

stephen-crawford commented 1 year ago

[Triage] Hi @parasjain1, thank you for filing this issue. If you would be interested in opening a PR, that may be the fastest way to correct the behavior you documented. Please reach out to @cwperks with any additional context.

DarshitChanpura commented 1 year ago

This might not be a straightforward if-else fix, as sendRequestDecorate doesn't have direct access to information that would determine whether the request is being sent to the same node. I'm going to dive deeper and update my findings here.

Serializing:

Background: when opening a connection via the openConnection method, the transport service determines whether the target DiscoveryNode is the local node and, if so, uses the localNodeConnection object, which carries the direct channelType.
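
For reference, the local-node shortcut described above has roughly this shape (paraphrased; field and method names are simplified and not the exact OpenSearch core source):

```java
// Paraphrased sketch: the transport service hands back a shared local connection
// for same-node targets; its channel reports getChannelType() == "direct" and the
// request is dispatched in-process instead of being serialized over the wire.
public void openConnection(DiscoveryNode node, ConnectionProfile profile,
                           ActionListener<Transport.Connection> listener) {
    if (node.equals(localNode)) {
        listener.onResponse(localNodeConnection);
    } else {
        transport.openConnection(node, profile, listener);
    }
}
```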

When the request flow reaches sendRequestDecorate, it doesn't have access to channelType or isLocalNode to identify whether the request will be handled by the same node. I need to take a deeper look into Connection.java to understand how this value can be made accessible to SecurityInterceptor.

Deserializing:

Since messageReceivedDecorate has access to the TransportChannel object and already has a check in place for the direct channelType, we can simply drop the calls that deserialize userHeader and originalRemoteAddress.
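
A sketch of what that guard could look like (illustrative only; the constant name follows the plugin, but the exact wiring is an assumption rather than the final diff):

```java
// Sketch: inside messageReceivedDecorate, reuse what the sender left behind
// when the request never crossed the wire.
final boolean reuseTransients = "direct".equals(transportChannel.getChannelType())
        && threadContext.getTransient(ConfigConstants.OPENDISTRO_SECURITY_USER) != null;

if (!reuseTransients) {
    // Cross-node request (or missing transients): fall back to the existing
    // Base64Helper.deserializeObject(...) path for userHeader and originalRemoteAddress.
}
// Same-node request with surviving transients: nothing to deserialize.
```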

DarshitChanpura commented 1 year ago

Update:

One possible way to handle the serializing side is to compare the ephemeralIds of the local node and the connection's node. This is derived from DiscoveryNode.java's equals() method, which is used to compare two nodes.
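
A sketch of that comparison on the send side (the helper name and the way localNode and connection are obtained are assumptions for illustration):

```java
// Sketch: decide on the send side whether the request stays on this node by
// comparing ephemeral IDs, mirroring what DiscoveryNode.equals() relies on.
private boolean isSameNodeRequest(DiscoveryNode localNode, Transport.Connection connection) {
    return localNode.getEphemeralId().equals(connection.getNode().getEphemeralId());
}

// In sendRequestDecorate (sketch): keep the already-deserialized transients when
// this returns true, otherwise serialize them into headers as before.
```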

I will raise a PR for this.

parasjain1 commented 1 year ago

HeaderHelper.isDirectRequest() has the logic to determine whether the current request is a direct (local-node) request. It relies on the OPENDISTRO_SECURITY_CHANNEL_TYPE header in the ThreadContext. IMO it might solve the problem of identifying whether the request is direct.
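
For context, that helper boils down to a check on the channel-type value stored in the ThreadContext, roughly of this shape (paraphrased, not the exact source):

```java
// Paraphrased sketch of HeaderHelper.isDirectRequest(): it can only give a
// meaningful answer once OPENDISTRO_SECURITY_CHANNEL_TYPE has been written to
// the context, which (as noted below) happens on the receive side.
public static boolean isDirectRequest(final ThreadContext context) {
    return "direct".equals(context.getTransient(ConfigConstants.OPENDISTRO_SECURITY_CHANNEL_TYPE))
            || context.getTransient(ConfigConstants.OPENDISTRO_SECURITY_CHANNEL_TYPE) == null;
}
```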

mgodwan commented 1 year ago

@parasjain1

  1. The OPENDISTRO_SECURITY_CHANNEL_TYPE header is populated based on the transport channel when a message is received. Is the header populated while sending a request as well?

  2. While sending a message, given that the interceptors may run before the transport channel is created, we may not know the channel type when the security interceptor is invoked. Is that not the case?

parasjain1 commented 1 year ago

Checked this, and this is indeed the flow. The method name HeaderHelper.isDirectRequest() led me to believe that it would work anywhere in the security plugin flow. Thanks for pointing this out, @mgodwan.

DarshitChanpura commented 1 year ago

@parasjain1 @mgodwan I have raised a PR for this fix. Please review: #2765

DarshitChanpura commented 1 year ago

Post-fix observations:


Profiling Security Transport Interceptors

Prerequisites:

Pre-fix (before https://github.com/opensearch-project/security/pull/2765):

Setup:

Steps:

Results:

[profiler screenshot]

sendMessageDecorate takes 0.18% CPU (0.16% is on the thread shown above)

[profiler screenshot]

messageReceivedDecorate takes 1.12% of the CPU

Post-fix (after https://github.com/opensearch-project/security/pull/2765):

Setup:

Steps:

Results:

[profiler screenshot]

sendMessageDecorate takes 0.04% CPU (0.03% is on the thread shown above)

[profiler screenshot]

messageReceivedDecorate consumes 1.02% CPU

Conclusion:

We can infer that for a relatively small number of bulk-indexed documents (100k) we see a decrease of 0.15% in CPU usage. Although not highly visible, this number should rise as the number of requests increases significantly.


Appendix:

bulk-index.py

```python
from opensearchpy import OpenSearch
from opensearchpy import helpers
import random as r
import requests
import json
import base64

proto = 'https'
host = 'localhost'
port = 9200
auth = ('admin', 'admin')  # For testing only. Don't store credentials in code.

# Create the client with SSL/TLS enabled, but hostname verification disabled.
client = OpenSearch(
    hosts = [{'host': host, 'port': port}],
    http_compress = True,  # enables gzip compression for request bodies
    http_auth = auth,
    use_ssl = True,
    verify_certs = False,
    ssl_assert_hostname = False,
    ssl_show_warn = False
)

# Create an index with non-default settings.
index_name = 'movies'
docs_to_index = 100000

# Uncomment to create many indices
# for i in range(100):
#     response = client.indices.create(f'movies-{i}')
#     print('\nCreating index:')
#     print(response)
#     document = {
#         'title': 'Moneyball',
#         'director': 'Bennett Miller',
#         'year': '2011'
#     }
#     response = client.index(
#         index = f'movies-{i}',
#         body = document,
#         id = '1',
#         refresh = True
#     )

title_choices = ['Titanic', 'Jurassic Park', 'Star Wars']
year_choices = ['2013', '1992', '2023', '2001', '1985']
lorem = [
    'Lorem ipsum dolor sit amet, consectetur adipiscing elit.',
    'Quisque vitae varius ex, eu volutpat orci.',
    'Aenean ullamcorper orci et vulputate fermentum.',
    'Cras erat dui, finibus vel lectus ac, pharetra dictum odio.',
    'Nullam tempus scelerisque purus, sed mattis elit condimentum nec.',
    'Etiam risus sapien, auctor eu volutpat sit amet, porta in nunc.',
    'Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas.',
    'Proin ipsum purus, laoreet quis dictum a, laoreet sed ligula.',
    'Integer ultricies malesuada quam.',
    'Cras vel elit sed mi placerat pharetra eget vel odio.',
    'Duis ac nulla varius diam ultrices rutrum.'
]

basic_auth_token = base64.b64encode('admin:admin'.encode()).decode()
headers = {
    'Content-Type': 'application/json',
    'Authorization': f'Basic {basic_auth_token}'
}

bulk = []
for i in range(docs_to_index):
    document = {
        'title': r.choice(title_choices),
        'year': r.choice(year_choices),
        'description': r.choice(lorem)
    }
    bulk.append({"_index": index_name, "_id": f"{i}", "_source": document})
    if (i+1) % 100 == 0:
        print(f'\nAdding documents {i - 99} - {i}')
        try:
            helpers.bulk(client, bulk)
        except Exception as e:
            print(f"An exception occurred while indexing: {e}")
        bulk = []
```

DarshitChanpura commented 1 year ago

A CI bug introduced in the original PR (#2765) caused flaky CI. This was addressed via https://github.com/opensearch-project/security/pull/3066.