opensearch-project / opensearch-sdk-java

OpenSearch SDK to build and run extensions
Apache License 2.0
28 stars 58 forks source link

[FEATURE] Add support for headers for ExtensionRestRequest #431

Closed owaiskazi19 closed 1 year ago

owaiskazi19 commented 1 year ago

Is your feature request related to a problem?

Currently, there's no provision to fetch headers from ExtensionRestRequest which is required for various features of Anomaly Detector plugin and for the future.

What solution would you like?

Create

  1. header - To get the value of the header
  2. getAllHeaderValues - To get all values for the header
  3. getHeaders - To get all of the headers and values associated with the headers

These methods are similar to what RestRequest currently has.

What alternatives have you considered?

A clear and concise description of any alternative solutions or features you've considered.

Do you have any additional context?

Add any other context or screenshots about the feature request here.

dbwiddis commented 1 year ago

Note the comment in RestSendToExtensionAction where headers will be passed:

// HERE BE DRAGONS - DO NOT INCLUDE HEADERS
// SEE https://github.com/opensearch-project/OpenSearch/issues/4429
new ExtensionRestRequest(method, path, params, contentType, content, requestIssuerIdentity),

We cannot send the headers unfiltered. In particular we need to remove security-sensitive information such as basic authentication (we do not want to send user/password to extensions).

So yes, we need headers. No we can't just copy them blindly from the REST request. Yes this need a lot of eyes on it to make sure we get it right. Tagging @davidlago @peternied @cwperks @scrawfor99 to provide 8 more eyes on this issue.

dbwiddis commented 1 year ago

Directly linking the issue in the quoted comment above for your clicking ease: https://github.com/opensearch-project/OpenSearch/issues/4429

dbwiddis commented 1 year ago

https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication

dbwiddis commented 1 year ago

https://www.baeldung.com/java-httpclient-basic-auth

stephen-crawford commented 1 year ago

Hi @owaiskazi19, thank you for filing this issue. As @dbwiddis mentioned, there is a lot of complexity that comes into play when you are trying to deal with passing any headers to extensions. I am not sure if there have been discussions elsewhere on GitHub about potential solutions but I don't know that I see an obvious choice. Unfortunately, it comes down to restricting information enough but not too much.

One possibility is to allow for a moving definition of what "too much" is. We could make extension installation come with an explanation of the information which the extension uses. Users would then need to agree or decline to accept that use. Unfortunately, this does not really solve the problem of "what is the right amount" but instead punts it down the road and places the burden of judgement on the end user. It is also somewhat counter-productive to a major idea of extensions which seeks to make the installation of extensions lightweight and stress free...

Perhaps you have some thoughts on how you would be interested in seeing these restrictions handled? Your perspective as an extension author would be invaluable.

Thank you.

cwperks commented 1 year ago

This sounds like a CORS problem where the admin of the OpenSearch Cluster needs to be empowered to define what headers are permissible. This may be done with different levels of granularity. Maybe by extension or by route + verb?

Access-Control-Allow-Headers is a CORS header used to reply to a client making a pre-flight OPTIONS request with what headers the server permits: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Headers

CORS example with nginx: https://enable-cors.org/server_nginx.html

Tomcat CORS filter: https://tomcat.apache.org/tomcat-7.0-doc/config/filter.html#CORS_Filter

dbwiddis commented 1 year ago

This sounds like a CORS problem where the admin of the OpenSearch Cluster needs to be empowered to define what headers are permissible.

My first thought was "hey, that sounds like a good idea, let a user configure if they want to pass, say, the Authentication header." But then I realized "all we need is a hacker or dumb cluster admin and we will suddenly expose username and password information to any extension".

I think there is already some built-in header filtering functionality that can be leveraged, but IMO we should always, without user override, strip the RFC 7235 "Authentication" header, if not others.

cwperks commented 1 year ago

@dbwiddis Do we have a full list of headers to block always? Authorization is the header that carries basic auth info and should never be forwarded to an extension. If a user is making a request and misspells Authorization as Authorisation, can we make sure that is not inadvertently forwarded?

owaiskazi19 commented 1 year ago

Thanks @cwperks @dbwiddis and @scrawfor99 for your valuable inputs. This looks like a big fish in the pond. I will go ahead and close my PR on OpenSearch.

dbwiddis commented 1 year ago

@dbwiddis Do we have a full list of headers to block always? Authorization is the header that carries basic auth info and should never be forwarded to an extension. If a user is making a request and misspells Authorization as Authorisation, can we make sure that is not inadvertently forwarded?

Do we want to use Levenshtein Distance as a filter? Do I actually get to implement one of those job interview algorithms? :D

Seems a better/safer idea to explicitly define a list of headers we need/allow (possibly make it a configurable setting). @owaiskazi19 WDYT?

owaiskazi19 commented 1 year ago

Seems a better/safer idea to explicitly define a list of headers we need/allow (possibly make it a configurable setting). @owaiskazi19 WDYT?

Just for my understanding how are we filtering headers currently in OpenSearch/Security?

cwperks commented 1 year ago

Hi @owaiskazi19, I found 2 areas in the security plugin that relate to HTTP Headers:

  1. When audit logging requests, the Authorization header is removed: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/auditlog/impl/AuditMessage.java#L354-L362
  2. REST requests that contain headers starting with _opendistro_security_ are rejected: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/filter/SecurityRestFilter.java#L146-L152
owaiskazi19 commented 1 year ago

Thanks @cwperks. Looks like for OpenSearch also we are filtering with OPENDISTRO_SECURITY_CONFIG_PREFIX. As @dbwiddis mentioned, we can define allowList and denyList for the headers then.

dbwiddis commented 1 year ago

Hey @owaiskazi19 I will implement this, can you give me a list of the headers you need for your AD search detector work?

owaiskazi19 commented 1 year ago

Hey @owaiskazi19 I will implement this, can you give me a list of the headers you need for your AD search detector work?

One such use case I have encountered for SearchDetector is this. Will add more once I start the work for SearchDetector.

dbwiddis commented 1 year ago

@owaiskazi19 I'm still not sure where actual headers are required to be passed from the RestRequest into the ExtensionRestRequest, although I do see some method needs.

The one case you have identified above calls contentOrSourceParamParser() which returns an XContentParser.

That comes from the URI, like this.

Params are already passed so you should be able to fetch that using something like XContentType.fromMediaType(typeParam)

Looking into the entire RestRequest class, the only place I see headers parsed is here:

    public static XContentType parseContentType(List<String> header) {
        if (header == null || header.isEmpty()) {
            return null;
        } else if (header.size() > 1) {
            throw new IllegalArgumentException("only one Content-Type header should be provided");
        }

That is only called from two places. One is the singleton list generated from the param (not headers) above. The other is the constructor which sets the instance field xContentType based on the "Content-Type" header:

        final XContentType xContentType;
        try {
            xContentType = parseContentType(headers.get("Content-Type"));
        } catch (final IllegalArgumentException e) {
            throw new ContentTypeHeaderException(e);
        }

This instance field value is already included in the ExtensionRestRequest created from the original RestRequest, so the result of that header parsing is also available to you.

So for this particular use case, I don't think you need the headers, and this issue shouldn't be a blocker.

Certainly there is room to implement some of these methods on ExtensionRestRequest and I can work on doing that. I've wondered if ExtensionRestRequest should extend RestRequest and this seems like a valid reason for why that should be; I think going that way may be the best way to get you the method you need. WDYT?

dbwiddis commented 1 year ago

I've wondered if ExtensionRestRequest should extend RestRequest

I think this is probably the best path forward. Here are the constructor needs:

    protected RestRequest(
        NamedXContentRegistry xContentRegistry,
        Map<String, String> params,
        String path,
        Map<String, List<String>> headers,
        HttpRequest httpRequest,
        HttpChannel httpChannel
    ) {

We already pass params and path and some parts of the HttpRequest (we could add the rest, with header filtering). We could pass the SDKNamedXContentRegistry as well to populate that field new with every request. And I don't think we need channel.

If we just do a denyList for the headers mentioned above AND an allowlist currently only containing Content-Type but which we can carefully add to, this would be plug-and-play (and reduce the diff even more).

nassipkali commented 1 year ago

Hello @dbwiddis, I would like to take this issue

dbwiddis commented 1 year ago

Hey @nassipkali thanks for stepping up!

When I initially implemented Rest Handling on Extensions I used one class (ExtensionRestRequest) for both the transport communication and for the Rest handlers. This required copying over a bunch of methods, but as this issue points out, we need even more.

So the approach we will be taking is twofold:

  1. Keep the ExtensionRestRequest class as a means of communicating the components of a REST request from Opensearch to the extension
  2. When we receive and deserialize the request, we will instantiate a new RestRequest object using the constructor mentioned in this comment which already has all the needed methods.

So here's a step-by-step:

  1. Modify the ExtensionRestRequest class (on OpenSearch) to include a Map<String, List<String>> headers. You will add an instance field, add a parameter to the constructor, add a getter, and add processing the map to the writeTo() method and the constructor taking a Streaminput Param. Please add a comment in the javadoc saying to not include security sensitive information in the headers!
  2. Update the RestSendToExtensionsAction class (On OpenSearch) where it sends the ExtensionRestRequest. Create a map here to send as a parameter, but do not just copy the RestRequest headers. For now we only want the Content Type header.
  3. On SDK, update the ExtensionsRestRequestHandler class. You will be inserting new code at the top of this method: https://github.com/opensearch-project/opensearch-sdk-java/blob/ded6ed00cd06cb4d710a356d877e31a346925660/src/main/java/org/opensearch/sdk/handlers/ExtensionsRestRequestHandler.java#L51

You will use the getters on the parameter ExtensionRestRequest request to retrieve the information that was sent in steps 1 and 2 above, and you will create a new instance of RestRequest from OpenSearch using this constructor:

    protected RestRequest(
        NamedXContentRegistry xContentRegistry,
        Map<String, String> params,
        String path,
        Map<String, List<String>> headers,
        HttpRequest httpRequest,
        HttpChannel httpChannel
    ) {
  1. The params and path are already being sent, and you will be adding headers in the first steps above, just use the getters on the request.
  2. For the xContentRegistry, you will change the constructor to add a parameter SDKNamedXContentRegistry to save as a private instance field, and populate this parameter in the RestRequest using thegetRegistry()` method on it.
  3. You won't use the httpChannel, so you can leave that null.
  4. That leaves the HttpRequest which is an interface. You'll need to have an implementing class that implements the required methods. I haven't exactly figured out how I want to do this yet, so feel free to try something yourself here.
    • We may want to create a separate implementing class in the SDK, like SDKHttpRequest implements HttpRequest). Currently we have implemented method() and content() (see ExtensionRestRequest). For the other things, you can either return sensible defaults (we don't need strictCookies()) or do further modifications to the serialized info you're sending with the ExtensionRestRequest such as protocol version.

This should be enough info to get started, let me know if you have any questions!

dbwiddis commented 1 year ago

Oh, almost forgot.... after you do the above you'll replace the parameter in this line with the new request you created: https://github.com/opensearch-project/opensearch-sdk-java/blob/ded6ed00cd06cb4d710a356d877e31a346925660/src/main/java/org/opensearch/sdk/handlers/ExtensionsRestRequestHandler.java#L66

And you'll be replacing the handling in HelloWorld example to the new class. One thing that will definitely change is the xContentParser will be simplified to match the way it currently is done in any of the OpenSearch classes extending BaseRestHandler implementing RestHandler.

nassipkali commented 1 year ago

@dbwiddis Is 430 issue connected to this issue? Do we need to sync our work on #430 and #431 issues

dbwiddis commented 1 year ago

@nassipkali no, 430 will be dealing with work in the SDKClient. The only interaction would be in actual REST handler implementation where when processing a RestRequest you used some information in it to send to client.execute().

dbwiddis commented 1 year ago

IF you want do do an initial PR just as far as the first comment (creating the RestRequest object), and a second PR doing the actual SDK implementation (sending that to rest handlers), that would also be welcome.

owaiskazi19 commented 1 year ago

@nassipkali I have opened draft PRs on OpenSearch and SDK for you to refer and get started. The work is not fully finished but the changes will help to get moving and then you can create new PRs or push it to my existing PRs. Anyway you like

nassipkali commented 1 year ago

@dbwiddis @owaiskazi19 how denyList and allowList should be configured? Is it predefined or it should be pass as params or read from config file

nassipkali commented 1 year ago

@dbwiddis I'm stuck on the theoretical part of the problem. I think I'm missing some technical details on how I should filter headers.

dbwiddis commented 1 year ago

@dbwiddis @owaiskazi19 how denyList and allowList should be configured? Is it predefined or it should be pass as params or read from config file

We don't have them yet. They can be defined however you want. For now, and for simplicity, it's probably easier to just leave a TODO comment about implementing them later, and just create a new headers object from scratch, containing only the value of the Content Type header.

@dbwiddis I'm stuck on the theoretical part of the problem. I think I'm missing some technical details on how I should filter headers.

The headers are in a Map<String, List<String>> that you could retrieve via getHeaders(). You could iterate over the entryset and only keep the one matching the key "Content Type".

Alternately, since we are only doing that one header, you could call List<String> getAllHeaderValues(String name) with the content type string as the name, get those header values, and insert it into the new map you are creating.

dbwiddis commented 1 year ago

Optionally, if you wanted an allowlist variable, you could do List.of("Content-Type") and do a denylist List.of("Authorization", "Proxy-Authorization"). Then you'd process the map entries and filter something like this (typing from memory, this may not compile, just a suggestion):

restRequest.getHeaders().entrySet().stream()
    .filter(e -> !denyList.contains(e.getKey())
    .filter(e -> allowList.contains("*") || allowlist.contains(e.getKey()))
    .collect(Collectors.toMap());
nassipkali commented 1 year ago

Optionally, if you wanted an allowlist variable, you could do List.of("Content Type") and do a denylist List.of("Authentication"). Then you'd process the map entries and filter something like this (typing from memory, this may not compile, just a suggestion):

restRequest.getHeaders().entrySet().stream()
    .filter(e -> !denyList.contains(e.getKey())
    .filter(e -> allowList.contains("*") || allowlist.contains(e.getKey()))
    .collect(Collectors.toMap());

Thanks, I will try something like this.

dbwiddis commented 1 year ago

Also note I just looked them up and fixed the actual headers we want to allow/block:

dbwiddis commented 1 year ago

Hey @nassipkali I'm going to take on the 2nd part of this issue (creating the new RestRequest and integrating it into SDK) to take some of the time pressure off you to complete the first part (copying the headers over in the ExtensionRestRequest). That will allow us to move forward with some dependent work, and once your work is done it will plug right in.

nassipkali commented 1 year ago

Hey @nassipkali I'm going to take on the 2nd part of this issue (creating the new RestRequest and integrating it into SDK) to take some of the time pressure off you to complete the first part (copying the headers over in the ExtensionRestRequest). That will allow us to move forward with some dependent work, and once your work is done it will plug right in.

Ok then I send pull request to OpenSearch

nassipkali commented 1 year ago

Hey @nassipkali thanks for stepping up!

When I initially implemented Rest Handling on Extensions I used one class (ExtensionRestRequest) for both the transport communication and for the Rest handlers. This required copying over a bunch of methods, but as this issue points out, we need even more.

So the approach we will be taking is twofold:

1. Keep the `ExtensionRestRequest` class as a means of communicating the components of a REST request from Opensearch to the extension

2. When we receive and deserialize the request, we will instantiate a new `RestRequest` object using the constructor mentioned [in this comment](https://github.com/opensearch-project/opensearch-sdk-java/issues/431#issuecomment-1476973096) which already has all the needed methods.

So here's a step-by-step:

1. Modify the [`ExtensionRestRequest`](https://github.com/opensearch-project/OpenSearch/blob/f40fa82e3667825fd0ba8d4bb17b6959ff7cc4f1/server/src/main/java/org/opensearch/extensions/rest/ExtensionRestRequest.java#L37) class (on OpenSearch) to include a `Map<String, List<String>> headers`.  You will add an instance field, add a parameter to the constructor, add a getter, and add processing the map to the `writeTo()` method and the constructor taking a Streaminput Param.    Please add a comment in the javadoc saying to not include security sensitive information in the headers!

2. Update the [`RestSendToExtensionsAction`](https://github.com/opensearch-project/OpenSearch/blob/f40fa82e3667825fd0ba8d4bb17b6959ff7cc4f1/server/src/main/java/org/opensearch/extensions/rest/RestSendToExtensionAction.java#L171-L173) class (On OpenSearch) where it sends the `ExtensionRestRequest`.   Create a map here to send as a parameter, but do not just copy the RestRequest headers.  For now we only want the `Content Type` header.

3. On SDK, update the [ExtensionsRestRequestHandler](https://github.com/opensearch-project/opensearch-sdk-java/blob/main/src/main/java/org/opensearch/sdk/handlers/ExtensionsRestRequestHandler.java) class.  You will be inserting new code at the top of this method:
   https://github.com/opensearch-project/opensearch-sdk-java/blob/ded6ed00cd06cb4d710a356d877e31a346925660/src/main/java/org/opensearch/sdk/handlers/ExtensionsRestRequestHandler.java#L51

You will use the getters on the parameter ExtensionRestRequest request to retrieve the information that was sent in steps 1 and 2 above, and you will create a new instance of RestRequest from OpenSearch using this constructor:

    protected RestRequest(
        NamedXContentRegistry xContentRegistry,
        Map<String, String> params,
        String path,
        Map<String, List<String>> headers,
        HttpRequest httpRequest,
        HttpChannel httpChannel
    ) {
1. The `params` and `path` are already being sent, and you will be adding headers in the first steps above, just use the getters on the request.

2. For the xContentRegistry, you will change [the constructor](https://github.com/opensearch-project/opensearch-sdk-java/blob/ded6ed00cd06cb4d710a356d877e31a346925660/src/main/java/org/opensearch/sdk/handlers/ExtensionsRestRequestHandler.java#L41) to add a parameter `SDKNamedXContentRegistry` to save as a private instance field, and populate this parameter in the `RestRequest using the `getRegistry()` method on it.

3. You won't use the httpChannel, so you can leave that null.

4. That leaves the `HttpRequest` which is an interface.  You'll need to have an implementing class that implements the required methods.  I haven't exactly figured out how I want to do this yet, so feel free to try something yourself here.

   * We may want to create a separate implementing class in the SDK, like `SDKHttpRequest implements HttpRequest`).  Currently we have implemented `method()` and `content()` (see `ExtensionRestRequest`).  For the other things, you can either return sensible defaults (we don't need `strictCookies()`) or do further modifications to the serialized info you're sending with the `ExtensionRestRequest` such as  protocol version.

This should be enough info to get started, let me know if you have any questions!

How I can use constructor of RestRequest if it has protected modifier?

dbwiddis commented 1 year ago

How I can use constructor of RestRequest if it has protected modifier?

Just create the SDKHttpRequest similar to (or using the code in) #602 and then pass it as a parameter to the static constructor for the RestRequest:

public static RestRequest request(NamedXContentRegistry xContentRegistry, HttpRequest httpRequest, HttpChannel httpChannel)
dbwiddis commented 1 year ago

And to be clear on the dividing line between our work, I will expect you to create that RestRequest object in the ExtensionRestRequestHandler class handleRestExecuteOnExtensionRequest() method but do nothing with it (yet). I will take over changing the class param in the line ExtensionRestResponse response = restHandler.handleRequest(request); and change from the existing request (the ExtensionRestRequest) to the "new" request you are creating, along with any follow-on integration.

dbwiddis commented 1 year ago

I spent a few minutes seeing what would happen if I passed around the RestRequest object where we currently pass around ExtensionRestRequest and found somewhat of a blocker. We've already used that class in our ExtensionRestResponse class, where we've passed around the consumed parameters. And that class has been released in some OpenSearch 2.x releases as API. Even though nobody else has probably been using them, we're possibly breaking BWC guarantees here.

Also looking at that class there are a lot of things we just don't need. So while I still think we need that object, I think we can just have an object wrapped inside the ExtensionRestRequest object that we can delegate method calls to. That will keep the same API we've been using and we can just add a few more methods as we need them. We can just add a setRestRequest() method to insert the new class we're creating into the ExtensionRestRequest.

Also since we're not passing it around anywhere, we can just use an anonymous subclass, like this code in ExtensionsRestRequestHandler:

public RestExecuteOnExtensionResponse handleRestExecuteOnExtensionRequest(final ExtensionRestRequest request) {

ExtensionRestHandler restHandler = extensionRestPathRegistry.getHandler(request.method(), request.path());
if (restHandler == null) {
    return new RestExecuteOnExtensionResponse(
        NOT_FOUND,
        TEXT_CONTENT_TYPE,
        String.join(" ", "No handler for", request.method().name(), request.path()).getBytes(UTF_8),
        emptyMap(),
        emptyList(),
        false
    );
}

request.setRestRequest(RestRequest.request(
    sdkXContentRegistry.getRegistry(),
    new HttpRequest() {

        @Override
        public Method method() {
            return request.method();
        }

        @Override
        public String uri() {
            // for now, path strips query off uri but probably want to pass the whole uri
            return request.path();
        }

        @Override
        public BytesReference content() {
            return request.content();
        }

        @Override
        public Map<String, List<String>> getHeaders() {
            // for now, once we have a header object can use that
            return Map.of("Content-Type", List.of(request.getXContentType().mediaType()));
        }

        @Override
        public List<String> strictCookies() {
            return Collections.emptyList();
        }

        @Override
        public HttpVersion protocolVersion() {
            // This is two integers, should add to the request
            return null;
        }

        @Override
        public HttpRequest removeHeader(String header) {
            // we don't use
            return null;
        }

        @Override
        public HttpResponse createResponse(RestStatus status, BytesReference content) {
            return null;
        }

        @Override
        public Exception getInboundException() {
            // we don't use
            return null;
        }

        @Override
        public void release() {}

        @Override
        public HttpRequest releaseAndCopy() {
            // we don't use
            return null;
        }
    },
    null
);
dbwiddis commented 1 year ago

And after playing with this a bit more, the best option is not to put this in the request handler, but in the StreamInput constructor of the ExtensionRestRequest object where we have access to all these things.

dbwiddis commented 1 year ago

(We still need to set the xContentRegistry in the handler.)

dbwiddis commented 1 year ago

And after going around in circles I'm back to realizing changing the ExtensionRestRequest is already doing the same BWC breaking stuff that I was trying to avoid, and making it a lot more complicated. SO back to the original plan of a new SDKRestContent object.

nassipkali commented 1 year ago

And after going around in circles I'm back to realizing changing the ExtensionRestRequest is already doing the same BWC breaking stuff that I was trying to avoid, and making it a lot more complicated. SO back to the original plan of a new SDKRestContent object.

What does SDKRestContent?

dbwiddis commented 1 year ago

Typo, or lack of coffee. I was still thinking xContentRegistry. WE need a RestRequest (or subclass) object and we can construct it using a SDKHttpRequest which is the direction you are already going.

dbwiddis commented 1 year ago

Fixed in #643 and companion PRs.