elastic / elasticsearch-hadoop

:elephant: Elasticsearch real-time search and analytics natively integrated with Hadoop
https://www.elastic.co/products/hadoop
Apache License 2.0

Append dynamic custom headers for http requests #626

Open markoutso opened 8 years ago

markoutso commented 8 years ago

I am using Amazon's Elasticsearch service (http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/what-is-amazon-elasticsearch-service.html) through Spark.

For security reasons each request must be signed; see http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/what-is-amazon-elasticsearch-service.html#signing-requests and http://docs.aws.amazon.com/general/latest/gr/sigv4-signed-request-examples.html. It would be nice if I could define the custom headers that Amazon requires for each request whenever I call the service from Spark.
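To make concrete why the headers have to be dynamic: a Signature Version 4 signature depends on the method, path, body, and timestamp of each individual request, so it cannot be set once as a static header. The following is a minimal sketch of the header computation, not code from es-hadoop. The function names and the example credentials are made up for illustration, and it assumes that `path` and `query` are already in canonical (URI-encoded, sorted) form and that only `host` and `x-amz-date` are signed.

```python
import datetime
import hashlib
import hmac


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key by chaining HMAC-SHA256 over the scope parts."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sigv4_headers(method, host, path, query, body, access_key, secret_key,
                  region, service="es", now=None):
    """Return the headers to add to one request (hypothetical helper).

    Assumes `path` and `query` are already canonical and `body` is bytes.
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")

    # Canonical request: only host and x-amz-date are signed in this sketch.
    canonical_headers = f"host:{host}\nx-amz-date:{amz_date}\n"
    signed_headers = "host;x-amz-date"
    payload_hash = hashlib.sha256(body).hexdigest()
    canonical_request = "\n".join(
        [method, path, query, canonical_headers, signed_headers, payload_hash])

    # String to sign, scoped to date/region/service.
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    key = signing_key(secret_key, date_stamp, region, service)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return {
        "x-amz-date": amz_date,
        "Authorization": (
            f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
            f"SignedHeaders={signed_headers}, Signature={signature}"
        ),
    }
```

Calling `sigv4_headers("GET", host, "/_search", "", b"", access_key, secret_key, "us-east-1")` twice with different bodies or timestamps yields different `Authorization` values, which is the behaviour a static header setting cannot provide.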

ebuildy commented 8 years ago

As a workaround you could use an HTTP proxy that adds the missing headers.
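A rough sketch of what such a proxy could look like, using only the Python standard library. Everything here is illustrative: the `UPSTREAM` endpoint, the `extra_headers` hook, and the port are assumptions, and a real deployment would need streaming, timeouts, and proper error handling. The job would point its Elasticsearch node setting at the proxy instead of the real endpoint.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical upstream endpoint; replace with the real cluster URL.
UPSTREAM = "https://search-example.us-east-1.es.amazonaws.com"

# Headers that must not be copied verbatim to the upstream request.
_SKIP = {"host", "content-length", "connection", "accept-encoding"}


def extra_headers(method, path, body):
    """Hypothetical hook: compute the headers to add to this request."""
    return {"X-Custom-Header": "value"}


def build_upstream_request(method, path, headers, body,
                           upstream=UPSTREAM, hook=extra_headers):
    """Copy the incoming request and append the headers produced by the hook."""
    req = urllib.request.Request(upstream + path, data=body or None, method=method)
    for name, value in headers.items():
        if name.lower() not in _SKIP:
            req.add_header(name, value)
    for name, value in hook(method, path, body).items():
        req.add_header(name, value)
    return req


class HeaderAddingProxy(BaseHTTPRequestHandler):
    def _forward(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else b""
        req = build_upstream_request(
            self.command, self.path, dict(self.headers.items()), body)
        try:
            resp = urllib.request.urlopen(req)
        except urllib.error.HTTPError as err:
            resp = err  # an HTTPError can be read like a normal response
        payload = resp.read()
        self.send_response(resp.getcode())
        self.send_header("Content-Type",
                         resp.headers.get("Content-Type", "application/json"))
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_GET = do_POST = do_PUT = do_DELETE = do_HEAD = _forward


def run(port=9200):
    """Start the proxy on localhost (blocks until interrupted)."""
    ThreadingHTTPServer(("127.0.0.1", port), HeaderAddingProxy).serve_forever()
```

The downside of this approach is that every worker needs a reachable proxy, which adds a component to deploy and monitor alongside the job.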

markoutso commented 8 years ago

Thanks for your response. I am considering that but I am also searching for a simpler solution.

markoutso commented 8 years ago

Hello again, I have a working version of this feature. I am thinking of submitting a pull request, but I am sure that my code is not the proper way to do it. I have just made the minimum amount of changes to the original code. How would you go about implementing this? Is it something that the community is interested in? If so, I can work to make the code better and cleaner so that it can be merged.

costin commented 8 years ago

Apologies for the late reply. @sstergou contributions are always welcome, even in prototype form (see the contribution guide in the root of the project - it also pops up for each PR). As for the code, it is important that the feature is generic (not tied to Amazon or other libraries) and works across all integrations (not just Spark). I can handle the latter but the former is pretty crucial - an example or POC should be enough to try out the idea.

Cheers,

markoutso commented 8 years ago

A nice way to implement this is to use an interceptor for the Apache HTTP client. Unfortunately this feature is not available in the old 3.0.1 client, which is forced as a dependency. I suppose that there is a reason behind that, but would it be possible to update the HTTP client without breaking anything?

costin commented 8 years ago

There are various ways to add the extra headers; however, what entity produces the headers? Each job can run on multiple machines, so how can that entity be instantiated (and configured) there in order to start signing/enriching the requests?

dwyerk commented 8 years ago

@sstergou would you consider posting your branch on your fork? I'm running into the same issue and would be happy to work on it with you.

markoutso commented 8 years ago

@dwyerk The code is not optimal but it works. If you have any questions we can discuss them on the fork's page. https://github.com/sstergou/elasticsearch-hadoop/tree/aws-sign

okulkarni-weekendr commented 6 years ago

@costin has there been any update on adding signature v4 support to es-hadoop?

jbaiera commented 6 years ago

@ovk23 There has not been any work done in regards to signature v4 support at this time.

hekaldama commented 6 years ago

@ovk23 did you figure out a workaround at all? If so, would you be able to share what you did? I tried using https://github.com/inreachventures/aws-signing-request-interceptor but I am pretty sure elasticsearch-hadoop doesn't support the interceptor pattern / interface (I am new here)?

Jared-Prime commented 5 years ago

Hi @costin or @jbaiera, are there any plans to add signature v4 support? This would be a very useful feature for AWS users.

fkirill commented 5 years ago

Please have a look at PR https://github.com/elastic/elasticsearch-hadoop/pull/1240

It establishes a generic framework that is functionally somewhat similar to HttpInterceptor, but is specifically suited for this use-case and integrated into the configuration properties.

The rest (the original AWS SigV4 request signing) needs to be implemented as a separate module to avoid taking a dependency on AWS-specific libraries from this module.
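To illustrate the general idea of a configuration-driven hook (this is not the code from the PR, and the setting name below is invented for the example): the job configuration carries only the dotted name of a provider class as a plain string, so it can be shipped to every machine, and each task instantiates the provider locally and asks it for headers per request. A vendor-specific signer can then live in its own module.

```python
import importlib
from typing import Callable, Dict

# A provider maps (method, path, body) to the extra headers for one request.
HeaderProvider = Callable[[str, str, bytes], Dict[str, str]]

# Hypothetical setting name, invented for this sketch.
PROVIDER_KEY = "example.http.header.provider"


def load_provider(settings: Dict[str, str]) -> HeaderProvider:
    """Instantiate the provider named in the settings on the local machine.

    The settings are plain strings, so they serialize with the job; the
    provider object itself is created on each worker and never shipped.
    """
    dotted = settings.get(PROVIDER_KEY)
    if not dotted:
        # No provider configured: add no headers.
        return lambda method, path, body: {}
    module_name, _, attr = dotted.rpartition(".")
    factory = getattr(importlib.import_module(module_name), attr)
    # The provider receives the full settings so it can configure itself.
    return factory(settings)


def enrich(headers: Dict[str, str], provider: HeaderProvider,
           method: str, path: str, body: bytes) -> Dict[str, str]:
    """Return a copy of the request headers with the provider's headers added."""
    merged = dict(headers)
    merged.update(provider(method, path, body))
    return merged
```

With this shape, the core module only knows about the `HeaderProvider` contract, and a SigV4 signer would be one implementation packaged separately.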

indranil-nanda commented 4 years ago

Hi, this issue has been open since 2015. Is there any plan to fix it, or a simple workaround? Earlier, AWS supported both IP-based and role-based authorization for AWS Elasticsearch running in a VPC, but now it is role-based only. So I believe there is more need for this fix.

nfx commented 4 years ago

subscribing to comments. I wonder why #1240 didn't go through.

codefromthecrypt commented 4 years ago

@xeraa ps this is really hurting zipkin. Can someone at Elastic make it possible for someone to fix this?

nvander1 commented 3 years ago

@mihirsoni Does AWS have any plans to help out with elasticsearch-hadoop now that the opensearch project exists? @opendistro-for-elasticsearch

eshu commented 2 years ago

We still need it... :(