Norconex / committer-elasticsearch

Implementation of Norconex Committer for Elasticsearch.
https://opensource.norconex.com/committers/elasticsearch/
Apache License 2.0

Committer for AWS Elastic Search #27

Open santhoshgit13 opened 6 years ago

santhoshgit13 commented 6 years ago

Hi,

I am using the committer with my local Elasticsearch instance and it is working perfectly fine. I am now trying to commit to AWS ElasticSearch, so where should I give the AWS key and password to connect to AWS?

Is there any example or documentation for using AWS ElasticSearch instead of a local instance?

Thanks

essiembre commented 6 years ago

You should be able to commit to Elasticsearch on AWS without issues (we did it often). The configuration is the same as a local instance. However, you need to make sure that AWS Elasticsearch and the server where you crawl from are behind the same firewall and the crawler has access to AWS Elasticsearch.

wolverline commented 6 years ago

@essiembre If Elasticsearch is hosted on AWS, Norconex may not get through IAM role-based authentication. AWS Elasticsearch usually expects an access key and a secret key, like the ones you set in the Amazon CloudSearch committer:

<accessKey>...</accessKey>
<secretKey>...</secretKey>

In this case, do I need to extend the classes and add this? How do I go about this?

essiembre commented 6 years ago

Prior to using the Committer, how were you accessing your Elasticsearch?

You can definitely extend the class or write a custom Committer, but I recommend a simpler approach. You can configure your IAM to restrict access to the originating IP instead. If you make it the IP of the server where you run the Collector from, you should be fine.

You can find more details about this option here: https://aws.amazon.com/blogs/security/how-to-control-access-to-your-amazon-elasticsearch-service-domain/
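For illustration, here is a minimal sketch of what an IP-based access policy for the domain could look like, built and printed as JSON with Python. The region, account ID, domain name, and IP address are placeholders rather than values from this thread, and `build_ip_access_policy` is a hypothetical helper, not part of the Committer or of any AWS SDK. Treat it as a starting point under those assumptions and check it against the linked AWS article.

```python
import json


def build_ip_access_policy(domain_arn, allowed_ips):
    """Build a resource-based policy that allows Elasticsearch HTTP
    requests only from the given source IPs (CIDR notation).

    Hypothetical helper for illustration; all inputs are placeholders.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Anonymous principal: access is limited by the IP condition below.
                "Principal": {"AWS": "*"},
                "Action": "es:ESHttp*",
                "Resource": f"{domain_arn}/*",
                "Condition": {
                    "IpAddress": {"aws:SourceIp": list(allowed_ips)}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN and IP: replace with your own domain and the
    # public IP of the server running the Collector.
    policy = build_ip_access_policy(
        "arn:aws:es:us-east-1:123456789012:domain/my-domain",
        ["203.0.113.10/32"],
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the domain's access policy. Since requests from the listed IP are then accepted unsigned, the Committer configuration should not need an access key or secret key.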

jamieshiz commented 6 years ago

Hi @essiembre, I am having a similar issue. I am running the Norconex Collector locally in a Docker container and trying to use the ElasticSearch committer to push the indexed files to AWS ElasticSearch.

The CloudSearch Committer has clear documentation that corresponds to AWS endpoints such as:

<!-- Mandatory: -->
<documentEndpoint>...</documentEndpoint>

<!-- Mandatory if not configured elsewhere: -->
<accessKey>...</accessKey>
<secretKey>...</secretKey>

I am not seeing similar AWS endpoints within the ElasticSearch committer. Any help is appreciated. Thanks!