logstash-plugins / logstash-output-elasticsearch

https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
Apache License 2.0

LogStash needs to specify the index in the bulk request URL to comply with Elastic's recommendation for URL-based Access Control #194

Open cfeio opened 9 years ago

cfeio commented 9 years ago

Background: How Elasticsearch supports URL-based access control

Elasticsearch is designed to allow URL-based access control to secure access to Elasticsearch indices. This is especially useful in a multitenant environment where tenant A should be able to access only their own indices, not tenant B's.

By default, multi-search, multi-get, and bulk requests allow the user to specify an index in the URL and on each individual request within the request body. This makes URL-based access control challenging.

To prevent the user from overriding the index that is specified in the URL, Elastic allows you to add this setting to the config.yml file:

rest.action.multi.allow_explicit_index: false

The default value is true, but when set to false, Elasticsearch will reject requests that have an explicit index specified in the request body. This means the requests must specify the index in the URL.

Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/url-access-control.html

LogStash's usage of the Bulk API

LogStash's ElasticSearch output uses the bulk index API for improved indexing performance. Here is the bulk API documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html

From the documentation we see the different bulk endpoints: /_bulk, /{index}/_bulk, and /{index}/{type}/_bulk. LogStash uses /_bulk, so the URL never names the index the data will be sent to.
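
To make the difference between these endpoints concrete, here is a minimal Python sketch of the two request shapes, based on the bulk API's newline-delimited JSON format. The function names, the event shape, and the sample index are hypothetical, and nothing is sent over the network; this is an illustration, not how the plugin itself builds requests.

```python
import json

def explicit_index_bulk(events):
    """Build a request for /_bulk: every action line names its own index.

    `events` is a list of (index_name, document) pairs (hypothetical shape).
    """
    lines = []
    for index_name, doc in events:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body is newline-delimited JSON and must end with a newline.
    return "/_bulk", "\n".join(lines) + "\n"

def url_scoped_bulk(index_name, docs):
    """Build a request for /{index}/_bulk: the index appears only in the URL."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # no _index in the body
        lines.append(json.dumps(doc))
    return f"/{index_name}/_bulk", "\n".join(lines) + "\n"

path, body = url_scoped_bulk("tenant-a", [{"message": "hello"}])
print(path)
print(body, end="")
```

With rest.action.multi.allow_explicit_index set to false, only the second shape would be accepted, because the first one names an index in the request body.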

How LogStash's usage of the _bulk API prevents URL-based access security

LogStash's use of the /_bulk API without specifying an index in the URL makes URL-based access security impossible. If we give a basic auth account access to the /_bulk URL so LogStash can hit it, then any user with those credentials can insert data into, and delete data from, any index rather than just their own.

Proposed solution so LogStash is compatible with ElasticSearch's URL based access security

In order to comply with the URL based access control that ElasticSearch has supported, bulk requests from LogStash need to include the index in the request URL instead of in the body.

Because the index is configured in the output section, LogStash already has knowledge of which index it is sending data to. LogStash needs to add that index to the url (/{index}/_bulk) instead of specifying it in the body and using /_bulk.

talevy commented 9 years ago

This is a bit tricky. Although the plugin has an :index parameter configured, its value can be dynamically generated per event. So it is not necessarily known or uniform across all events being sent in the bulk request.
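
As an illustration of why a single batch can span several indices, here is a small Python sketch. It assumes a date-based index pattern along the lines of logstash-%{+YYYY.MM.dd}, approximated here with strftime; the event shape and the resolve_index helper are hypothetical and are not the plugin's actual code.

```python
from datetime import datetime, timezone

def resolve_index(event, pattern="logstash-%Y.%m.%d"):
    """Derive the target index from the event's own timestamp (assumed field)."""
    return event["@timestamp"].strftime(pattern)

# Two events that land in the same batch but fall on different days.
batch = [
    {"@timestamp": datetime(2015, 7, 1, 23, 59, 59, tzinfo=timezone.utc), "message": "a"},
    {"@timestamp": datetime(2015, 7, 2, 0, 0, 1, tzinfo=timezone.utc), "message": "b"},
]

indices = {resolve_index(event) for event in batch}
print(sorted(indices))
```

Because the two events resolve to different index names, no single /{index}/_bulk URL could cover this batch.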

cfeio commented 9 years ago

Thanks for the quick response talevy. What you are saying makes sense, so perhaps my proposed solution isn't the right way to resolve this.

The bottom line is that if a user wants to use Elastic's URL based access control, they need to set this setting:

rest.action.multi.allow_explicit_index: false

If this is set, LogStash can no longer send data into the cluster.

We would really like to see LogStash support an ElasticSearch setup with URL-based access control. We don't particularly care about the specifics of how this is accomplished, as long as we can configure LogStash to be compatible with the access control.

brendangibat commented 8 years ago

Just a heads up, this ticket may be more important to a few other Logstash users since AWS started providing a hosted ES solution. Their suggestion is to have that setting be false. Others following best recommended practices of AWS would need this functionality as well.

http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-createupdatedomains.html

andrewvc commented 8 years ago

It may be simpler to just add an option, separate_requests_per_index, that splits up the request based on the indexes present and issues N bulk requests for N indexes. @talevy what do you think?
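
A rough sketch of that splitting idea, in Python for illustration only: group the batch by resolved index, then build one /{index}/_bulk request per group with no index in the body. The separate_requests_per_index option is only a proposal in this thread, and the function name and event shape below are hypothetical.

```python
import json
from collections import defaultdict
from urllib.parse import quote

def split_bulk_by_index(events):
    """Turn (index_name, document) pairs into one bulk request per index.

    Returns a list of (url_path, ndjson_body) tuples; nothing is sent.
    """
    grouped = defaultdict(list)
    for index_name, doc in events:
        grouped[index_name].append(doc)

    requests = []
    for index_name, docs in grouped.items():
        lines = []
        for doc in docs:
            lines.append(json.dumps({"index": {}}))  # index comes from the URL
            lines.append(json.dumps(doc))
        path = f"/{quote(index_name, safe='')}/_bulk"
        requests.append((path, "\n".join(lines) + "\n"))
    return requests

batch = [("logs-a", {"m": 1}), ("logs-b", {"m": 2}), ("logs-a", {"m": 3})]
for path, body in split_bulk_by_index(batch):
    print(path, len(body.splitlines()) // 2, "documents")
```

The trade-off is more HTTP requests per batch when events fan out across many indices, in exchange for every request being checkable by URL alone.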

Kaabo commented 8 years ago

I am new to this community. We are also facing this issue. As an admin, I should be able to upload data into different ES indexes with Logstash. At the same time, different tenants in our ES should not be able to see others' indexes in Kibana. Therefore we have set:

rest.action.multi.allow_explicit_index: false

Would be good to have this solved somehow.

ghost commented 7 years ago

A couple of years later, and it seems this problem still exists without a solution short of eliminating URL-based access controls. Am I missing something, or did this get fixed in some other way? I am seeing the same error message and have the same difficulty adding events to an AWS-hosted ES cluster.

jordansissel commented 7 years ago

@sgendler-stem This issue doesn't come up much internally from what I can remember. I'm open to adding support for it, though I do not have any estimate on how long it would take or when it would be worked on. If someone else provides a patch to implement this, I'm open to reviewing it.

andrewvc commented 7 years ago

@sgendler-stem I actually do think this works now with the new bulk_path option.

Can you give it a shot and let us know what you find?

ghost commented 7 years ago

I'll check it out. Unfortunately, I suspect the reason you don't hear a request for this feature very often is that it isn't possible to use this plugin with Amazon's hosted Elasticsearch, because requests need to be signed with Amazon IAM keys. So I am using the logstash-output-amazon-es plugin, and I just had to forgo preventing users from 'lying' about the index they are bulk inserting into, by allowing the index to be set on each event in a bulk request. I'm not sure if the amazon-es plugin is based on this plugin or not. Hopefully it is, which might mean I can concoct a way to build it based on a version that supports bulk_path, but I'm unsure. I may have to manually port the changes across. But based on the documentation, it certainly looks like the bulk_path option will correct the problem. I'm posting here mostly so that others might find this comment. If I can figure it out without too much time spent, I'll port the changes into the amazon-es output plugin and submit a pull request to that plugin as well.

TclasenITVT commented 6 years ago

I tested the bulk_path option yesterday and got a retryable error with code 400. I then tested a sample request using Postman, and that worked. I don't know if this observation is of any use to somebody out there, but it seems to me that Logstash is doing something differently, maybe sending the index by default in its request body even with the bulk_path option set, or something like that.