opensearch-project / logstash-output-opensearch

A Logstash plugin that sends event data to an OpenSearch cluster and stores it as an index.
https://opensearch.org/docs/latest/clients/logstash/index/
Apache License 2.0

403 errors when delivering logs to AWS OpenSearch Serverless after a successful period #228

Open steven-cherry opened 12 months ago

steven-cherry commented 12 months ago

Describe the bug: I'm attempting to deliver logs to AWS OpenSearch Serverless. I'm running Logstash as a deployment on AWS EKS, and I'm attempting to use the IAM role attached to the EKS EC2 node that runs the associated pod to authenticate with OpenSearch Serverless. When I start the deployment/pod, it successfully delivers messages into OpenSearch Serverless; however, after a short period (20 seconds to 5 minutes), logs fail to be delivered with 403 errors, e.g.

[2023-08-31T11:18:47,585][ERROR][logstash.outputs.opensearch][main][43ac7955e25a1efb882bfe67309ff3cf447bfc3b85dc94a4119f84872473b07b] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.eu-west-1.aoss.amazonaws.com:443/_bulk", :content_length=>52619}

If I stop the deployment/pod and start it again, the process repeats itself: logs are delivered for a short period, after which they are rejected with 403 errors.

My output config is as follows:

output {
  opensearch {
    hosts => ["@Model.OpenSearchHost"]
    index => "audit-%{[@metadata][index]}"
    action => "create"
    auth_type => {
      type => 'aws_iam'
      aws_access_key_id => ''
      aws_secret_access_key => ''
      service_name => 'aoss'
      region => 'eu-west-1'
    }
    default_server_major_version => 2
    legacy_template => false
  }
}

To Reproduce: See above.

Expected behavior: Logs should be delivered to OpenSearch Serverless consistently.

Plugins: none

Screenshots: none

Host/Environment (please complete the following information):

Additional context: none

dblock commented 12 months ago

Is there a more detailed error behind this 403? In logstash-output-opensearch 2.0.2 I fixed https://github.com/opensearch-project/logstash-output-opensearch/pull/217, which would have produced this kind of error. Let's start by making sure the plugin version is up to date.
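
A minimal sketch for checking and updating the plugin with the stock logstash-plugin tool (run from the Logstash home directory; exact paths vary per install):

# Show the currently installed version of the output plugin
bin/logstash-plugin list --verbose logstash-output-opensearch

# Update to the latest published release
bin/logstash-plugin update logstash-output-opensearch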

chadmyers commented 12 months ago

We're getting the same errors -- this and the cluster UUID error in opensearch-project/opensearch-build#186

We ensured that we're using the latest (2.0.2) version of the plugin.

Has anyone got logstash-output-opensearch working against OpenSearch Serverless? What I'm trying to determine is whether the problem is on our end (misconfiguration of OSS data access or network policies, etc.) or whether there's a problem with the plugin. Thanks!

dblock commented 12 months ago

@chadmyers

  1. Could you please copy and paste the actual details of the 403 error you're getting from logstash-output-opensearch (one that says more than "it's a 403 and will retry")?
  2. Enable debug logging that shows the entry being sent that causes the 403, so we can see the data (a sketch for turning that on follows this list).
  3. opensearch-project/opensearch-build#186 is a real problem that we/you/me/someone should fix: serverless doesn't have the concept of a cluster, so the plugin errors when polling AOSS.
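
A minimal sketch for item 2, assuming the Logstash API is listening on its default port 9600 (the wire-level logger name is the Apache HttpClient one that shows up later in this thread):

# Raise the output plugin's log level at runtime
curl -XPUT 'http://localhost:9600/_node/logging?pretty' -H 'Content-Type: application/json' -d '{"logger.logstash.outputs.opensearch": "DEBUG"}'

# Log raw request/response headers and bodies (very verbose)
curl -XPUT 'http://localhost:9600/_node/logging?pretty' -H 'Content-Type: application/json' -d '{"logger.org.apache.http.wire": "DEBUG"}'
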
dlvenable commented 12 months ago

@chadmyers, the status code 403 indicates that the user making the request (as identified by the access key/secret key) does not have permission to access the resource. Enabling logging as @dblock suggested may yield some additional information, but it appears that your user does not have permission to write documents to the AOSS collection.

dblock commented 12 months ago

@dlvenable This bug says that it works for a while, then stops with a 403, so the user does have permissions. Similarly, in opensearch-project/opensearch-build#207 we also had a 403, but the problem was not permissions but incorrect signing in logstash-output-opensearch; so we need to know what the actual server-side error is that causes the first 403, and what actual data is being sent (to see if we can reproduce inserting that one record).

chadmyers commented 12 months ago

Example debug log of a 403:

[2023-09-12T17:08:50,783][DEBUG][org.apache.http.wire     ] http-outgoing-5 >> "{"server_name":"REDACTED","region":"us-west-2","relayer":"managed","program":"search","tags":["tcpjson","nxlog","log4net_dated","Dovetail","customer","REDACTED","test_opsgenie_heartbeat"],"host":"172.REDACTED","ec2_instance_id":"i-REDACTED","log_message":"REDACTED","type":"dovetail-log4net","source_server":"REDACTED (i-REDACTED)","@timestamp":"2023-09-12T17:07:03.237Z","level":"INFO","port":62153,"message":"REDACTED"}[\n]"
[2023-09-12T17:08:50,791][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "HTTP/1.1 403 Forbidden[\r][\n]"
[2023-09-12T17:08:50,791][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "x-request-id: 42972d1c-ecd9-9d4d-bb77-1dfd1a7114bf[\r][\n]"
[2023-09-12T17:08:50,791][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "x-amzn-aoss-test-account-id: REDACTED[\r][\n]"
[2023-09-12T17:08:50,791][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "x-amzn-aoss-test-collection-id: REDACTED[\r][\n]"
[2023-09-12T17:08:50,791][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "content-type: application/json[\r][\n]"
[2023-09-12T17:08:50,793][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "date: Tue, 12 Sep 2023 17:08:50 GMT[\r][\n]"
[2023-09-12T17:08:50,793][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "content-length: 121[\r][\n]"
[2023-09-12T17:08:50,793][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "x-aoss-response-hint: X01:gw-helper-deny[\r][\n]"
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "server: aoss-amazon[\r][\n]"
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "[\r][\n]"
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.wire     ] http-outgoing-5 << "{"status":403,"request-id":"REDACTED","error":{"reason":"403 Forbidden","type":"Forbidden"}}[\n]"
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << HTTP/1.1 403 Forbidden
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << x-request-id: REDACTED
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << x-amzn-aoss-test-account-id: REDACTED
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << x-amzn-aoss-test-collection-id: REDACTED
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << content-type: application/json
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << date: Tue, 12 Sep 2023 17:08:50 GMT
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << content-length: 121
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << x-aoss-response-hint: X01:gw-helper-deny
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.headers  ] http-outgoing-5 << server: aoss-amazon
[2023-09-12T17:08:50,794][DEBUG][org.apache.http.impl.execchain.MainClientExec] Connection can be kept alive indefinitely
[2023-09-12T17:08:50,795][DEBUG][org.apache.http.impl.conn.PoolingHttpClientConnectionManager] Connection [id: 5][route: {s}->https://REDACTED.us-west-2.aoss.amazonaws.com:443] can be kept alive indefinitely
[2023-09-12T17:08:50,795][DEBUG][org.apache.http.impl.conn.DefaultManagedHttpClientConnection] http-outgoing-5: set socket timeout to 0
[2023-09-12T17:08:50,795][DEBUG][org.apache.http.impl.conn.PoolingHttpClientConnectionManager] Connection released: [id: 5][route: {s}->https://REDACTED.us-west-2.aoss.amazonaws.com:443][total available: 1; route allocated: 1 of 100; total allocated: 1 of 1000]
[2023-09-12T17:08:50,796][ERROR][logstash.outputs.opensearch] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>2691, :body=>"{\"status\":403,\"request-id\":\"REDACTED\",\"error\":{\"reason\":\"403 Forbidden\",\"type\":\"Forbidden\"}}\n"}
chadmyers commented 12 months ago

This is another 403 error we get when Logstash is starting up, as it tries to create the index template.

_index_template Error:
Sep 12 15:10:16 ip-172-30-0-227.us-west-2.compute.internal logstash[306234]: [2023-09-12T15:10:16,459][ERROR][logstash.outputs.opensearch] Failed to install template {:message=>"Got response code '403' contacting OpenSearch at URL 'https://REDACTED.us-west-2.aoss.amazonaws.com:443/_index_template/logstash'", :exception=>LogStash::Outputs::OpenSearch::HttpClient::Pool::BadResponseCodeError, :backtrace=>["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client/manticore_adapter.rb:181:in `perform_request'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client/pool.rb:272:in `perform_request_to_url'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client/pool.rb:259:in `block in perform_request'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client/pool.rb:348:in `with_connection'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client/pool.rb:258:in `perform_request'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client/pool.rb:266:in `block in Pool'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client.rb:393:in `exists?'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client.rb:398:in `template_exists?'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/http_client.rb:78:in `template_install'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/template_manager.rb:37:in `install'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch/template_manager.rb:25:in `install_template'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch.rb:419:in `install_template'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch.rb:254:in `finish_register'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch.rb:231:in `block in register'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/plugin_mixins/opensearch/common.rb:83:in `block in after_successful_connection'"]}

Pipelines starting up:
Sep 12 15:10:24 ip-172-30-0-227.us-west-2.compute.internal logstash[306234]: [2023-09-12T15:10:24,003][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

After pipelines start, 403s start:
Sep 12 15:10:25 ip-172-30-0-227.us-west-2.compute.internal logstash[306234]: [2023-09-12T15:10:25,356][ERROR][logstash.outputs.opensearch] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>9388}
chadmyers commented 12 months ago

I'm following this workshop from AWS to try to create a repro scenario of sorts, or at least to take my environment out of the equation as much as possible. It uses a script to generate a fake httpd.log, which the logstash-output-opensearch plugin then ships into OpenSearch Serverless. I went through it and I'm still getting 403s from my OSS collection.

I edited my Data Access Policy to grant aoss:* for both collections and indexes to my Cloud9 IAM instance profile. I'm not sure how I can open up my Data Access Policy more.
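
One sanity check (a sketch; the policy name below is a placeholder) is to dump what's actually attached to the collection with the AWS CLI:

# List data access policies in the account, then dump the one covering the collection
aws opensearchserverless list-access-policies --type data
aws opensearchserverless get-access-policy --type data --name my-data-access-policy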

Here's my conf file:

input {
    file {
        path => "/home/ec2-user/environment/serverless-generators/httpd.log"
        start_position => "beginning"
    }
}
filter {
    grok {
      match => { "message" => "%{HTTPD_COMMONLOG}"}
    }
}
output {
    opensearch {
        ecs_compatibility => disabled
        index => "logstash-ingest-%{+YYYY.MM.dd}"
        hosts => "https://REDACTED.us-west-2.aoss.amazonaws.com:443"
        auth_type => {
            type => 'aws_iam'
            region => 'us-west-2'
            service_name => 'aoss'
        }
        legacy_template => true
        default_server_major_version => 2
        timeout => 300
    }
}

(NOTE: when I set legacy_template => false, I get an error about /_index_template. If I set it to true, I don't get that error. I think maybe OSS doesn't support index templates?)

And here's the logstash log output:

chadmyers:~/environment/logstash-8.9.0 $ ./bin/logstash -f logstash-generator.conf
Using bundled JDK: /home/ec2-user/environment/logstash-8.9.0/jdk
Sending Logstash logs to /home/ec2-user/environment/logstash-8.9.0/logs which is now configured via log4j2.properties
[2023-09-12T19:31:37,175][INFO ][logstash.runner          ] Log4j configuration path used is: /home/ec2-user/environment/logstash-8.9.0/config/log4j2.properties
[2023-09-12T19:31:37,184][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"8.9.0", "jruby.version"=>"jruby 9.3.10.0 (2.6.8) 2023-02-01 107b2e6697 OpenJDK 64-Bit Server VM 17.0.7+7 on 17.0.7+7 +indy +jit [x86_64-linux]"}
[2023-09-12T19:31:37,188][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djruby.compile.invokedynamic=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, -Djruby.regexp.interruptible=true, -Djdk.io.File.enableADS=true, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[2023-09-12T19:31:37,423][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
/home/ec2-user/environment/logstash-8.9.0/vendor/bundle/jruby/2.6.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:938: warning: constant Tilt::Cache is deprecated
[2023-09-12T19:31:37,815][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2023-09-12T19:31:38,457][INFO ][org.reflections.Reflections] Reflections took 260 ms to scan 1 urls, producing 132 keys and 464 values
[2023-09-12T19:31:40,556][INFO ][logstash.javapipeline    ] Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[2023-09-12T19:31:40,582][INFO ][logstash.outputs.opensearch][main] New OpenSearch output {:class=>"LogStash::Outputs::OpenSearch", :hosts=>["https://REDACTED.us-west-2.aoss.amazonaws.com:443"]}
[2023-09-12T19:31:40,797][INFO ][logstash.outputs.opensearch][main] OpenSearch pool URLs updated {:changes=>{:removed=>[], :added=>[https://REDACTED.us-west-2.aoss.amazonaws.com:443/]}}
[2023-09-12T19:31:41,080][WARN ][logstash.outputs.opensearch][main] Restored connection to OpenSearch instance {:url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/"}
[2023-09-12T19:31:41,103][INFO ][logstash.outputs.opensearch][main] Cluster version determined (2.0.0) {:version=>2}
[2023-09-12T19:31:41,108][WARN ][logstash.filters.grok    ][main] ECS v8 support is a preview of the unreleased ECS v8, and uses the v1 patterns. When Version 8 of the Elastic Common Schema becomes available, this plugin will need to be updated
[2023-09-12T19:31:41,164][ERROR][logstash.outputs.opensearch][main] Unable to retrieve OpenSearch cluster uuid {:message=>"undefined method `[]' for nil:NilClass", :exception=>NoMethodError, :backtrace=>["/home/ec2-user/environment/logstash-8.9.0/vendor/bundle/jruby/2.6.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/plugin_mixins/opensearch/common.rb:91:in `discover_cluster_uuid'", "/home/ec2-user/environment/logstash-8.9.0/vendor/bundle/jruby/2.6.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch.rb:253:in `finish_register'", "/home/ec2-user/environment/logstash-8.9.0/vendor/bundle/jruby/2.6.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/outputs/opensearch.rb:231:in `block in register'", "/home/ec2-user/environment/logstash-8.9.0/vendor/bundle/jruby/2.6.0/gems/logstash-output-opensearch-2.0.2-java/lib/logstash/plugin_mixins/opensearch/common.rb:83:in `block in after_successful_connection'"]}
[2023-09-12T19:31:41,175][INFO ][logstash.outputs.opensearch][main] Using a default mapping template {:version=>2, :ecs_compatibility=>:disabled}
[2023-09-12T19:31:41,195][INFO ][logstash.outputs.opensearch][main] Installing OpenSearch template {:name=>"logstash"}
[2023-09-12T19:31:41,293][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/home/ec2-user/environment/logstash-8.9.0/logstash-generator.conf"], :thread=>"#<Thread:0x5f068533@/home/ec2-user/environment/logstash-8.9.0/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[2023-09-12T19:31:41,948][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>0.65}
[2023-09-12T19:31:42,135][INFO ][logstash.inputs.file     ][main] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/home/ec2-user/environment/logstash-8.9.0/data/plugins/inputs/file/.sincedb_b65669b7933b025bbfd364309f68da4c", :path=>["/home/ec2-user/environment/serverless-generators/httpd.log"]}
[2023-09-12T19:31:42,142][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2023-09-12T19:31:42,147][INFO ][filewatch.observingtail  ][main][ef169e362146bc749f9ef74c22cbd7fc7ccba445e7a721438d7be1a98743b5b6] START, creating Discoverer, Watch with file and sincedb collections
[2023-09-12T19:31:42,156][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
/home/ec2-user/environment/logstash-8.9.0/vendor/bundle/jruby/2.6.0/gems/manticore-0.9.1-java/lib/manticore/client.rb:284: warning: already initialized constant Manticore::Client::HttpPost
[2023-09-12T19:33:40,853][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>15421}
[2023-09-12T19:33:40,869][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>19606}
[2023-09-12T19:33:42,934][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>15421}
[2023-09-12T19:33:42,943][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>19606}
[2023-09-12T19:33:47,007][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>19606}
[2023-09-12T19:33:47,009][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>15421}
[2023-09-12T19:33:55,067][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>15421}
[2023-09-12T19:33:55,075][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>19606}
[2023-09-12T19:34:11,119][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>19606}
[2023-09-12T19:34:11,125][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>15421}
[2023-09-12T19:34:43,155][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>15421}
[2023-09-12T19:34:43,190][ERROR][logstash.outputs.opensearch][main][7a242b0e6b582dd0155c45949caf60babe9b69f84c9c68ee9f2e0fe01410929f] Encountered a retryable error (will retry with exponential backoff) {:code=>403, :url=>"https://REDACTED.us-west-2.aoss.amazonaws.com:443/_bulk", :content_length=>19606}
chadmyers commented 12 months ago

I added this inline policy to the IAM instance profile of my Cloud9 instance:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOSSAPI",
            "Effect": "Allow",
            "Action": "aoss:APIAccessAll",
            "Resource": "arn:aws:aoss:us-west-2:REDACTED:collection/REDACTED"
        }
    ]
}

I waited a few minutes, then fired up Logstash again, to no avail. I still get 403s on /_bulk.

chadmyers commented 12 months ago

This is what I have in my Data Access Policy:

{
    "Rules": [
      {
        "Resource": [
          "collection/REDACTED"
        ],
        "Permission": [
          "aoss:*"
        ],
        "ResourceType": "collection"
      },
      {
        "Resource": [
          "index/*/*"
        ],
        "Permission": [
          "aoss:*"
        ],
        "ResourceType": "index"
      }
    ],
    "Principal": [
      "arn:aws:iam::REDACTED:role/log_indexer_server_role_us-west-2",
      "arn:aws:iam::REDACTED:role/service-role/AWSCloud9SSMAccessRole",
      "arn:aws:iam::REDACTED:role/aws-service-role/cloud9.amazonaws.com/AWSServiceRoleForAWSCloud9"
    ]
  }

I don't know how I can open that up any more.

The only other thing I can think of is that maybe the plugin isn't using the AWS IAM instance credentials of the EC2 instance this is running on?
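
A quick way to check which principal the instance credentials actually resolve to (a sketch), so it can be compared against the ARNs in the Data Access Policy:

# Which IAM principal do the instance's default credentials map to?
aws sts get-caller-identity

# Which role is attached to this instance's profile? (IMDSv1 style; IMDSv2 needs a session token)
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/

Note that get-caller-identity reports an assumed-role session ARN (arn:aws:sts::...:assumed-role/...), while the Data Access Policy lists the underlying IAM role ARN (arn:aws:iam::...:role/...).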

chadmyers commented 12 months ago

@dblock @dlvenable Do either of you (or anyone) know someone who's got the logstash-output-opensearch plugin working with OSS? Because as far as I can tell, I've got the OSS data access policy open wide, but I can't get the output plugin to return anything other than 403.

The only other thing I can think of is that it's a problem with Sigv4 somehow. Do you think it needs the IAM key and secret?

dblock commented 12 months ago

Do either of you (or anyone) know someone who's got the logstash-output-opensearch plugin working with OSS?

I don't know of anyone who has tried ATM, but I will ask around.

Since I'm familiar with the codebase, I will take a look, but I can't promise to do it fast.

kkmr commented 12 months ago

We definitely have gotten logstash working with OpenSearch Serverless. Here is a blog post from folks on the team: https://aws.amazon.com/blogs/big-data/migrate-your-indexes-to-amazon-opensearch-serverless-with-logstash/

chadmyers commented 12 months ago

@kkmr Thank you. There's an AWS workshop that's very similar to that one, which I followed with no luck. The only things I can see that are different are:

chadmyers commented 12 months ago

Do either of you (or anyone) know someone who's got the logstash-output-opensearch plugin working with OSS?

I don't know of anyone who has tried ATM, but I will ask around.

Since I'm familiar with the codebase, I will take a look, but I can't promise to do it fast.

Thank you!

I also had some thoughts about some tests I can run to try to isolate whether there's an issue with the plugin vs. an issue with OSS data access policies. I'll see if I can try to isolate the problem.

chadmyers commented 12 months ago

I'm ashamed/happy to admit that when doing the AWS workshop in Cloud9, the thing that was tripping me up was the "Use Temporary Credentials" default. If you uncheck that, Cloud9 uses the EC2 instance profile of the Cloud9 server, which is the principal I had configured in the Data Access Policy for OSS.

(screenshot of the Cloud9 temporary-credentials setting)

So I was able to get that workshop working, which makes me think the original problem was my Data Access Policy, with Cloud9's temporary credentials then sending me down a false trail.

I'm going to go back to my original/primary Logstash environment and see if I can get it all working, now that I've proven it CAN work and that both the plugin and OSS are fine. I'll report back with specifics.

chadmyers commented 12 months ago

OK, it's now working in my primary Logstash environment. I flailed around a lot yesterday, so I'm not sure which particular change fixed it, but I suspect it had to do with two things:

FYI: I still get the error about the cluster UUID, but it seems harmless.

My Data Access Policy had this rule:

{
    "Rules": [
      {
        "Resource": [
          "collection/*"
        ],
        "Permission": [
          "aoss:*"
        ],
        "ResourceType": "collection"
      },
      {
        "Resource": [
          "index/*/*"
        ],
        "Permission": [
          "aoss:*"
        ],
        "ResourceType": "index"
      }
    ],
    "Principal": [
      "arn:aws:iam::REDACTED:role/(the role of my logstash index server)",
      "arn:aws:iam::REDACTED:role/service-role/AWSCloud9SSMAccessRole",
      "arn:aws:iam::REDACTED:role/aws-service-role/cloud9.amazonaws.com/AWSServiceRoleForAWSCloud9"
    ]
  }

FYI: those Cloud9 roles were for my Cloud9 experiment and aren't required (I think only one of them was needed, but I'm not sure which, so I added both and it worked). You can remove the Cloud9 entries if you're not using Cloud9.

And then the output config in my logstash config file looked like:

output {
    opensearch {
        ecs_compatibility => disabled
        hosts => "https://REDACTED.us-west-2.aoss.amazonaws.com:443"
        auth_type => {
            type => 'aws_iam'
            region => 'us-west-2'
            service_name => 'aoss'
        }
        legacy_template => true
        default_server_major_version => 2
        timeout => 300
    }
}

So I think if you're getting 403 errors, there's something wrong with or missing from your Data Access Policy.
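
If the policy does need amending, here's a sketch with the AWS CLI (update-access-policy requires the current policy version; the names and file below are placeholders):

# Fetch the current policy version, then push a revised policy document
aws opensearchserverless get-access-policy --type data --name my-data-access-policy
aws opensearchserverless update-access-policy --type data --name my-data-access-policy --policy-version <version-from-get> --policy file://data-access-policy.json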

dblock commented 11 months ago

I'm glad you got it working @chadmyers. Thanks @kkmr for that link.

  1. Are there any open issues here still or can we close this?
  2. Is there anything that needs/should be documented (maybe in the README) that can help the next person? Care to help?
steven-cherry commented 11 months ago

I will try again today and report back. Please hold off on closing until then. Thanks.

steven-cherry commented 11 months ago

I can't easily test this, as the opensearchproject/logstash-oss-with-opensearch-output-plugin:latest container on Docker Hub is using version 2.0.1 of the output plugin:

$ docker run -it opensearchproject/logstash-oss-with-opensearch-output-plugin:latest bash
logstash@5ced04563387:~$ ls vendor/bundle/jruby/2.6.0/gems | grep opensearch
aws-sdk-opensearchserverless-1.4.0
aws-sdk-opensearchservice-1.23.0
logstash-output-opensearch-2.0.1-java

Would it be possible for someone on the project to publish an updated container image containing the latest version of the plugin on Docker Hub, @dblock? Thanks.

chadmyers commented 11 months ago
  1. Are there any open issues here still or can we close this?

Steven Cherry above asked us to hold it open while he does some more testing, but I think we have at least proven that the logstash-output-opensearch plugin (at least v2.0.2) can work against OpenSearch Serverless as of 13-SEP-2023.

I wanted to document my findings in this GH issue in case someone in the future hits the same 403 problems I was having (I saw a few other re:Post and Stack Overflow posts about this, so I don't think I'm the only one who struggled), so they know it is possible and that the fix is most likely a Data Access Policy tweak.

  2. Is there anything that needs/should be documented (maybe in the README) that can help the next person? Care to help?

I can help, yes. I'm thinking about what would be useful here -- maybe mentioning the cluster UUID known issue and the legacy_template thing? And also a bit along the lines of "If you're getting 403 errors on calls to /_bulk, make sure to check your Data Access Policy..."? I could draft something up and/or submit a PR if that's OK?

dblock commented 11 months ago

I could draft something up and/or submit a PR if that's OK?

Yes please

dblock commented 11 months ago

Would it be possible for someone on the project to publish an updated container image containing the latest version of the plugin on Docker Hub, @dblock?

I don't know how to do that, so I opened https://github.com/opensearch-project/logstash-output-opensearch/issues/230.

I think you should be able to update it inside the Docker container too, to test, but I'm not sure how it's packaged in there or what that takes. If you do figure it out, please post it here.
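
One untested sketch: layer the plugin update onto the published image (this assumes the image keeps the stock Logstash layout, with /usr/share/logstash as the working directory):

cat > Dockerfile <<'EOF'
FROM opensearchproject/logstash-oss-with-opensearch-output-plugin:latest
RUN bin/logstash-plugin update logstash-output-opensearch
EOF
docker build -t logstash-oss-opensearch-updated .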

steven-cherry commented 11 months ago

@dblock I managed to try logstash-output-opensearch 2.0.2, as you suggested:

In logstash-output-opensearch 2.0.2 I fixed https://github.com/opensearch-project/logstash-output-opensearch/pull/217 which would have produced this kind of error. Let's start with making sure the plugin version is updated?

I also tried using static credentials:

      aws_access_key_id => <REDACTED>
      aws_secret_access_key => <REDACTED>

But in both cases I still have the same problem I started with:

When I start the deployment/pod up it successfully delivers messages into Opensearch serverless, however after a short period, (20 seconds - 5 minutes) logs fail to be delivered to Opensearch serverless with 403 errors e.g.

Still no further forward I'm afraid.

dblock commented 11 months ago

@steven-cherry OK, can you start by locating the first error and extracting the log (hopefully at debug level) around it?

dblock commented 6 months ago

@steven-cherry Did you give up on this, or did you ever make it work? I fixed the harmless UUID error in #237.

steven-cherry commented 6 months ago

@dblock No, I gave up in the end.