uken / fluent-plugin-elasticsearch

Support Opensearch v1.0.0 #915

Closed casabre closed 2 years ago

casabre commented 3 years ago

Problem

In July 2021, OpenSearch v1.0.0 was released; it is a fork of Elasticsearch v7.10.2. Unfortunately, the Elasticsearch plugin aborts with the following message:

2021-08-19 12:26:47.748600800 +0000 fluent.warn: {"retry_time":7,"next_retry_seconds":"2021-08-19 12:27:46 +0000","chunk":"5c9e8a284bf0bcd61620d279b057a4cd","error":"#<Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure: could not push logs to Elasticsearch cluster ({:host=>\"opensearch-node1\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}, {:host=>\"opensearch-node2\", :port=>9200, :scheme=>\"https\", :user=>\"admin\", :password=>\"obfuscated\"}): The client noticed that the server is not Elasticsearch and we do not support this unknown product.>","message":"failed to flush the buffer. retry_time=7 next_retry_seconds=2021-08-19 12:27:46 +0000 chunk=\"5c9e8a284bf0bcd61620d279b057a4cd\" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error=\"could not push logs to Elasticsearch cluster ({:host=>\\\"opensearch-node1\\\", :port=>9200, :scheme=>\\\"https\\\", :user=>\\\"admin\\\", :password=>\\\"obfuscated\\\"}, {:host=>\\\"opensearch-node2\\\", :port=>9200, :scheme=>\\\"https\\\", :user=>\\\"admin\\\", :password=>\\\"obfuscated\\\"}): The client noticed that the server is not Elasticsearch and we do not support this unknown product.\""}

Since OpenSearch in its initial version is Elasticsearch v7.10.2, and OpenSearch states that it is backward compatible with that specific Elasticsearch version, I assumed that it should work out of the box, because I also turned

{
  "persistent": {
    "compatibility": {
      "override_main_response_version": true
    }
  }
}

on, as described in the manual.
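
For readers who want to reproduce this step, here is a minimal sketch of sending that setting to the cluster from Ruby. It assumes the setting is applied through the `PUT /_cluster/settings` endpoint and that the demo credentials from the fluent.conf below are in use. The helper names `build_compat_request` and `apply_compat_setting` are hypothetical and are not part of any plugin.

```ruby
require "json"
require "net/http"
require "openssl"

# Body of the cluster-settings update quoted above.
COMPAT_SETTING = {
  "persistent" => {
    "compatibility" => { "override_main_response_version" => true }
  }
}.freeze

# Builds (but does not send) the PUT request.
# Assumption: admin/admin are the demo credentials used in this issue.
def build_compat_request(user: "admin", password: "admin")
  req = Net::HTTP::Put.new("/_cluster/settings", "Content-Type" => "application/json")
  req.basic_auth(user, password)
  req.body = JSON.generate(COMPAT_SETTING)
  req
end

# Sends the request to one node. Certificate verification is disabled only
# because the demo cluster uses self-signed certificates.
def apply_compat_setting(host, port = 9200)
  Net::HTTP.start(host, port, use_ssl: true, verify_mode: OpenSSL::SSL::VERIFY_NONE) do |http|
    http.request(build_compat_request)
  end
end
```

Calling `apply_compat_setting("opensearch-node1")` from inside the compose network would send the request. As the error above shows, enabling this setting did not by itself satisfy the client in this report.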

Steps to replicate

docker-compose.yml

version: '3'
services:
  opensearch-node1:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_master_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data
    ports:
      - 9200:9200
      - 9600:9600 # required for Performance Analyzer
    networks:
      - opensearch-net
  opensearch-node2:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_master_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data2:/usr/share/opensearch/data
    networks:
      - opensearch-net
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:latest
    container_name: opensearch-dashboards
    ports:
      - 5601:5601
    expose:
      - "5601"
    depends_on:
      - opensearch-node1
      - opensearch-node2
    environment:
      OPENSEARCH_HOSTS: '["https://opensearch-node1:9200","https://opensearch-node2:9200"]'
    networks:
      - opensearch-net

  fluentd-logging-collector:
    image: docker.io/fluent/fluentd-elastic:v1.13-debian-1
    container_name: fluentd
    build: 
      context: ./fluentd
      dockerfile: Dockerfile
    ports:
      - 24224:24224
      - 24224:24224/udp
    depends_on:
      - opensearch-node1
      - opensearch-node2
    volumes:
      - ./fluentd/fluent.conf:/fluentd/etc/fluent.conf
    networks:
      - opensearch-net

volumes:
  opensearch-data1:
  opensearch-data2:

networks:
  opensearch-net:
    driver: bridge

Dockerfile

FROM fluent/fluentd:v1.13-debian-1

# Use root account to use apt
USER root

# below RUN includes plugin as examples elasticsearch is not required
# you may customize including plugins as you wish
RUN buildDeps="sudo make gcc g++ libc-dev" \
 && apt-get update \
 && apt-get install -y --no-install-recommends $buildDeps \
 && sudo gem install fluent-plugin-elasticsearch \
 && sudo gem sources --clear-all \
 && SUDO_FORCE_REMOVE=yes \
    apt-get purge -y --auto-remove \
                  -o APT::AutoRemove::RecommendsImportant=false \
                  $buildDeps \
 && rm -rf /var/lib/apt/lists/* \
 && rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem

USER fluent

fluent.conf

<source>
  @type forward
  @id tcp_forwarder
  @label @es.infrastructure
</source>

<filter **>
  @type stdout
</filter>

<label @es.infrastructure>
  <match **>
    @type elasticsearch
    hosts opensearch-node1,opensearch-node2
    port 9200
    scheme https
    user admin
    password admin
    ssl_verify false
    verify_es_version_at_startup false
    default_elasticsearch_version 7
    logstash_format true
  </match>
</label>

Expected Behavior or What you need to ask

At least accept the initial version of Opensearch v1.0.0 and forward data to it.

Using Fluentd and ES plugin versions

cosmo0920 commented 3 years ago

From Opensearch project page:

Logstash OSS with OpenSearch Output Plugin

This package includes open source Logstash bundled with the OpenSearch output plugin (v1.0.0). The output plugin is compatible with OpenSearch and Open Source versions of Elasticsearch (7.10.2 or lower). The output plugin is also available as a Ruby Gem.

ref: https://opensearch.org/versions/opensearch-1-0-0.html#data-ingest

So, you should use the open source version of elasticsearch-ruby, 7.10.2 or lower, which can handle event ingestion into OpenSearch v1.0.

Venthe commented 3 years ago

I don't believe that this should be at this point (or ever) handled by this plugin.

The Elasticsearch dependency is explicitly making things difficult, so the correct solution IMO would be either to fork this plugin into a fluent-plugin-opensearch, which would require using the forked and modified (and not yet released) opensearch-ruby as described here https://aws.amazon.com/blogs/opensource/keeping-clients-of-opensearch-and-elasticsearch-compatible-with-open-source/ or for the server to spoof its identity.

Hello useragent all over again.

For now I've reported a bug with repro on the OpenSearch side, as linked above.

ref: https://github.com/opensearch-project/OpenSearch/issues/1166

cosmo0920 commented 3 years ago

I posted maintainer opinion on OpenSearch repo: https://github.com/opensearch-project/OpenSearch/issues/1166#issuecomment-933079400

Protopopys commented 3 years ago

@casabre It works with

 && gem install elasticsearch-api -v 7.13.3 \
 && gem install elasticsearch-transport -v 7.13.3 \
 && gem install elasticsearch -v 7.13.3 \
 && gem install fluent-plugin-elasticsearch -v 5.1.0
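
The client gems in this workaround are all pinned to 7.13.3. A minimal sketch of why that pin may matter, assuming (the thread does not state this) that the "unknown product" check first shipped in elasticsearch-ruby 7.14.0. `PRODUCT_CHECK_SINCE` and `product_check?` are hypothetical names used only for illustration.

```ruby
require "rubygems"

# Assumption, not confirmed in this thread: the server product check was
# introduced in elasticsearch-ruby 7.14.0.
PRODUCT_CHECK_SINCE = Gem::Version.new("7.14.0")

# Returns true when a client of the given version would be expected to
# reject a server that does not identify itself as Elasticsearch.
def product_check?(client_version)
  Gem::Version.new(client_version) >= PRODUCT_CHECK_SINCE
end

# Versions mentioned in this thread.
%w[7.10.2 7.13.3].each do |version|
  puts "elasticsearch-ruby #{version}: product check expected? #{product_check?(version)}"
end
```

Under that assumption, both 7.10.2 and 7.13.3 predate the check, which is consistent with the reports here that pinning to 7.13.3 works.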

casabre commented 3 years ago

Sorry for going into stealth mode but I had to focus on another topic at work. @cosmo0920 @Venthe @Protopopys Thanks for your effort on that side!

@Protopopys I can also confirm that it works with your proposed configuration! For the time being, this is quite a good workaround, but in the end, as @Venthe proposed, a dedicated plugin would be more reasonable.

Thanks a lot!

sdwerwed commented 3 years ago

A workaround: I have also tried the bitnami/fluentd helm chart with the image bitnami/fluentd:1.13.3-debian-10-r30 and it works; bitnami/fluentd:1.14.1-debian-10-r0 does not work. It would make sense to have a proper plugin for OpenSearch, as it is becoming popular as an open-source alternative to Elasticsearch. Is there any plan for that?

cosmo0920 commented 3 years ago

We're planning to support OpenSearch in a separate plugin that inherits this plugin's functionality, much as fluent-plugin-aws-elasticsearch-service does. To support OpenSearch, we'd like to wait for and use either a forked or a brand new OpenSearch Ruby client.

stevehipwell commented 3 years ago

@cosmo0920 based on various OpenSearch comments I suspect that there is a private opensearch-ruby repo in their GitHub organisation which is a work in progress. I think it's possible to get access to one of these private repos.

cosmo0920 commented 3 years ago

Calyptia is waiting for a response from the OpenSearch project and for a contract to work on this. I don't have access permission to the OpenSearch project. Stay tuned.

jeffb4 commented 2 years ago

No release yet but the new gem's repo is at https://github.com/opensearch-project/opensearch-ruby

ryn9 commented 2 years ago

@cosmo0920

opensearch-ruby was released a couple days ago: https://github.com/opensearch-project/opensearch-ruby/releases/tag/v1.0.0

illidan80 commented 2 years ago

Any ETA for when Fluentd for Opensearch will be released?

cosmo0920 commented 2 years ago

Today, we've released the first public release of fluent-plugin-opensearch v1.0.0: https://rubygems.org/gems/fluent-plugin-opensearch/versions/1.0.0

fluent-plugin-opensearch also supports Fluentd event ingestion into AWS OpenSearch Service: https://github.com/fluent/fluent-plugin-opensearch#configuration---aws-opensearch-service

cosmo0920 commented 2 years ago

Done!

stevehipwell commented 2 years ago

fluent-plugin-opensearch also supports Fluentd event ingestion into AWS OpenSearch Service: https://github.com/fluent/fluent-plugin-opensearch#configuration---aws-opensearch-service

@cosmo0920 what does this mean for fluent-plugin-aws-elasticsearch-service?

cosmo0920 commented 2 years ago

fluent-plugin-opensearch also supports Fluentd event ingestion into AWS OpenSearch Service: https://github.com/fluent/fluent-plugin-opensearch#configuration---aws-opensearch-service

@cosmo0920 what does this mean for fluent-plugin-aws-elasticsearch-service?

This means that fluent-plugin-aws-elasticsearch-service should be treated as the predecessor plugin for AWS OpenSearch Service. Recently, @atomita's responses have been brutally slow, so I decided to import his code as the AWS-related part of fluent-plugin-opensearch.

stevehipwell commented 2 years ago

This means that fluent-plugin-aws-elasticsearch-service should be treated as the predecessor plugin for AWS OpenSearch Service. Recently, @atomita's responses have been brutally slow, so I decided to import his code as the AWS-related part of fluent-plugin-opensearch.

@cosmo0920 I opened atomita/fluent-plugin-aws-elasticsearch-service#78 to track this.

cosmo0920 commented 2 years ago

@cosmo0920 I opened atomita/fluent-plugin-aws-elasticsearch-service#78 to track this.

No, it doesn't need to inherit from fluent-plugin-opensearch. The OpenSearch plugin already includes AWS SigV4 signer code, and fluent-plugin-opensearch itself can handle AWS SigV4 requests via Faraday middleware without "his extension".

We can only say that atomita's plugin should be marked as deprecated, with a note added to use fluent-plugin-opensearch instead.

stevehipwell commented 2 years ago

@cosmo0920 I quoted the wrong reference. Doesn't fluent-plugin-aws-elasticsearch-service need to update to use opensearch-ruby?

RE supporting old AWS ElasticSearch versions, this only works if the ElasticSearch dependencies are locked to versions which haven't been broken. For this a patch release could be made first before updating it to use OpenSearch. Also I thought OpenSearch was backwards compatible?

cosmo0920 commented 2 years ago

Also I thought OpenSearch was backwards compatible?

No, it doesn't have backward compatibility. From the Ruby clients' point of view, we should treat them as completely different products.

cosmo0920 commented 2 years ago

@cosmo0920 I quoted the wrong reference. Doesn't fluent-plugin-aws-elasticsearch-service need to update to use opensearch-ruby?

I have spent a lot of time pinging @atomita, but he rarely responds to my pings.

He was unwilling to add me as a RubyGems collaborator to release the plugin. I cannot imagine that his plugin will be well maintained.

stevehipwell commented 2 years ago

@cosmo0920 could you clarify which versions of ElasticSearch, OpenSearch and AWS managed ElasticSearch/OpenSearch could be written to by a Fluentd instance with fluent-plugin-elasticsearch and fluent-plugin-opensearch installed?

I suspect any version of AWS managed ElasticSearch/OpenSearch not supported by fluent-plugin-opensearch won't work?

cosmo0920 commented 2 years ago

@cosmo0920 could you clarify which versions of ElasticSearch, OpenSearch and AWS managed ElasticSearch/OpenSearch could be written to by a Fluentd instance with fluent-plugin-elasticsearch and fluent-plugin-opensearch installed?

fluent-plugin-opensearch

fluent-plugin-aws-elasticsearch-service

But currently, the plugin author does not respond and is not able to upgrade to the latest fluent-plugin-elasticsearch as a dependency. That plugin is already abandoned.

fluent-plugin-elasticsearch

stevehipwell commented 2 years ago

@cosmo0920 I don't think I got the question quite right; I'm specifically asking about being able to support ElasticSearch and all AWS OpenSearch/ElasticSearch versions from a single Fluentd instance. I don't think this is possible with the current version of fluent-plugin-aws-elasticsearch-service as it uses the same ElasticSearch components as fluent-plugin-elasticsearch (which only work if they're locked back anyway). Correct me if I'm wrong but I'm under the impression that Ruby can only support a single version of a gem?

cosmo0920 commented 2 years ago

Correct me if I'm wrong but I'm under the impression that Ruby can only support a single version of a gem?

If the gem does not lock the version of its dependency, users can specify an arbitrary version of the dependent gem. For example, fluent-plugin-elasticsearch does not lock the version of the elasticsearch-ruby gem, so users can use an arbitrary elasticsearch-ruby version. I'm not sure why you keep asking about this. Ruby resolves almost everything at runtime; you should set up an actual Ruby environment and test it rather than repeatedly asking questions.

stevehipwell commented 2 years ago

If the gem does not lock the version of its dependency, users can specify an arbitrary version of the dependent gem. For example, fluent-plugin-elasticsearch does not lock the version of the elasticsearch-ruby gem, so users can use an arbitrary elasticsearch-ruby version. I'm not sure why you keep asking about this. Ruby resolves almost everything at runtime; you should set up an actual Ruby environment and test it rather than repeatedly asking questions.

@cosmo0920 I'm asking for you to confirm that my assumption is right in how Ruby gems work, I know if there isn't a constraint the version depends on the user's choice but I don't think you can define the gem twice to get 2 different versions? If I'm correct it means that there is an issue with the current available plugins not being able to be used on a single Fluentd instance to send logs to ElasticSearch, AWS ElasticSearch and AWS OpenSearch. This is a pretty major issue that would need to be addressed by the fluent-plugin-aws-elasticsearch-service, but you closed my issue there and redirected me back here.

cosmo0920 commented 2 years ago

@cosmo0920 I'm asking for you to confirm that my assumption is right in how Ruby gems work, I know if there isn't a constraint the version depends on the user's choice but I don't think you can define the gem twice to get 2 different versions?

No, we can't. Ordinary Rubyists use Bundler to manage their gem versions, and Fluentd can also work with the Bundler mechanism. So it is not reasonable to pin dependent gem versions in Fluentd plugins.

cosmo0920 commented 2 years ago

If I'm correct it means that there is an issue with the current available plugins not being able to be used on a single Fluentd instance to send logs to ElasticSearch, AWS ElasticSearch and AWS OpenSearch.

Why do you want to use a monolithic mechanism? We won't accept this mechanism.

In the Ruby world, OpenSearch and Elasticsearch are completely different distributions, like Debian GNU/Linux and Red Hat Enterprise Linux (RHEL). A RHEL package is not usable on Debian GNU/Linux and vice versa. In principle, we cannot support both full-text search engines in one plugin. And we cannot accept a gem that mixes the OpenSearch and Elasticsearch plugins. It is not reasonable for users or for packaging.

If you install fluent-plugin-elasticsearch and fluent-plugin-opensearch in your Fluentd environment, you can send Fluentd events into OpenSearch on premise, AWS OpenSearch, and Elasticsearch on premise with one Fluentd instance.