manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0
8.78k stars 487 forks source link

About the failure of using logstash to synchronize mysql data to manticore #2279

Open zhangsanhuo opened 2 months ago

zhangsanhuo commented 2 months ago

Confirmation Checklist:

Your question:

Hello Manticore Team,

I hope this message finds you well.

I am currently working on a project where I need to sync data from MySQL to Manticore using Logstash. Despite following the documentation, I have encountered several difficulties and have been unable to achieve the desired outcome.

`computer system:  raspberry 4B`
`uname -a: Linux raspberrypi 5.15.0-1046-raspi #49-Ubuntu SMP PREEMPT Thu Jan 18 12:45:41 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux`
`lsb_release -a`: 
`Distributor ID:    Ubuntu
Description:    Ubuntu 22.04.4 LTS
Release:    22.04
Codename:   jammy`
`logstash version: logstash 8.13.4`
`manticore version: Server version: 6.2.12 dc5144d35@230822 (columnar 2.2.4 5aec342@230822) (secondary 2.2.4 5aec342@230822) git branch manticore-6.2.12...origin/manticore-6.2.12`
`mysql version: Server version: 10.6.16-MariaDB-0ubuntu0.22.04.1 Ubuntu 22.04`
  1. Logstash Configuration (logstash.conf)
    input {
      jdbc {
          type => "test_1"
          jdbc_driver_library => "/home/mysql-connector-j-8.0.32/mysql-connector-j-8.0.32.jar"
          jdbc_driver_class => "com.mysql.jdbc.Driver"
          jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
          jdbc_user => "root"
          jdbc_password => "root"
          jdbc_paging_enabled => true
          jdbc_page_size => "10"
          tracking_column => "unix_ts_in_secs"
          record_last_run => "true"
          last_run_metadata_path => "/home/logstash_jdbc_last_run"
          clean_run => "false"
          use_column_value => true
          tracking_column_type => "numeric"
          schedule => "*/5 * * * * *"
          statement => "SELECT *, UNIX_TIMESTAMP(up_date) AS unix_ts_in_secs FROM t1 WHERE (UNIX_TIMESTAMP(up_date) > :sql_last_value AND up_date < NOW()) ORDER BY up_date ASC"
      }
    }
    output {
      stdout { codec =>  "rubydebug"}
      if [type] == "test_1" {
        elasticsearch {
          index => "test_log"
          hosts => ["http://localhost:9308"]
            ilm_enabled => false
        manage_template => false
        }
      }
    }
  2. Manticore Configuration (manticore.conf):
    searchd
    {
            listen = 127.0.0.1:9312
            listen = 127.0.0.1:9306:mysql
            listen = 127.0.0.1:9308:http
            log = /var/log/manticore/searchd.log
            query_log = /var/log/manticore/query.log
            pid_file = /var/run/manticore/searchd.pid
            data_dir = /var/lib/manticore
    }
  3. mysql table:
    CREATE TABLE `t1` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(255) DEFAULT NULL,
      `up_date` datetime DEFAULT current_timestamp() ON UPDATE current_timestamp(),
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=28 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

Insert data: INSERT INTO test.t1 (id, name, up_date) VALUES (1, 'test zxf', '2024-06-05 12:47:50'); Issues Encountered

I have read the following documents multiple times:

Despite this, I encountered errors during the synchronization process.

When i start logstash, use "journalctl -xeu logstash" command, found this error below and no data flow to manticore:

[ERROR][logstash.outputs.elasticsearch][main][dbf1a120cbbadd189cd6a7257ecbc58db6fb85809d9a69ac4ca0e6b9ebea10e4] An unknown error occurred sending a bulk request to Elasticsearch (will retry indefinitely) {:message=>"no implicit conversion of String into Integer", :exception=>TypeError, :backtrace=>["org/jruby/RubyArray.java:1662:in `[]'", "/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-output-elasticsearch-11.22.6-java/lib/logstash/outputs/elasticsearch/http_client.rb:177:in `block in join_bulk_responses'", "org/jruby/RubyArray.java:4966:in `any?'", "/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-output-elasticsearch-11.22.6-java/lib/logstash/outputs/elasticsearch/http_client.rb:177:in `join_bulk_responses'", "/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-output-elasticsearch-11.22.6-java/lib/logstash/outputs/elasticsearch/http_client.rb:165:in `bulk'", "/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-output-elasticsearch-11.22.6-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:380:in `safe_bulk'", "/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-output-elasticsearch-11.22.6-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:252:in `submit'", "/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-output-elasticsearch-11.22.6-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:197:in `retrying_submit'", "/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/logstash-output-elasticsearch-11.22.6-java/lib/logstash/outputs/elasticsearch.rb:397:in `multi_receive'", "org/logstash/config/ir/compiler/AbstractOutputDelegatorExt.java:121:in `multi_receive'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:304:in `block in start_workers'"]}`

Could you please provide guidance or suggestions on how to successfully sync MySQL data to Manticore using Logstash? Specifically, I am interested in understanding:

Thank you very much for your assistance.

Best regards,

Kevin

tomatolog commented 2 months ago

could you try to use recent release? As string to timestamp conversion was fixed recently.

zhangsanhuo commented 2 months ago

could you try to use recent release? As string to timestamp conversion was fixed recently.

Thank you for your reply. I deleted old manticore, and reinstall manticore.

Now manticore version is: Server version: 6.3.0 1811a9efb@24052209 (columnar 2.3.0 88a01c3@24052206) (secondary 2.3.0 88a01c3@24052206) (knn 2.3.0 88a01c3@24052206) git branch HEAD (no branch)

But when i use logstash sycn mysql data to manticore, there is another errors below: Jun 06 21:02:51 raspberrypi logstash[194455]: [2024-06-06T21:02:51,145][ERROR][logstash.outputs.elasticsearch][main][34fdb55ea05929bf8e49916707c868bf53cef335ed6cce6f91ce5579a973a507] Encountered a retryable error (will retry with exponential backoff) {:code=>400, :url=>"http://localhost:9308/_bulk?filter_path=errors,items.*.error,items.*.status", :content_length=>188, :body=>"{\"error\":\"Something went wrong\"}"}

There is no tables when i use "show tables;" in manticore command.

Nick-S-2018 commented 2 months ago

It appears your issue has to do with the Logstash version you use. Unfortunately, it hasn't been reflected in our documentation that Manticore doesn't support the versions >7.15 as of now. We are going to add the support for the newer versions in the near future. Till that time, the solution for you would be to downgrade your Logstash to a version currently supported by Manticore (7.6-7.15), if that doesn't violate other parts of your work.

zhangsanhuo commented 2 months ago

Thanks for your reply. We tried two versions of logstash, 7.15.0 and 7.10.0, but unfortunately, we still could not successfully synchronize MySQL data to the RT table of manticore.

This is the error log of version 7.15.0:

2024-06-14T23:39:52,559][ERROR][logstash.outputs.elasticsearch][main] Could not connect to a compatible version of Elasticsearch {:url=>"http://localhost:9308/"}
[2024-06-14T23:39:52,701][ERROR][logstash.outputs.elasticsearch][main] Unable to retrieve Elasticsearch cluster uuid {:message=>"No Available connections", :exception=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError, :backtrace=>["/home/zxf/logstash-7.15.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-11.0.2-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:381:in `with_connection'", "/home/zxf/logstash-7.15.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-11.0.2-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:292:in `perform_request'", "/home/zxf/logstash-7.15.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-11.0.2-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:300:in `block in get'", "/home/zxf/logstash-7.15.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-11.0.2-java/lib/logstash/outputs/elasticsearch/http_client.rb:199:in `get'", "/home/zxf/logstash-7.15.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-11.0.2-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:152:in `discover_cluster_uuid'", "/home/zxf/logstash-7.15.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-11.0.2-java/lib/logstash/outputs/elasticsearch.rb:308:in `finish_register'", "/home/zxf/logstash-7.15.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-11.0.2-java/lib/logstash/outputs/elasticsearch.rb:279:in `block in register'", "/home/zxf/logstash-7.15.0/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-11.0.2-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:145:in `block in after_successful_connection'"]}
[2024-06-14T23:40:07,653][ERROR][logstash.outputs.elasticsearch][main][34fdb55ea05929bf8e49916707c868bf53cef335ed6cce6f91ce5579a973a507] Attempted to send a bulk request but there are no living connections in the pool (perhaps Elasticsearch is unreachable or down?) {:message=>"No Available connections", :exception=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError, :will_retry_in_seconds=>2}

This is the error log of version 7.10.0:

[2024-06-14T23:52:08,020][ERROR][logstash.outputs.elasticsearch][main][34fdb55ea05929bf8e49916707c868bf53cef335ed6cce6f91ce5579a973a507] Encountered a retryable error. Will Retry with exponential backoff  {:code=>409, :url=>"http://localhost:9308/_bulk"}
[2024-06-14T23:52:10,129][WARN ][logstash.outputs.elasticsearch][main][34fdb55ea05929bf8e49916707c868bf53cef335ed6cce6f91ce5579a973a507] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://localhost:9308/][Manticore::ClientProtocolException] localhost:9308 failed to respond {:url=>http://localhost:9308/, :error_message=>"Elasticsearch Unreachable: [http://localhost:9308/][Manticore::ClientProtocolException] localhost:9308 failed to respond", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
[2024-06-14T23:52:10,146][ERROR][logstash.outputs.elasticsearch][main][34fdb55ea05929bf8e49916707c868bf53cef335ed6cce6f91ce5579a973a507] Attempted to send a bulk request to elasticsearch' but Elasticsearch appears to be unreachable or down! {:error_message=>"Elasticsearch Unreachable: [http://localhost:9308/][Manticore::ClientProtocolException] localhost:9308 failed to respond", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :will_retry_in_seconds=>4}
sanikolaev commented 2 months ago

logstash 7.10.0 works with Manticore - https://play.manticoresearch.com/logstash/

sanikolaev commented 2 months ago

@zhangsanhuo try to understand how your case is different from https://play.manticoresearch.com/logstash/

zhangsanhuo commented 2 months ago

I found the reason. It was because of the Raspberry Pi system. When I used the "Linux new5 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux" system, there was no problem with logstash synchronization.

zhangsanhuo commented 2 months ago

Hello,

I found that when the mysql field content is NULL instead of Empty String, importing data to manticore through logstash will report an error: Attempted to send a bulk request but there are no living connections in the pool (perhaps Elasticsearch is unreachable or down?) {:message=>"No Available connections", :exception=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError, :will_retry_in_seconds=>2}

and it will cause the data to be unable to synchronize. I would like to ask if this problem will be fixed? Because there are a lot of Null data in our original database.

Thanks in advance for your response.

sanikolaev commented 2 months ago

Thanks @zhangsanhuo

We'll look into it and fix it if we can reproduce.

zhangsanhuo commented 1 month ago

Hello, I would like to ask, does manticore support search chain operation? Just like this: my_search[left_index:right_index].query(q1).query(q2).query(q3). Or can only BoolFilter be used for combined search?

tomatolog commented 1 month ago

you could use multi query only via SphinxQL interface as described Multi-queries. There is no support of multi query for HTTP REST interface.

You could create another ticket at Github with the feature request.

Or can only BoolFilter be used for combined search?

you could use bool JSON search property to combine multiple search conditions but it produces only single result set. Multi query allows to

Multi-queries, or query batches, allow you to send multiple search queries to Manticore in a single network request.

sanikolaev commented 1 month ago

does manticore support search chain operation? Just like this

The PHP client supports it somewhat - https://github.com/manticoresoftware/manticoresearch-php#perform-a-search

sanikolaev commented 1 month ago

@Nick-S-2018 pls try to reproduce it with NULL values.

Nick-S-2018 commented 1 month ago

I've reproduced the error and created another issue based on this one: https://github.com/manticoresoftware/manticoresearch/issues/2363

Nick-S-2018 commented 1 month ago

Blocked by #2363

zhangsanhuo commented 1 month ago

I've reproduced the error and created another issue based on this one: #2363

Could you please add support for manticore fields with bigint attributes and timestamp with Null value?

sanikolaev commented 1 week ago

Blocked by https://github.com/manticoresoftware/manticoresearch/issues/2363

Unblocked.