manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0

When MySQL data is null instead of empty string, data cannot be synchronized to Manticore's RT table via logstash #2380

Open zhangsanhuo opened 2 months ago

zhangsanhuo commented 2 months ago

Bug Description:

manticore version:

Manticore 6.3.2 c296dc7c8@24062606 (columnar 2.3.0 88a01c3@24052206) (secondary 2.3.0 88a01c3@24052206) (knn 2.3.0 88a01c3@24052206)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2024, Manticore Software LTD (https://manticoresearch.com)

MariaDB: Server version: 10.6.12-MariaDB-1:10.6.12+maria~deb11 mariadb.org binary distribution

Logstash: logstash-7.14.0

My test setup is as follows. First, create an RT table:

DROP TABLE t1;
CREATE TABLE t1 (
id bigint,
up_date text,
name text,
@version text,
@timestamp timestamp,
unix_ts_in_secs integer,
number bigint,
pub_date timestamp
) charset_table='non_cjk,cjk' morphology='icu_chinese,stem_en';

Logstash pipeline configuration used for the synchronization:

input {
  jdbc {
    jdbc_driver_library => "/root/mysql-connector-j-8.0.32/mysql-connector-j-8.0.32.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
    jdbc_user => "root"
    jdbc_password => "root"
    jdbc_paging_enabled => true
    jdbc_page_size => "10"
    tracking_column => "unix_ts_in_secs"
    #record_last_run => "true"
    record_last_run => "false"
    last_run_metadata_path => "/root/logstash_jdbc_last_run"
    clean_run => "false"
    use_column_value => true
    tracking_column_type => "numeric"
    schedule => "*/5 * * * * *"
    statement => "SELECT *, UNIX_TIMESTAMP(up_date) AS unix_ts_in_secs FROM t1 WHERE (UNIX_TIMESTAMP(up_date) > :sql_last_value AND up_date < NOW()) ORDER BY up_date ASC"
  }
}

output {
  stdout { codec => "rubydebug" }
  elasticsearch {
    index => "t1"
    hosts => ["http://localhost:9308"]
  }
}

MySQL table structure:

CREATE TABLE `t1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT '',
  `number` varchar(11) DEFAULT NULL,
  `pub_date` datetime DEFAULT current_timestamp(),
  `up_date` datetime DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=14 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

And the MySQL data:

INSERT INTO `test`.`t1` (`id`, `name`, `number`, `pub_date`, `up_date`) VALUES (1, 'kevin1', '1000001', NULL, '2024-07-03 16:47:48');
INSERT INTO `test`.`t1` (`id`, `name`, `number`, `pub_date`, `up_date`) VALUES (2, 'kevin2', NULL, '2024-07-03 16:51:11', '2024-07-03 16:53:18');

I created two example rows: one where pub_date (timestamp in the RT table) is NULL, and one where number (bigint in the RT table) is NULL.

When Logstash tries to synchronize this data to Manticore, it reports an error for each row:

[2024-07-03T08:47:50,171][ERROR][logstash.outputs.elasticsearch][main][30a15ecf5e6cbe42f11b334f8da41fd6cfa5267d46f7b11973efecaf11a13318] Encountered a retryable error (will retry with exponential backoff) {:code=>409, :url=>"http://localhost:9308/_bulk", :content_length=>233} 
[2024-07-03T08:53:34,394][ERROR][logstash.outputs.elasticsearch][main][30a15ecf5e6cbe42f11b334f8da41fd6cfa5267d46f7b11973efecaf11a13318] Encountered a retryable error (will retry with exponential backoff) {:code=>409, :url=>"http://localhost:9308/_bulk", :content_length=>250} 

As a result, the data is not synchronized.
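
To make the failing input easier to picture, here is a minimal Python sketch of the kind of NDJSON body sent to the `/_bulk` endpoint for the two rows above. It is only an illustration under assumptions: the helper name `bulk_payload` is hypothetical, the timestamp strings are illustrative, and the extra fields Logstash adds (`@version`, `@timestamp`, `unix_ts_in_secs`) are left out. The `drop_nulls` option shows one possible way to strip NULL fields before sending. Whether Manticore 6.3.2 accepts the stripped payload has not been confirmed in this thread.

```python
import json


def bulk_payload(index, docs, drop_nulls=False):
    """Build an Elasticsearch-style _bulk NDJSON body (hypothetical helper)."""
    lines = []
    for doc in docs:
        if drop_nulls:
            # Possible workaround (an assumption, not verified against
            # Manticore): omit fields whose value is None (SQL NULL).
            doc = {k: v for k, v in doc.items() if v is not None}
        # Each document is preceded by an action line naming the target index.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline.
    return "\n".join(lines) + "\n"


# The two MySQL rows from this report; None stands for SQL NULL.
rows = [
    {"id": 1, "name": "kevin1", "number": "1000001",
     "pub_date": None, "up_date": "2024-07-03T16:47:48"},
    {"id": 2, "name": "kevin2", "number": None,
     "pub_date": "2024-07-03T16:51:11", "up_date": "2024-07-03T16:53:18"},
]

as_sent = bulk_payload("t1", rows)                     # null fields included
stripped = bulk_payload("t1", rows, drop_nulls=True)   # null fields omitted

print(as_sent)
print(stripped)
```

The first payload carries `"pub_date": null` and `"number": null`, matching the two NULL cases described above. The second omits those keys entirely.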

Manticore Search Version:

Manticore 6.3.2 c296dc7c8@24062606 (columnar 2.3.0 88a01c3@24052206) (secondary 2.3.0 88a01c3@24052206) (knn 2.3.0 88a01c3@24052206)

Operating System Version:

Debian 12

Have you tried the latest development version?

No

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

- [ ] Implementation completed
- [ ] Tests developed
- [ ] Documentation updated
- [ ] Documentation reviewed
- [ ] Changelog updated

Nick-S-2018 commented 1 month ago

Can you try to repeat your test with Manticore's latest dev version? The issue should be fixed now.