apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Bug] Error in synchronizing es data #4240

Closed: yang227 closed this issue 1 year ago

yang227 commented 1 year ago

What happened

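Synchronizing Elasticsearch data to Doris fails. The job aborts while preparing the Elasticsearch source, with a java.lang.NullPointerException thrown from EsRestClient.getFieldTypeMappingFromProperties (full log below).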

SeaTunnel Version

2.3.0

SeaTunnel Config

env {
  execution.parallelism = 1
  job.mode = "BATCH"
}

source {
  Elasticsearch {
    hosts = ["****:9200"]
    index = "skywalking_segment-20230222"
    source = ["trace_id", "service_name", "endpoint_name", "latency", "end_time", "endpoint_id", "service_instance_id", "version", "start_time", "data_binary", "service_id", "statement", "time_bucket", "is_error", "segment_id"]
  }
}

transform {

}

sink {
  Doris {
    nodeUrls = ["****:8030"]
    username = ***
    password = "***"
    database = "test"
    table = "es_skywalking_segment_test"
    batch_max_rows = 100
    sink.properties.format = "JSON"
    sink.properties.strip_outer_array = true
  }
}

Running Command

/data/devops/seatunnel/bin/seatunnel.sh --config $1 -e local

Error Exception

2023-02-28 17:56:58,536 INFO  org.apache.seatunnel.plugin.discovery.AbstractPluginDiscovery - Load SeaTunnelSource Plugin from /data/devops/seatunnel/connectors/seatunnel
2023-02-28 17:56:58,543 INFO  org.apache.seatunnel.plugin.discovery.AbstractPluginDiscovery - Discovery plugin jar: Elasticsearch at: file:/data/devops/seatunnel/connectors/seatunnel/connector-elasticsearch-2.3.0.jar
2023-02-28 17:56:58,546 INFO  org.apache.seatunnel.plugin.discovery.AbstractPluginDiscovery - Load plugin: PluginIdentifier{engineType='seatunnel', pluginType='source', pluginName='Elasticsearch'} from path: file:/data/devops/seatunnel/connectors/seatunnel/connector-elasticsearch-2.3.0.jar use classloader: sun.misc.Launcher$AppClassLoader
2023-02-28 17:56:58,793 INFO  org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient - GET skywalking_segment-20230222/_mappings respnse={"skywalking_segment-20230222":{"mappings":{"properties":{"data_binary":{"type":"binary"},"end_time":{"type":"long"},"endpoint_id":{"type":"keyword"},"endpoint_name":{"type":"keyword","copy_to":["endpoint_name_match"]},"endpoint_name_match":{"type":"text","analyzer":"oap_analyzer"},"is_error":{"type":"integer"},"latency":{"type":"integer"},"segment_id":{"type":"keyword"},"service_id":{"type":"keyword"},"service_instance_id":{"type":"keyword"},"service_name":{"type":"keyword","copy_to":["service_name_match"]},"service_name_match":{"type":"text","analyzer":"oap_analyzer"},"start_time":{"type":"long"},"statement":{"type":"keyword"},"time_bucket":{"type":"long"},"trace_id":{"type":"keyword"},"version":{"type":"integer","index":false}}}}}
2023-02-28 17:56:58,804 INFO  com.hazelcast.core.LifecycleService - hz.client_1 [seatunnel_default_cluster-893969] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTTING_DOWN
2023-02-28 17:56:58,805 INFO  com.hazelcast.internal.server.tcp.TcpServerConnection - [localhost]:5801 [seatunnel_default_cluster-893969] [5.1] Connection[id=1, /127.0.0.1:5801->/127.0.0.1:35002,qualifier=null, endpoint=[127.0.0.1]:35002, remoteUuid=287a76c0-78c7-4a75-9e5e-dbf76f7a20f7, alive=false, connectionType=JVM, planeIndex=-1] closed. Reason: Connection closed by the other side
2023-02-28 17:56:58,806 INFO  com.hazelcast.client.impl.connection.ClientConnectionManager - hz.client_1 [seatunnel_default_cluster-893969] [5.1] Removed connection to endpoint: [localhost]:5801:eda27049-8145-4038-adbe-0f1841b0fa7c, connection: ClientConnection{alive=false, connectionId=1, channel=NioChannel{/127.0.0.1:35002->localhost/127.0.0.1:5801}, remoteAddress=[localhost]:5801, lastReadTime=2023-02-28 17:56:58.496, lastWriteTime=2023-02-28 17:56:58.494, closedTime=2023-02-28 17:56:58.804, connected server version=5.1}
2023-02-28 17:56:58,806 INFO  com.hazelcast.core.LifecycleService - hz.client_1 [seatunnel_default_cluster-893969] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is CLIENT_DISCONNECTED
2023-02-28 17:56:58,807 INFO  com.hazelcast.client.impl.ClientEndpointManager - [localhost]:5801 [seatunnel_default_cluster-893969] [5.1] Destroying ClientEndpoint{connection=Connection[id=1, /127.0.0.1:5801->/127.0.0.1:35002, qualifier=null, endpoint=[127.0.0.1]:35002, remoteUuid=287a76c0-78c7-4a75-9e5e-dbf76f7a20f7, alive=false, connectionType=JVM, planeIndex=-1], clientUuid=287a76c0-78c7-4a75-9e5e-dbf76f7a20f7, clientName=hz.client_1, authenticated=true, clientVersion=5.1, creationTime=1677578218423, latest clientAttributes=lastStatisticsCollectionTime=1677578218447,enterprise=false,clientType=JVM,clientVersion=5.1,clusterConnectionTimestamp=1677578218417,clientAddress=127.0.0.1,clientName=hz.client_1,credentials.principal=null,os.committedVirtualMemorySize=25849192448,os.freePhysicalMemorySize=35098791936,os.freeSwapSpaceSize=0,os.maxFileDescriptorCount=65536,os.openFileDescriptorCount=45,os.processCpuTime=5290000000,os.systemLoadAverage=0.42,os.totalPhysicalMemorySize=66310848512,os.totalSwapSpaceSize=0,runtime.availableProcessors=16,runtime.freeMemory=1078165344,runtime.maxMemory=14736162816,runtime.totalMemory=1114636288,runtime.uptime=1338,runtime.usedMemory=36470944, labels=[]}
2023-02-28 17:56:58,807 INFO  com.hazelcast.core.LifecycleService - hz.client_1 [seatunnel_default_cluster-893969] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is SHUTDOWN
2023-02-28 17:56:58,807 INFO  com.hazelcast.core.LifecycleService - [localhost]:5801 [seatunnel_default_cluster-893969] [5.1] [localhost]:5801 is SHUTTING_DOWN
2023-02-28 17:56:58,810 INFO  com.hazelcast.instance.impl.Node - [localhost]:5801 [seatunnel_default_cluster-893969] [5.1] Shutting down connection manager...
2023-02-28 17:56:58,811 INFO  com.hazelcast.instance.impl.Node - [localhost]:5801 [seatunnel_default_cluster-893969] [5.1] Shutting down node engine...
2023-02-28 17:56:58,813 INFO  org.apache.seatunnel.engine.server.SeaTunnelServer - master node check interrupted
2023-02-28 17:57:01,816 INFO  com.hazelcast.instance.impl.NodeExtension - [localhost]:5801 [seatunnel_default_cluster-893969] [5.1] Destroying node NodeExtension.
2023-02-28 17:57:01,817 INFO  com.hazelcast.instance.impl.Node - [localhost]:5801 [seatunnel_default_cluster-893969] [5.1] Hazelcast Shutdown is completed in 3008 ms.
2023-02-28 17:57:01,817 INFO  com.hazelcast.core.LifecycleService - [localhost]:5801 [seatunnel_default_cluster-893969] [5.1] [localhost]:5801 is SHUTDOWN
2023-02-28 17:57:01,817 ERROR org.apache.seatunnel.core.starter.Seatunnel -

===============================================================================

2023-02-28 17:57:01,817 ERROR org.apache.seatunnel.core.starter.Seatunnel - Fatal Error,

2023-02-28 17:57:01,817 ERROR org.apache.seatunnel.core.starter.Seatunnel - Please submit bug report in https://github.com/apache/incubator-seatunnel/issues

2023-02-28 17:57:01,817 ERROR org.apache.seatunnel.core.starter.Seatunnel - Reason:null

2023-02-28 17:57:01,818 ERROR org.apache.seatunnel.core.starter.Seatunnel - Exception StackTrace:java.lang.NullPointerException
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.getFieldTypeMappingFromProperties(EsRestClient.java:323)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.getFieldTypeMapping(EsRestClient.java:309)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSource.prepare(ElasticsearchSource.java:66)
        at org.apache.seatunnel.engine.core.parse.ConnectorInstanceLoader.loadSourceInstance(ConnectorInstanceLoader.java:60)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.sampleAnalyze(JobConfigParser.java:314)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parse(JobConfigParser.java:125)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getLogicalDag(JobExecutionEnvironment.java:129)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:121)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:91)
        at org.apache.seatunnel.core.starter.Seatunnel.run(Seatunnel.java:39)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:31)

2023-02-28 17:57:01,819 ERROR org.apache.seatunnel.core.starter.Seatunnel -
===============================================================================

Exception in thread "main" java.lang.NullPointerException
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.getFieldTypeMappingFromProperties(EsRestClient.java:323)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.client.EsRestClient.getFieldTypeMapping(EsRestClient.java:309)
        at org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSource.prepare(ElasticsearchSource.java:66)
        at org.apache.seatunnel.engine.core.parse.ConnectorInstanceLoader.loadSourceInstance(ConnectorInstanceLoader.java:60)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.sampleAnalyze(JobConfigParser.java:314)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parse(JobConfigParser.java:125)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getLogicalDag(JobExecutionEnvironment.java:129)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:121)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:91)
        at org.apache.seatunnel.core.starter.Seatunnel.run(Seatunnel.java:39)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:31)
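
The trace above fails while the connector resolves field types from the `_mappings` response. As a diagnostic aid, here is a minimal Python sketch that performs a comparable lookup offline, using the response printed in the log and the `source` list from the config. It is not SeaTunnel's implementation: the function name `resolve_field_types` and its error handling are assumptions made for illustration.

```python
import json

# Copy of the `_mappings` response logged by EsRestClient in the output above.
MAPPINGS_RESPONSE = """
{"skywalking_segment-20230222":{"mappings":{"properties":{
  "data_binary":{"type":"binary"},
  "end_time":{"type":"long"},
  "endpoint_id":{"type":"keyword"},
  "endpoint_name":{"type":"keyword","copy_to":["endpoint_name_match"]},
  "endpoint_name_match":{"type":"text","analyzer":"oap_analyzer"},
  "is_error":{"type":"integer"},
  "latency":{"type":"integer"},
  "segment_id":{"type":"keyword"},
  "service_id":{"type":"keyword"},
  "service_instance_id":{"type":"keyword"},
  "service_name":{"type":"keyword","copy_to":["service_name_match"]},
  "service_name_match":{"type":"text","analyzer":"oap_analyzer"},
  "start_time":{"type":"long"},
  "statement":{"type":"keyword"},
  "time_bucket":{"type":"long"},
  "trace_id":{"type":"keyword"},
  "version":{"type":"integer","index":false}
}}}}
"""

# The `source` field list from the Elasticsearch block of the job config.
SOURCE_FIELDS = [
    "trace_id", "service_name", "endpoint_name", "latency", "end_time",
    "endpoint_id", "service_instance_id", "version", "start_time",
    "data_binary", "service_id", "statement", "time_bucket", "is_error",
    "segment_id",
]


def resolve_field_types(response_text, index, fields):
    """Hypothetical helper: return (types, missing) for the requested fields.

    `types` maps each field found in the mapping to its Elasticsearch type;
    `missing` lists requested fields that have no typed entry in the mapping.
    """
    root = json.loads(response_text)
    properties = root.get(index, {}).get("mappings", {}).get("properties")
    if properties is None:
        # Fail with a readable message instead of a bare null dereference.
        raise ValueError(f"no 'properties' object found under {index!r} mappings")
    types, missing = {}, []
    for name in fields:
        field_type = properties.get(name, {}).get("type")
        if field_type is None:
            missing.append(name)
        else:
            types[name] = field_type
    return types, missing


if __name__ == "__main__":
    types, missing = resolve_field_types(
        MAPPINGS_RESPONSE, "skywalking_segment-20230222", SOURCE_FIELDS
    )
    for name, field_type in types.items():
        print(f"{name}: {field_type}")
    print("missing:", missing)
```

For the response logged above, every field listed in `source` resolves to a type, so a misspelled field name in the config looks unlikely to be the trigger. The actual cause inside `getFieldTypeMappingFromProperties` is not established by this check.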

Flink or Spark Version

No response

Java or Scala Version

No response

Screenshots

No response

yang227 commented 1 year ago

Cluster version information:

{
  "name" : "mgr_node09",
  "version" : {
    "number" : "7.8.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Index settings:

{ "skywalking_segment-20230228" : { "settings" : { "index" : { "lifecycle" : { "name" : "watch-history-ilm-policy" }, "refresh_interval" : "10s", "number_of_shards" : "15", "provided_name" : "skywalking_segment-20230228", "creation_date" : "1677513600608", "analysis" : { "analyzer" : { "oap_analyzer" : { "type" : "stop" } } }, "number_of_replicas" : "1", "uuid" : "JaNBVKcdT5-Bi8OD3opPOA", "version" : { "created" : "7080099" } } } } }

Index mapping:

{ "skywalking_segment-20230228" : { "mappings" : { "properties" : { "data_binary" : { "type" : "binary" }, "end_time" : { "type" : "long" }, "endpoint_id" : { "type" : "keyword" }, "endpoint_name" : { "type" : "keyword", "copy_to" : [ "endpoint_name_match" ] }, "endpoint_name_match" : { "type" : "text", "analyzer" : "oap_analyzer" }, "is_error" : { "type" : "integer" }, "latency" : { "type" : "integer" }, "segment_id" : { "type" : "keyword" }, "service_id" : { "type" : "keyword" }, "service_instance_id" : { "type" : "keyword" }, "service_name" : { "type" : "keyword", "copy_to" : [ "service_name_match" ] }, "service_name_match" : { "type" : "text", "analyzer" : "oap_analyzer" }, "start_time" : { "type" : "long" }, "statement" : { "type" : "keyword" }, "time_bucket" : { "type" : "long" }, "trace_id" : { "type" : "keyword" }, "version" : { "type" : "integer", "index" : false } } } } }

iture123 commented 1 year ago

I'll try to fix this.