apache / seatunnel

SeaTunnel is a next-generation, high-performance, distributed tool for massive data integration.
https://seatunnel.apache.org/
Apache License 2.0

[Bug] [Module Name] Bug title #7107

Open · kidloator opened 2 months ago

kidloator commented 2 months ago

Search before asking

What happened

2024-07-04 18:33:15,748 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Received JobGraph submission 'SeaTunnel' (e5287c00d1bf482307307cf71e17448f).
2024-07-04 18:33:15,748 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Submitting job 'SeaTunnel' (e5287c00d1bf482307307cf71e17448f).
2024-07-04 18:33:15,749 INFO org.apache.flink.runtime.jobmaster.JobMasterServiceLeadershipRunner [] - JobMasterServiceLeadershipRunner for job e5287c00d1bf482307307cf71e17448f was granted leadership with leader id 00000000-0000-0000-0000-000000000000. Creating new JobMasterServiceProcess.
2024-07-04 18:33:15,750 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.jobmaster.JobMaster at akka://flink/user/rpc/jobmanager_9 .
2024-07-04 18:33:15,750 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Initializing job 'SeaTunnel' (e5287c00d1bf482307307cf71e17448f).
2024-07-04 18:33:15,750 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Using restart back off time strategy FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=2147483647, backoffTimeMS=1000) for SeaTunnel (e5287c00d1bf482307307cf71e17448f).
2024-07-04 18:33:15,751 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Created execution graph f2d3119a4c76370dbe3ddf33e19a99b4 for job e5287c00d1bf482307307cf71e17448f.
2024-07-04 18:33:15,752 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Running initialization on master for job SeaTunnel (e5287c00d1bf482307307cf71e17448f).
2024-07-04 18:33:15,752 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Successfully ran initialization on master in 0 ms.
2024-07-04 18:33:15,905 INFO org.apache.flink.runtime.scheduler.adapter.DefaultExecutionTopology [] - Built 1 new pipelined regions in 0 ms, total 1 pipelined regions currently.
2024-07-04 18:33:15,905 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - No state backend has been configured, using default (HashMap) org.apache.flink.runtime.state.hashmap.HashMapStateBackend@652d1f95
2024-07-04 18:33:15,905 INFO org.apache.flink.runtime.state.StateBackendLoader [] - State backend loader loads the state backend as HashMapStateBackend
2024-07-04 18:33:15,905 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Checkpoint storage is set to 'jobmanager'
2024-07-04 18:33:15,905 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - No checkpoint found during restore.
2024-07-04 18:33:15,905 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Using failover strategy org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy@664e5178 for SeaTunnel (e5287c00d1bf482307307cf71e17448f).
2024-07-04 18:33:15,907 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Starting execution of job 'SeaTunnel' (e5287c00d1bf482307307cf71e17448f) under job master id 00000000000000000000000000000000.
2024-07-04 18:33:15,908 INFO org.apache.flink.runtime.source.coordinator.SourceCoordinator [] - Starting split enumerator for source Source: MongoDB-CDC-source.
2024-07-04 18:33:15,924 INFO org.apache.seatunnel.connectors.seatunnel.cdc.mongodb.internal.MongodbClientProvider [] - Create and register mongo client bguser@[10.5.32.17:27017]
2024-07-04 18:33:16,005 INFO org.mongodb.driver.client [] - MongoClient with metadata {"driver": {"name": "mongo-java-driver|sync", "version": "4.7.1"}, "os": {"type": "Linux", "name": "Linux", "architecture": "amd64", "version": "4.18.0-305.3.1.el8.x86_64"}, "platform": "Java/Red Hat, Inc./1.8.0_312-b07"} created with settings MongoClientSettings{readPreference=primary, writeConcern=WriteConcern{w=null, wTimeout=null ms, journal=null}, retryWrites=true, retryReads=true, readConcern=ReadConcern{level=null}, credential=MongoCredential{mechanism=null, userName='bguser', source='admin', password=, mechanismProperties=}, streamFactoryFactory=null, commandListeners=[], codecRegistry=ProvidersCodecRegistry{codecProviders=[ValueCodecProvider{}, BsonValueCodecProvider{}, DBRefCodecProvider{}, DBObjectCodecProvider{}, DocumentCodecProvider{}, IterableCodecProvider{}, MapCodecProvider{}, GeoJsonCodecProvider{}, GridFSFileCodecProvider{}, Jsr310CodecProvider{}, JsonObjectCodecProvider{}, BsonCodecProvider{}, EnumCodecProvider{}, com.mongodb.Jep395RecordCodecProvider@1add2356]}, clusterSettings={hosts=[10.5.32.17:27017], srvServiceName=mongodb, mode=SINGLE, requiredClusterType=UNKNOWN, requiredReplicaSetName='null', serverSelector='null', clusterListeners='[]', serverSelectionTimeout='30000 ms', localThreshold='30000 ms'}, socketSettings=SocketSettings{connectTimeoutMS=10000, readTimeoutMS=0, receiveBufferSize=0, sendBufferSize=0}, heartbeatSocketSettings=SocketSettings{connectTimeoutMS=10000, readTimeoutMS=10000, receiveBufferSize=0, sendBufferSize=0}, connectionPoolSettings=ConnectionPoolSettings{maxSize=100, minSize=0, maxWaitTimeMS=120000, maxConnectionLifeTimeMS=0, maxConnectionIdleTimeMS=0, maintenanceInitialDelayMS=0, maintenanceFrequencyMS=60000, connectionPoolListeners=[], maxConnecting=2}, serverSettings=ServerSettings{heartbeatFrequencyMS=10000, minHeartbeatFrequencyMS=500, serverListeners='[]', serverMonitorListeners='[]'}, sslSettings=SslSettings{enabled=false, invalidHostNameAllowed=false, context=null}, applicationName='null', compressorList=[], uuidRepresentation=UNSPECIFIED, serverApi=null, autoEncryptionSettings=null, contextProvider=null}
2024-07-04 18:33:16,012 INFO org.mongodb.driver.connection [] - Opened connection [connectionId{localValue:1, serverValue:44874735}] to 10.5.32.17:27017
2024-07-04 18:33:16,013 INFO org.mongodb.driver.cluster [] - Monitor thread successfully connected to server with description ServerDescription{address=10.5.32.17:27017, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=8, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=19795927, setName='cmgo-10s0ahyb_0', canonicalAddress=10.5.32.17:27017, hosts=[10.5.32.15:27017, 10.5.32.17:27017], passives=[], arbiters=[], primary='10.5.32.17:27017', tagSet=TagSet{[]}, electionId=7fffffff0000000000000001, setVersion=4, topologyVersion=null, lastWriteDate=Thu Jul 04 18:33:10 CST 2024, lastUpdateTimeNanos=30934363414716361}
2024-07-04 18:33:16,015 INFO org.mongodb.driver.connection [] - Opened connection [connectionId{localValue:2, serverValue:44874736}] to 10.5.32.17:27017
2024-07-04 18:33:16,068 INFO org.mongodb.driver.connection [] - Opened connection [connectionId{localValue:3, serverValue:44874737}] to 10.5.32.17:27017
2024-07-04 18:33:16,092 INFO org.apache.seatunnel.connectors.cdc.base.source.enumerator.SnapshotSplitAssigner [] - SnapshotSplitAssigner created with remaining tables: [test.user]
2024-07-04 18:33:16,093 INFO org.apache.seatunnel.connectors.cdc.base.source.enumerator.SnapshotSplitAssigner [] - SnapshotSplitAssigner created with remaining splits: []
2024-07-04 18:33:16,093 INFO org.apache.seatunnel.connectors.cdc.base.source.enumerator.SnapshotSplitAssigner [] - SnapshotSplitAssigner created with assigned splits: []
2024-07-04 18:33:16,094 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Starting scheduling with scheduling strategy [org.apache.flink.runtime.scheduler.strategy.PipelinedRegionSchedulingStrategy]
2024-07-04 18:33:16,094 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job SeaTunnel (e5287c00d1bf482307307cf71e17448f) switched from state CREATED to RUNNING.
2024-07-04 18:33:16,094 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: MongoDB-CDC-source -> anonymous_datastream_source$1[1] -> TableToDataSteam -> Kafka: Writer -> Kafka: Committer (1/1) (f2d3119a4c76370dbe3ddf33e19a99b4_cbc357ccb763df2852fee8c4fc7d55f2_00) switched from CREATED to SCHEDULED.
2024-07-04 18:33:16,095 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Connecting to ResourceManager akka.tcp://flink@localhost:6123/user/rpc/resourcemanager(00000000000000000000000000000000)
2024-07-04 18:33:16,095 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Resolved ResourceManager address, beginning registration
2024-07-04 18:33:16,095 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Registering job manager 00000000000000000000000000000000@akka.tcp://flink@localhost:6123/user/rpc/jobmanager_9 for job e5287c00d1bf482307307cf71e17448f.
2024-07-04 18:33:16,096 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Registered job manager 00000000000000000000000000000000@akka.tcp://flink@localhost:6123/user/rpc/jobmanager_9 for job e5287c00d1bf482307307cf71e17448f.
2024-07-04 18:33:16,096 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
2024-07-04 18:33:16,096 INFO org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Received resource requirements from job e5287c00d1bf482307307cf71e17448f: [ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}]
2024-07-04 18:33:16,176 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: MongoDB-CDC-source -> anonymous_datastream_source$1[1] -> TableToDataSteam -> Kafka: Writer -> Kafka: Committer (1/1) (f2d3119a4c76370dbe3ddf33e19a99b4_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from SCHEDULED to DEPLOYING.
2024-07-04 18:33:16,177 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Deploying Source: MongoDB-CDC-source -> anonymous_datastream_source$1[1] -> TableToDataSteam -> Kafka: Writer -> Kafka: Committer (1/1) (attempt #0) with attempt id f2d3119a4c76370dbe3ddf33e19a99b4_cbc357ccb763df2852fee8c4fc7d55f2_0_0 and vertex id cbc357ccb763df2852fee8c4fc7d55f2_0 to localhost:38233-aab570 @ k8s01 (dataPort=37077) with allocation id 3ffef464d417c11d026a7af6d62ebd68
2024-07-04 18:33:16,654 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: MongoDB-CDC-source -> anonymous_datastream_source$1[1] -> TableToDataSteam -> Kafka: Writer -> Kafka: Committer (1/1) (f2d3119a4c76370dbe3ddf33e19a99b4_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to INITIALIZING.
2024-07-04 18:33:16,899 INFO org.apache.flink.runtime.source.coordinator.SourceCoordinator [] - Source Source: MongoDB-CDC-source registering reader for parallel task 0 (#0) @ localhost
2024-07-04 18:33:16,900 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: MongoDB-CDC-source -> anonymous_datastream_source$1[1] -> TableToDataSteam -> Kafka: Writer -> Kafka: Committer (1/1) (f2d3119a4c76370dbe3ddf33e19a99b4_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from INITIALIZING to RUNNING.
2024-07-04 18:33:16,900 INFO org.apache.flink.runtime.source.coordinator.SourceCoordinator [] - Source Source: MongoDB-CDC-source received split request from parallel task 0 (#0)
2024-07-04 18:37:40,041 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 1 (type=CheckpointType{name='Checkpoint', sharingFilesStrategy=FORWARD_BACKWARD}) @ 1720089460040 for job e5287c00d1bf482307307cf71e17448f.
2024-07-04 18:37:40,077 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Completed checkpoint 1 for job e5287c00d1bf482307307cf71e17448f (6185 bytes, checkpointDuration=36 ms, finalizationTime=1 ms).
2024-07-04 18:37:40,078 INFO org.apache.flink.runtime.source.coordinator.SourceCoordinator [] - Marking checkpoint 1 as completed for source Source: MongoDB-CDC-source.

SeaTunnel Version

2.3.5

SeaTunnel Config

MongoDB-CDC {
    hosts = "x.x.x.x:27017"
    database = ["a"]
    collection = ["a.b"]
    username = username
    password = pwd
    result_table_name = "fake"
    startup_mode = "earliest"
    schema = {
      fields {
        "_id" : string,
        "age" : string,
        "name" : string,

      }
    }
  }
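
Note: the snippet above is only the source block. The log shows the job also writes to Kafka (Kafka: Writer -> Kafka: Committer), so a complete job file would look roughly like the sketch below. The env settings and all Kafka sink values (bootstrap servers, topic) are assumptions for illustration, not taken from this report.

env {
  # assumed: matches the single-parallelism (1/1) task seen in the log
  parallelism = 1
}

source {
  MongoDB-CDC {
    hosts = "x.x.x.x:27017"
    database = ["a"]
    collection = ["a.b"]
    username = username
    password = pwd
    result_table_name = "fake"
    startup_mode = "earliest"
    schema = {
      fields {
        "_id" : string,
        "age" : string,
        "name" : string
      }
    }
  }
}

sink {
  Kafka {
    # placeholder values, not from this report
    bootstrap.servers = "kafka-host:9092"
    topic = "example-topic"
    source_table_name = "fake"
    format = json
  }
}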

Running Command

./bin/start-seatunnel-flink-15-connector-v2.sh --config config/example.conf
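
For context, this starter script submits the job to an already running Flink cluster via FLINK_HOME, which matches the standalone JobManager seen in the log. A typical sequence might look like the following (install paths are assumptions, not from this report):

export FLINK_HOME=/opt/flink-1.16.3      # assumed Flink install path
$FLINK_HOME/bin/start-cluster.sh         # standalone cluster, as in the log above
cd /path/to/apache-seatunnel-2.3.5       # assumed SeaTunnel home
./bin/start-seatunnel-flink-15-connector-v2.sh --config config/example.conf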

Error Exception

nothing

Zeta or Flink or Spark Version

flink:1.16.3

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

Code of Conduct

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.