apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Bug] [Oracle-CDC] Unable to get a partitioned table from OracleConnectionUtils#listTables #7964

Open michae-chu opened 3 weeks ago

What happened

We have a partitioned table whose TABLESPACE_NAME is NULL in ALL_TABLES. OracleConnectionUtils#listTables runs the query SELECT OWNER, TABLE_NAME, TABLESPACE_NAME FROM ALL_TABLES WHERE TABLESPACE_NAME IS NOT NULL AND TABLESPACE_NAME NOT IN ('SYSAUX'), so it skips the table we want to capture, and this later causes a NullPointerException in other methods.

The table does appear in ALL_TAB_PARTITIONS, which has multiple records for it (one per partition), each with a tablespace name. Could you please advise whether there is any configuration option that can work around this problem?
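
To make the behaviour concrete, here is a minimal in-memory sketch of what the WHERE clause quoted above does to a partitioned table. This is not SeaTunnel code: the class, the method names, the sample rows, and the "relaxed" filter are all hypothetical and only illustrate one possible direction for a fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in SeaTunnel.
public class ListTablesFilterSketch {

    /** One ALL_TABLES row, reduced to the three columns the query selects. */
    static final class TableRow {
        final String owner;
        final String tableName;
        final String tablespaceName; // NULL in ALL_TABLES for partitioned tables

        TableRow(String owner, String tableName, String tablespaceName) {
            this.owner = owner;
            this.tableName = tableName;
            this.tablespaceName = tablespaceName;
        }

        String id() {
            return owner + "." + tableName;
        }
    }

    /** Mirrors: TABLESPACE_NAME IS NOT NULL AND TABLESPACE_NAME NOT IN ('SYSAUX'). */
    static boolean hasUserTablespace(TableRow row) {
        return row.tablespaceName != null && !"SYSAUX".equals(row.tablespaceName);
    }

    /** Behaviour described in this issue: rows with a NULL tablespace are dropped. */
    static List<String> currentFilter(List<TableRow> allTables) {
        List<String> result = new ArrayList<>();
        for (TableRow row : allTables) {
            if (hasUserTablespace(row)) {
                result.add(row.id());
            }
        }
        return result;
    }

    /**
     * Assumed alternative: also keep rows with a NULL tablespace when the
     * table is known from ALL_TAB_PARTITIONS (passed in as OWNER.TABLE ids).
     */
    static List<String> relaxedFilter(List<TableRow> allTables, Set<String> partitionedIds) {
        List<String> result = new ArrayList<>();
        for (TableRow row : allTables) {
            boolean partitioned =
                    row.tablespaceName == null && partitionedIds.contains(row.id());
            if (hasUserTablespace(row) || partitioned) {
                result.add(row.id());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows; LEDGER stands in for our partitioned table.
        List<TableRow> allTables = Arrays.asList(
                new TableRow("SCHEMA", "ORDERS", "USERS"),
                new TableRow("SCHEMA", "LEDGER", null),
                new TableRow("SYS", "SOME_SYS_TABLE", "SYSAUX"));
        Set<String> partitioned = new HashSet<>(Arrays.asList("SCHEMA.LEDGER"));

        System.out.println(currentFilter(allTables));              // [SCHEMA.ORDERS]
        System.out.println(relaxedFilter(allTables, partitioned)); // [SCHEMA.ORDERS, SCHEMA.LEDGER]
    }
}
```

With the current filter, LEDGER never reaches the list of captured tables, which matches what we see before the NullPointerException. A real fix would presumably have to query ALL_TAB_PARTITIONS (or otherwise allow a NULL TABLESPACE_NAME) inside the connector itself.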

SeaTunnel Version

2.3.8

SeaTunnel Config

env {
  # You can set SeaTunnel environment configuration here
  parallelism = 1
  job.mode = "STREAMING"
  checkpoint.interval = 5000
}

source {
  Oracle-CDC {
    username = "username"
    password = "password"
    database-names = ["database"]
    schema-names = ["schema"]
    table-names = ["database.schema.ledger"]
    base-url = "XXX"
    startup.mode = "latest"
    source.reader.close.timeout = 120000
  }
}

sink {
  Console {
    parallelism = 1
  }

  # If you would like to get more information about how to configure SeaTunnel and see full list of sink plugins,
  # please go to https://seatunnel.apache.org/docs/category/sink-v2
}

Running Command

./bin/seatunnel.sh --config config/streaming.conf --async -n Ledger

Error Exception

903928267013095426] 2024-10-30 08:48:17,729 ERROR [o.a.s.e.s.d.p.PhysicalVertex  ] [hz.main.generic-operation.thread-8] - Job Ledger (903928267013095426), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-Oracle-CDC]-SplitEnumerator (1/1)] end with state FAILED and Exception: java.lang.NullPointerException
        at org.apache.seatunnel.connectors.cdc.base.source.enumerator.IncrementalSplitAssigner.createIncrementalSplit(IncrementalSplitAssigner.java:240)
        at org.apache.seatunnel.connectors.cdc.base.source.enumerator.IncrementalSplitAssigner.createIncrementalSplits(IncrementalSplitAssigner.java:197)
        at org.apache.seatunnel.connectors.cdc.base.source.enumerator.IncrementalSplitAssigner.getNext(IncrementalSplitAssigner.java:106)
        at org.apache.seatunnel.connectors.cdc.base.source.enumerator.IncrementalSourceEnumerator.assignSplits(IncrementalSourceEnumerator.java:172)
        at org.apache.seatunnel.connectors.cdc.base.source.enumerator.IncrementalSourceEnumerator.run(IncrementalSourceEnumerator.java:70)
        at org.apache.seatunnel.engine.server.task.SourceSplitEnumeratorTask.stateProcess(SourceSplitEnumeratorTask.java:323)
        at org.apache.seatunnel.engine.server.task.SourceSplitEnumeratorTask.call(SourceSplitEnumeratorTask.java:141)
        at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:693)
        at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1018)
        at org.apache.seatunnel.api.tracing.MDCRunnable.run(MDCRunnable.java:39)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

Zeta or Flink or Spark Version

No response

Java or Scala Version

JDK11
