gluent / goe

GOE: a simple and flexible way to copy data from an Oracle Database to Google BigQuery.
Apache License 2.0

Implement sub-chunking for offloads #192

Open nj1973 opened 1 month ago

nj1973 commented 1 month ago

Sometimes we encounter very large non-partitioned tables or very large single partitions, e.g. larger than 10TB.

At the moment the segment (i.e. a top-level table or a partition) is the smallest unit of chunking we support. If we cannot offload that unit of work in a single pass (e.g. the extraction query fails with ORA-01555, "snapshot too old") then our only option is to keep increasing parallelism in the hope of completing before we run out of time.

It has been suggested that we should attempt to break the single segment, be it a table or partition, down into multiple transport jobs.
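
To make the idea concrete, here is a minimal sketch of one possible way to do the split. It is not GOE's implementation. It assumes the segment has an integer column with known minimum and maximum values, and it carves that range into contiguous sub-chunks, each of which could drive its own transport job via a predicate. The names `SubChunk` and `split_segment` are hypothetical and exist only for this illustration. Splitting by ROWID ranges would be another option and is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubChunk:
    """Hypothetical unit of work: one transport job over a half-open key range."""

    lower: int  # inclusive lower bound
    upper: int  # exclusive upper bound

    def predicate(self, column: str) -> str:
        """Render the range as a SQL predicate on the given column."""
        return f"{column} >= {self.lower} AND {column} < {self.upper}"


def split_segment(min_key: int, max_key: int, num_chunks: int) -> List[SubChunk]:
    """Split the inclusive key range [min_key, max_key] into contiguous sub-chunks.

    Returns at most num_chunks ranges whose sizes differ by no more than one key.
    """
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    if max_key < min_key:
        raise ValueError("max_key must not be less than min_key")

    total_keys = max_key - min_key + 1
    # Never create more chunks than there are distinct key values.
    num_chunks = min(num_chunks, total_keys)
    base_size, remainder = divmod(total_keys, num_chunks)

    chunks = []
    lower = min_key
    for i in range(num_chunks):
        # The first `remainder` chunks take one extra key each.
        size = base_size + (1 if i < remainder else 0)
        chunks.append(SubChunk(lower, lower + size))
        lower += size
    return chunks


if __name__ == "__main__":
    for chunk in split_segment(1, 10, 3):
        print(chunk.predicate("ID"))
```

Two caveats apply to this sketch. Equal-width key ranges only give equal-sized jobs if the key is evenly distributed, and rows with a NULL key match none of the predicates, so they would need a separate job.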

Example 1:

Example 2: