gluent / goe

GOE: a simple and flexible way to copy data from an Oracle Database to Google BigQuery.
Apache License 2.0

Performance enhancement for PREDICATE_AND_RANGE offloads #166

Open nj1973 opened 2 months ago

nj1973 commented 2 months ago

We need a better Offload Transport splitter for PREDICATE_AND_RANGE offloads. Currently we select ROWID range splitting, and the ROWID ranges generated cover the whole table, even though the predicate is likely to match only a very small portion of it.

When using PREDICATE_AND_RANGE we mandate that the predicate contains the partition key. We should therefore be able to apply the predicate to the list of not-yet-offloaded partitions and identify which partitions can contain rows that satisfy it. Restricting the splitter to those partitions would make the feature much more performant.
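To make the partition-pruning idea concrete, here is a minimal sketch in Python. It assumes range partitioning on a single numeric key and a predicate that has already been reduced to a half-open range on that key. The `Partition` class and `partitions_for_predicate` function are hypothetical names for illustration only, not existing GOE code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Partition:
    """Hypothetical view of a range partition (not a GOE class)."""
    name: str
    low: Optional[int]   # inclusive lower bound, None means unbounded
    high: Optional[int]  # exclusive upper bound (VALUES LESS THAN), None means MAXVALUE


def partitions_for_predicate(
    partitions: List[Partition], pred_low: int, pred_high: int
) -> List[Partition]:
    """Return partitions whose [low, high) range overlaps [pred_low, pred_high)."""
    matched = []
    for p in partitions:
        # Two half-open ranges overlap when each starts before the other ends.
        starts_before_pred_ends = p.low is None or p.low < pred_high
        ends_after_pred_starts = p.high is None or p.high > pred_low
        if starts_before_pred_ends and ends_after_pred_starts:
            matched.append(p)
    return matched


# Assumed list of not-yet-offloaded partitions, for illustration.
not_yet_offloaded = [
    Partition("P1", None, 100),
    Partition("P2", 100, 200),
    Partition("P3", 200, 300),
    Partition("P4", 300, None),
]

# Predicate equivalent to: key >= 150 AND key < 250
selected = partitions_for_predicate(not_yet_offloaded, 150, 250)
selected_names = [p.name for p in selected]
```

With a list like this, ROWID ranges would only need to be generated for the selected partitions rather than for the whole table.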

For non-partitioned tables with a single-column numeric primary key we could use id range splitting, as we do for IOTs.
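As a rough illustration of id range splitting, the sketch below divides an inclusive id range into contiguous, near-equal chunks. It assumes the minimum and maximum key values for rows matching the predicate are already known. `split_id_range` is a hypothetical helper, not the existing IOT splitter in GOE.

```python
from typing import List, Tuple


def split_id_range(min_id: int, max_id: int, chunks: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [min_id, max_id] into contiguous inclusive sub-ranges."""
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    if max_id < min_id:
        return []
    total = max_id - min_id + 1
    # Never produce more chunks than there are ids.
    chunks = min(chunks, total)
    base, extra = divmod(total, chunks)
    ranges = []
    start = min_id
    for i in range(chunks):
        # The first `extra` chunks take one additional id each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


id_ranges = split_id_range(1, 10, 3)
```

Each resulting pair could then drive one transport query of the form `id BETWEEN :low AND :high` combined with the original predicate. Splitting evenly by id value only balances work well when ids are fairly dense, which is an assumption of this sketch.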