gluent / goe

GOE: a simple and flexible way to copy data from an Oracle Database to Google BigQuery.
Apache License 2.0

Offload Transport SCN improvement #61

Open nj1973 opened 9 months ago

nj1973 commented 9 months ago

There are two things to think about here:

  1. The Oracle SCN is used twice in Offload, but the two values are inconsistent
  2. We need to think about how we support external tools

Inconsistent SCN

The pre-offload SCN (OFFLOAD_SNAPSHOT) is not aligned with the SCN used in Offload Transport. We need to sync these up.

I think the SCN picked by Offload Transport should be stored in state and exposed in the object returned by offload_transport_factory. That same value should then be used for metadata if required.
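
A rough sketch of the idea, not the actual GOE code: the transport object picks the SCN once, keeps it in state, and both the extraction query and the metadata read that one value. `TransportSketch`, `fetch_current_scn`, `snapshot_scn`, `extraction_sql` and `build_metadata` are all hypothetical names for illustration; only `OFFLOAD_SNAPSHOT` and the `AS OF SCN` flashback syntax come from the issue and from Oracle itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TransportSketch:
    """Hypothetical stand-in for the object returned by offload_transport_factory."""

    # Callable that returns the current Oracle SCN, for example by running
    # "SELECT current_scn FROM v$database" (assumed, injected here for testability).
    fetch_current_scn: Callable[[], int]
    _snapshot_scn: Optional[int] = field(default=None, init=False)

    @property
    def snapshot_scn(self) -> int:
        # Pick the SCN once and keep it in state so every consumer sees the same value.
        if self._snapshot_scn is None:
            self._snapshot_scn = self.fetch_current_scn()
        return self._snapshot_scn

    def extraction_sql(self, owner: str, table: str) -> str:
        # Oracle flashback query pinned to the stored SCN.
        return f'SELECT * FROM "{owner}"."{table}" AS OF SCN {self.snapshot_scn}'


def build_metadata(transport: TransportSketch) -> dict:
    # Metadata reuses the transport's SCN instead of querying a second one.
    return {"OFFLOAD_SNAPSHOT": transport.snapshot_scn}


# Usage: the fake SCN source would return a different value on a second call,
# but the transport only ever asks once.
scns = iter([1000, 1001])
transport = TransportSketch(fetch_current_scn=lambda: next(scns))
sql = transport.extraction_sql("SH", "SALES")
metadata = build_metadata(transport)
```

With a single source for the SCN, the extraction and the metadata cannot drift apart, which is the alignment the issue is asking for.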

It is important to align the data extraction and offload metadata before any CDC tool can be integrated.

External tool support

We should add an option to allow an Offload snapshot value (SCN) to be passed into an Offload. This should then be stored in metadata and used for Offload Transport queries. The value is only stored in metadata for the initial offload, but different values can be passed in for incremental offloads as long as they are higher (more recent) than the initial snapshot value.
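
The rule above could be sketched roughly as follows. This is an illustration under assumptions, not GOE's implementation: the function names and parameters are hypothetical, and treating an SCN equal to the initial snapshot as invalid is my reading of "higher/more recent".

```python
from typing import Optional


def resolve_snapshot_scn(
    requested_scn: Optional[int],
    initial_scn: Optional[int],
    current_scn: int,
) -> int:
    """Return the SCN to use for Offload Transport queries (hypothetical helper).

    requested_scn: value passed in by an external tool, or None if not supplied.
    initial_scn:   snapshot already stored in metadata, or None for an initial offload.
    current_scn:   SCN Offload would pick itself when nothing is passed in.
    """
    if requested_scn is None:
        return current_scn
    if initial_scn is not None and requested_scn <= initial_scn:
        raise ValueError(
            f"Requested SCN {requested_scn} must be higher than "
            f"the initial snapshot SCN {initial_scn}"
        )
    return requested_scn


def metadata_snapshot_scn(initial_scn: Optional[int], used_scn: int) -> int:
    # Only the initial offload writes the snapshot; incremental offloads leave it unchanged.
    return used_scn if initial_scn is None else initial_scn


# Initial offload: the passed-in SCN is used and stored.
first = resolve_snapshot_scn(requested_scn=5000, initial_scn=None, current_scn=5100)
stored = metadata_snapshot_scn(initial_scn=None, used_scn=first)

# Incremental offload: a higher SCN is accepted, metadata keeps the initial value.
second = resolve_snapshot_scn(requested_scn=6000, initial_scn=stored, current_scn=6100)
stored_after = metadata_snapshot_scn(initial_scn=stored, used_scn=second)
```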

This allows external tools to request that Offload use a specific SCN. I'm not sure how we cater for the opposite direction. Perhaps we don't for now. At least the value is in metadata if we want to add a utility tool later.