Open viki-crypto opened 3 years ago
There is no system to keep track of changing rows natively.
If you have a creation_date that is static on the invoice, you can use order_id and order_status as business_key. You can add a column current_date as last_seen_status (current_date being a postgresql function).
The would give you something like
order_id, order_status , last_seen_status, creation_date 1, ordered, 2020-06-16, 2020-06-15 1, accepted, 2020-06-18, 2020-06-15 1, processing, 2020-06-21, 2020-06-15
Using an ordering and the lag() over () function you can compute the first_seen a and last_seen. This would a be a different table (with a rebuild strategy).
order_id, order_status , first_seen, last_seen_status, creation_date 1, ordered, 2020-06-15, 2020-06-16, 2020-06-15 1, accepted, 2020-06-16, 2020-06-18, 2020-06-15 1, processing, 2020-06-18, 2020-06-21, 2020-06-15
This sounds interesting, Let me try this approach. Thanks
Also, is there any way to add indexes on the tables getting created from within the pipeline?
row_number() over () should do the trick
I am a little confused on usage of archive module with overwrite strategy. Please let me know if you can explain an approach for my use case.
I have an Orders table, which has over 1 million records. The primary key is ORDER_ID. The same table also keeps track of status of each order, with an order_status column - which can have values within a value range (ordered, accepted, processing, shipped, delivered, returned and a few more). This status is changed over a period of time sometime a month depending on order lifecycle.
How should I go about setting up an archive module for such case - e.g. what business keys should I chose - just the order_id or combination of columns. How does archive module work, I could understand for the new records but how does it gather if any record is updated in source db. thanks in advance.