Closed alok87 closed 3 years ago
One issue found
I0703 13:36:46.282035 4794 redshift.go:1105] Missing column: new_col_name, add column will run.
I0703 13:36:46.282056 4794 redshift.go:1105] Missing column: text_length, add column will run.
ALTER TABLE ADD COLUMNS
is being constructed by the migration for the extra columns which are not present in source. This is because it only checks if the column needs to be added based on the index number. It should ignore extra columns instead.
Full table reload is required in this case. Also the full table reload should happen as reload group reload.
Solution should consider keeping tables with extra columns as first columns so that there is no migration problem due to new columns being added at source.
And if any new extra column is added in masking, it will any way recreate the table in reload group, which already works!
Masking has a feature which allows to add new extra columns in Redshift tables. These columns are only present in Redshift and not at source, as they are computed columns required for aggregate analysis on sensitive columns.
One such feature is
length_keys
Recently in one of such table using that feature we saw the new columns were added in source, the change could not be applied in Redshift. Following errors occured was seen in the loader logs:
(actual table names have been changed above)^ Conflict happens as the source has
new_col_name
coming before. But the target has an extra columntext_length
at that position.I remember I had solved for this use case https://github.com/practo/tipoca-stream/issues/126. But I see it has happened again. Need to relook and fix the problem.
Solution proposed
Need to have extra columns as first columns in the table instead of they being the last columns. When this starts happening we won't get order issues when new columns are added to a table at source. It would also work when a new extra column is added because that would be a recreate of table due to reload.