GoogleCloudPlatform / professional-services-data-validator

Utility to compare data between homogeneous or heterogeneous environments to ensure source and target tables match
Apache License 2.0
399 stars 114 forks source link

fix: Apply RTRIM on string column when generating partitions with `-tsp` #1182

Closed nehanene15 closed 3 months ago

nehanene15 commented 3 months ago

Closes Issue #1179

Applies RTRIM on string PKs when generating partitions with the --trim-string-pks flag. With this, the YAML configs will have trimmed whitespace from the string value in the WHERE clause. i.e

  filters:
  - source: ' ( "name" < ''Bob'' ) OR ( ( "name" = ''Bob'' ) AND ( "id" < 2 ) )'
    target: ' ( `name` < ''Bob'' ) OR ( ( `name` = ''Bob'' ) AND ( `id` < 2 ) )'
nehanene15 commented 3 months ago

/gcbrun

nehanene15 commented 3 months ago

Below are the results of first_keys_table.compile() here. This change only adds and RTRIM to the string PK columns

WITH t0 AS (
  SELECT rtrim(t2."name") AS "name", t2."id",
         (row_number() OVER (ORDER BY t2."name" ASC, t2."id" ASC) - 1) + 1 AS "dvt_pos_num"
  FROM udfs.pk_bug t2
)
SELECT t1.*
FROM (
  SELECT t0.*
  FROM t0
  WHERE (((t0."dvt_pos_num" * 2) - (floor((t0."dvt_pos_num" * 2) / 3) * 3)) <= 2) AND
        (((t0."dvt_pos_num" * 2) - (floor((t0."dvt_pos_num" * 2) / 3) * 3)) > 0)
) t1
ORDER BY t1."name" ASC, t1."id" ASC

With -tsp:

WITH t0 AS (
  SELECT t2."name", t2."id",
         (row_number() OVER (ORDER BY t2."name" ASC, t2."id" ASC) - 1) + 1 AS "dvt_pos_num"
  FROM udfs.pk_bug t2
)
SELECT t1.*
FROM (
  SELECT t0.*
  FROM t0
  WHERE (((t0."dvt_pos_num" * 2) - (floor((t0."dvt_pos_num" * 2) / 3) * 3)) <= 2) AND
        (((t0."dvt_pos_num" * 2) - (floor((t0."dvt_pos_num" * 2) / 3) * 3)) > 0)
) t1
ORDER BY t1."name" ASC, t1."id" ASC