GoogleCloudPlatform / professional-services-data-validator

Utility to compare data between homogeneous or heterogeneous environments to ensure source and target tables match
Apache License 2.0
404 stars 117 forks source link

validate row: Hash Validation Fails on Oracle Versions < 12c #775

Open nj1973 opened 1 year ago

nj1973 commented 1 year ago

When using -hash the generated SQL uses a STANDARD_HASH() SQL function. Test table:

create table dms_test_cases.tab_id (id number(2) not null primary key);
insert into dms_test_cases.tab_id values (1);
commit;

Command:

data-validation validate row -sc=ora_local -tc=ora_local \
--filter-status=fail -tbls=dms_test_cases.tab_id -pk=id -hash='id'

Exception:

sqlalchemy.exc.DatabaseError: (cx_Oracle.DatabaseError) ORA-00904: "STANDARD_HASH": invalid identifier
[SQL: SELECT t0.hash__all, t0.id
FROM (SELECT t1.id AS id, t1.cast__id AS cast__id, t1.ifnull__cast__id AS ifnull__cast__id, t1.rstrip__ifnull__cast__id AS rstrip__ifnull__cast__id, t1.upper__dvt_calc_col_0 AS upper__dvt_calc_col_0, t1.concat__all AS concat__all, lower(standard_hash(t1.concat__all, 'SHA256')) AS hash__all
FROM (SELECT t2.id AS id, t2.cast__id AS cast__id, t2.ifnull__cast__id AS ifnull__cast__id, t2.rstrip__ifnull__cast__id AS rstrip__ifnull__cast__id, t2.upper__dvt_calc_col_0 AS upper__dvt_calc_col_0, upper__dvt_calc_col_0 AS concat__all
FROM (SELECT t3.id AS id, t3.cast__id AS cast__id, t3.ifnull__cast__id AS ifnull__cast__id, t3.rstrip__ifnull__cast__id AS rstrip__ifnull__cast__id, upper(t3.rstrip__ifnull__cast__id) AS upper__dvt_calc_col_0
FROM (SELECT t4.id AS id, t4.cast__id AS cast__id, t4.ifnull__cast__id AS ifnull__cast__id, rtrim(t4.ifnull__cast__id) AS rstrip__ifnull__cast__id
FROM (SELECT t5.id AS id, t5.cast__id AS cast__id, coalesce(t5.cast__id, :param_1) AS ifnull__cast__id
FROM (SELECT t6.id AS id, CAST(t6.id AS VARCHAR2(4000 CHAR)) AS cast__id
FROM dms_test_cases.tab_id t6) t5) t4) t3) t2) t1) t0]

The problem expression is:

lower(standard_hash(t1.concat__all, 'SHA256')) AS hash__all

Oracle 11g:

select standard_hash('Hello', 'SHA256') from dual
       *
ERROR at line 1:
ORA-00904: "STANDARD_HASH": invalid identifier

Oracle > 11g:

SQL> select standard_hash('Hello', 'SHA256') from dual;

STANDARD_HASH('HELLO','SHA256')
----------------------------------------------------------------
185F8DB32271FE25F561A6FC938B2E264306EC304EDA518007D1764826381969

There is no standard alternative for 11g so I think our options are:

  1. Document that -hash is not valid for Oracle versions prior to 12c
  2. Throw an informational exception when -hash is used on Oracle versions prior to 12c
cofin commented 1 year ago

@nj1973 A potential option would be to implement an alternative SQLA standard_hash function for Oracle < 11g that calls ora_hash instead of standard_hash. I believe ora_hash existed in 10g, correct?

nj1973 commented 1 year ago

Yes that function is on 10g. I initially thought the same but it gives a different result to standard hash:

SQL> select standard_hash('Hello', 'SHA256') from dual;

STANDARD_HASH('HELLO','SHA256')
----------------------------------------------------------------
185F8DB32271FE25F561A6FC938B2E264306EC304EDA518007D1764826381969

SQL> select ora_hash('Hello') from dual;

ORA_HASH('HELLO')
-----------------
       3822733595
cofin commented 1 year ago

I should have guessed you'd already looked at this approach. Thanks @nj1973

viclinriv commented 2 months ago

Can I be assigned this issue?