Some databases such as Teradata use infer their data type minimally based on the literal value provided. This is problematic for the WITH recon as () query generated by the framework to sample data that is mismatched or missing.
Consider this statement that would be generated by the framework:
WITH recon AS (
SELECT
4.00 AS some_value -- Inferred as decimal(1,2) by TD
FROM dual
UNION
SELECT
1222.34 AS some_value --Throws numeric overflow exception as it doesn't fit decimal(1,2)
FROM dual
)
select *
from recon;
Because the first value of the WITH recon statement is lower than the next ones in the UNION statement, resulting data type will be inferred as decimal(1,2) by Teradata engine, and the query will fail after the union since the next value doesn't fit in this data type.
Teradata error thrown: [Teradata Database] [Error 2616] Numeric overflow occurred during computation.
Oracle won't throw this exception because it infers data types in derived tables with unions differently.
Proposed Solution
Cast every column generated (only primary keys are generated this way) to their original data type, unless there is a user_transformation existing on the column, as this could possibly change the desired data type depending on the user transformation. Note that user transformations are pre-applied to the literal value generated. So we can't apply them here again.
Is there an existing issue for this?
Category of feature request
Reconcile
Problem statement
Some databases such as Teradata use infer their data type minimally based on the literal value provided. This is problematic for the
WITH recon as ()
query generated by the framework to sample data that is mismatched or missing.Consider this statement that would be generated by the framework:
Because the first value of the
WITH recon
statement is lower than the next ones in the UNION statement, resulting data type will be inferred as decimal(1,2) by Teradata engine, and the query will fail after the union since the next value doesn't fit in this data type. Teradata error thrown: [Teradata Database] [Error 2616] Numeric overflow occurred during computation. Oracle won't throw this exception because it infers data types in derived tables with unions differently.Proposed Solution
Cast every column generated (only primary keys are generated this way) to their original data type, unless there is a user_transformation existing on the column, as this could possibly change the desired data type depending on the user transformation. Note that user transformations are pre-applied to the literal value generated. So we can't apply them here again.
Proposed fix: https://github.com/databrickslabs/remorph/commit/b5786ff665a6d5bbc2ece1d6d01ffb3764609e17
Additional Context
No response