Tables loaded from the Delius Oracle database into AWS S3 using AWS Database Migration Service (DMS) are stored in Parquet format. During the load, columns of the Oracle data type number are written with the Parquet physical type fixed_len_byte_array and the logical type Decimal(precision=X, scale=Y).
A Python script generates table metadata from Oracle as JSON. Columns of type number are extracted as int (if the number scale = 0) or double (if the number scale > 0).
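The scale rule above can be sketched as a small function. This is an illustration only, not the actual extraction script: the function name is hypothetical, and raising an error for a missing or negative scale is an assumption, since the description only covers scale = 0 and scale > 0.

```python
def number_to_metadata_type(scale):
    """Map the scale of an Oracle number column to a metadata type name.

    Hypothetical helper: mirrors the rule described above
    (scale = 0 -> int, scale > 0 -> double).
    """
    # Assumption: a missing or negative scale is not covered by the
    # description, so it is rejected here rather than guessed at.
    if scale is None or scale < 0:
        raise ValueError(f"scale not covered by the mapping rule: {scale!r}")
    return "int" if scale == 0 else "double"
```

For example, `number_to_metadata_type(0)` gives `"int"` and `number_to_metadata_type(2)` gives `"double"`.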
To ensure the data are queryable in Athena, columns with int and double types in the metadata are converted to Decimal(precision=X, scale=Y). This causes ETL manager to throw an error when it is used to read the metadata and structure it in a way that can be read as a Glue database and tables.
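A minimal sketch of the int/double to decimal conversion follows. The metadata layout (a list of column dictionaries with `name`, `type`, `precision` and `scale` keys), the column names, and the function name are all assumptions made for illustration; the real metadata JSON may be shaped differently. The type is written as `decimal(X,Y)`, the notation Athena uses for decimal columns.

```python
import copy


def to_decimal_columns(columns):
    """Return a copy of the column metadata with int/double rewritten as decimal.

    Hypothetical helper: assumes every int/double column carries the
    precision and scale of the original Oracle number column.
    """
    converted = copy.deepcopy(columns)  # leave the input metadata untouched
    for col in converted:
        if col["type"] in ("int", "double"):
            col["type"] = f"decimal({col['precision']},{col['scale']})"
    return converted


# Illustrative metadata only; these column names are made up.
columns = [
    {"name": "offender_id", "type": "int", "precision": 38, "scale": 0},
    {"name": "amount", "type": "double", "precision": 10, "scale": 2},
    {"name": "surname", "type": "character"},
]

converted = to_decimal_columns(columns)
```

Columns of other types pass through unchanged, so only the values derived from Oracle number columns are affected.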