Closed tb0n3zz closed 4 years ago
Figured I'd share the other thing I ran into and the workaround: preprocessing the input DDL from Oracle as follows. Replacing (*,0) with (38,0) makes the column correctly convert to an INT64.
import re

# oraddl holds the Oracle DDL string produced by dbms_metadata.get_ddl
# Replace (*,0) in NUMBER definitions that are integers anyway (38 is the maximum precision, and the type gets converted anyway)
massaged_ddl = re.sub(r'NUMBER\(\*,0\)', r'NUMBER(38,0)', oraddl)
# ddlparse chokes on this clause and tries to process it like a column, so drop the whole line
massaged_ddl = re.sub(r'SUPPLEMENTAL LOG DATA \(.*\) COLUMNS.*\n', '', massaged_ddl)
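For anyone who wants to try the workaround without an Oracle instance, here is a minimal self-contained sketch of the same two substitutions. The function name preprocess_oracle_ddl and the sample DDL (schema DEMO, table T1, columns COL1 and COL2) are made up for illustration and are not from the original report; only the two regular expressions come from the snippet above.

```python
import re


def preprocess_oracle_ddl(oraddl: str) -> str:
    """Apply the two workaround substitutions to an Oracle DDL string."""
    # NUMBER(*,0) is an integer type; spell out the precision explicitly.
    massaged_ddl = re.sub(r'NUMBER\(\*,0\)', 'NUMBER(38,0)', oraddl)
    # Drop any SUPPLEMENTAL LOG DATA (...) COLUMNS line entirely.
    massaged_ddl = re.sub(
        r'SUPPLEMENTAL LOG DATA \(.*\) COLUMNS.*\n', '', massaged_ddl
    )
    return massaged_ddl


# Hypothetical DDL, loosely shaped like dbms_metadata.get_ddl output.
sample_ddl = (
    'CREATE TABLE "DEMO"."T1"\n'
    '   (\t"COL1" VARCHAR2(10),\n'
    '\t"COL2" NUMBER(*,0),\n'
    '\tSUPPLEMENTAL LOG DATA (ALL) COLUMNS\n'
    '   );\n'
)

cleaned = preprocess_oracle_ddl(sample_ddl)
print(cleaned)
```

Note that the second substitution removes only the text from SUPPLEMENTAL through the end of that line, so a comma at the end of the preceding column line stays in place.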
@tb0n3zz Thank you for using ddlparse and for the issue report. I fixed this issue in v1.4.0.
Very excited to find this -- just what I need, so thank you very much for creating it! I created a test to directly call Oracle 12.2.0.1 and run dbms_metadata.get_ddl, which is the standard method for producing DDL from the Oracle database using the CLI. It produces CREATE TABLE statements like this, which I then ran through ddlparse to convert to BigQuery -- but I found that most of the columns were missing in the resulting SQL. Here's a simple runnable example where I removed the (*,0) from the first NUMBER column (COL2_WORKS):
Note the NUMBER(*,0) designation. I found in testing on many tables that as soon as ddlparse hits any NUMBER(*,0) data type column, it stops processing any more columns, so I get BigQuery DDL like this:
I stepped through the code/breakpoints enough to understand that the issue is happening in the original column parsing, not in the conversion to BQ DDL. The BQ conversion is only called for the 4 columns.
I suspect the * is breaking a regex in ddlparse or pyparsing. I'm out of time to figure out where, and I'd imagine you will be a lot faster at it since you are familiar with how you're using them, but I wanted to share this much since I got this far. For now, easily fixed by just preprocessing my Oracle CREATE TABLE strings to remove anything with the *.
Additional note: it is turning NUMBER(*,0) into a FLOAT64 when NUMBER(*,0) is an integer - but perhaps that's because of the broken regex as well / it's being cut off.
Thanks again for the code!
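To illustrate the suspicion about the *, here is a small sketch of how a pattern that expects only digits for the precision would reject NUMBER(*,0). Both patterns are hypothetical and written just for this comment; they are not taken from ddlparse or pyparsing, so this shows the kind of failure suspected, not the actual cause.

```python
import re

# Hypothetical pattern (NOT ddlparse's grammar): precision must be digits.
digits_only = re.compile(r'NUMBER\((\d+),(\d+)\)')
# Hypothetical variant that also accepts Oracle's "*" precision.
star_aware = re.compile(r'NUMBER\((\d+|\*),(\d+)\)')

results = {}
for type_text in ('NUMBER(38,0)', 'NUMBER(*,0)'):
    results[type_text] = (
        bool(digits_only.fullmatch(type_text)),
        bool(star_aware.fullmatch(type_text)),
    )
    print(type_text, results[type_text])
```

With these two patterns, NUMBER(38,0) matches both, while NUMBER(*,0) matches only the star-aware one.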