shinichi-takii / ddlparse

DDL parse and Convert to BigQuery JSON schema and DDL statements
https://pypi.org/project/ddlparse/
BSD 3-Clause "New" or "Revised" License

Add support for Cloud Spanner DDLs #63

Open manuelzander opened 3 years ago

manuelzander commented 3 years ago

Several Spanner data types are currently not supported: https://cloud.google.com/spanner/docs/data-types

In particular, STRING (see https://github.com/shinichi-takii/ddlparse/pull/62), BYTES (see https://github.com/shinichi-takii/ddlparse/pull/64) and ARRAY.

ARRAY seems to be more difficult to add, see https://cloud.google.com/spanner/docs/data-types#array_type

ARRAY types are declared using the angle brackets (< and >).

I managed to solve the array issue by using Optional(Regex(r"\<(.*?)\>"))("array_brackets") within _CREATE_TABLE_STATEMENT
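To make the behaviour of that pattern concrete, here is a minimal sketch that applies the same regular expression with Python's standard-library `re` module instead of pyparsing. The helper name `array_element_type` is hypothetical and is not part of ddlparse; it only shows what the non-greedy `(.*?)` group captures.

```python
import re

# Same pattern as in the comment above, compiled with the stdlib re module
# (an illustration only; ddlparse itself builds its grammar with pyparsing).
ARRAY_BRACKETS = re.compile(r"\<(.*?)\>")


def array_element_type(type_text):
    """Return the text inside the first <...> pair, or None if there is none.

    Hypothetical helper for illustration; not a ddlparse function.
    """
    match = ARRAY_BRACKETS.search(type_text)
    return match.group(1) if match else None


print(array_element_type("ARRAY<BOOL>"))         # BOOL
print(array_element_type("ARRAY<STRING(MAX)>"))  # STRING(MAX)
print(array_element_type("INT64"))               # None
```

Because the group is non-greedy, it stops at the first `>`, so this approach would presumably need rethinking for nested angle brackets.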

Additionally, Spanner DDLs can contain lengths such as STRING(MAX), so MAX needs to be supported in addition to numeric lengths.
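As a rough sketch of what accepting both kinds of length could look like, the snippet below uses a standard-library regular expression that allows either digits or the keyword MAX inside the parentheses. The names `TYPE_WITH_LENGTH` and `split_type_and_length` are assumptions made for this example and do not reflect how ddlparse implements it.

```python
import re

# Type name, optionally followed by (<digits>) or (MAX).
# Hypothetical pattern for illustration; not taken from ddlparse.
TYPE_WITH_LENGTH = re.compile(
    r"^(?P<name>[A-Za-z0-9_]+)\s*(?:\(\s*(?P<length>\d+|MAX)\s*\))?$",
    re.IGNORECASE,
)


def split_type_and_length(type_text):
    """Split e.g. 'STRING(MAX)' into ('STRING', 'MAX').

    The length is returned as an int, the string 'MAX', or None when absent.
    """
    match = TYPE_WITH_LENGTH.match(type_text.strip())
    if match is None:
        raise ValueError(f"unsupported type expression: {type_text!r}")
    name = match.group("name").upper()
    length = match.group("length")
    if length is None:
        return name, None
    if length.upper() == "MAX":
        return name, "MAX"
    return name, int(length)


print(split_type_and_length("STRING(MAX)"))   # ('STRING', 'MAX')
print(split_type_and_length("STRING(1024)"))  # ('STRING', 1024)
print(split_type_and_length("INT64"))         # ('INT64', None)
```

Keeping MAX as a distinct string value, rather than mapping it to some large number, leaves the decision about how to translate it to the target schema to later code.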

Todos: Add spanner to the DdlParse.DATABASE options. Adapt _CREATE_TABLE_STATEMENT for Spanner: arrays use a form like ARRAY<BOOL>, and lengths can be given as a keyword instead of a number, for example BYTES(MAX).

I've tested with this DDL:

CREATE TABLE ManuelsTable (
  col1 INT64,
  col2 STRING(MAX),
  col3 TIMESTAMP,
  col4 DATE,
  col5 BYTES(MAX),
  col6 ARRAY<BOOL>,
  col7 BOOL,
  col8 FLOAT64,
  col9 NUMERIC,
) PRIMARY KEY(col1)

shinichi-takii commented 3 years ago

@manuelzander Thank you for creating the Issue and the Pull Request!

I released version v1.9.0, which includes PRs #62 and #64.

ARRAY<> and STRUCT<> are not yet supported because I still have to consider how to implement them. Please wait a little longer.

Thanks.