mrchristine / db-migration

Databricks Migration Tools
43 stars 27 forks

DDLs getting truncated when metadata is extracted #14

Closed arjun-hareendran closed 4 years ago

arjun-hareendran commented 4 years ago

Hello,

I am using this utility to dump all the DDL from the Databricks cluster. What I observed is that when a DDL statement is very large, it gets truncated:

```
*** WARNING: skipped 23006 bytes of output ***

USING parquet
OPTIONS (
  path 'dbfs:/XX/XXX/XXXXX'
)
PARTITIONED BY (XXX)
```

Because of the truncation, my DDL statement fails. Is there a parameter that can resolve such errors?

Note: The table has over 700 columns with comments included.
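One way around the output cap is to write each table's DDL straight to a file instead of capturing it from printed command output. A minimal sketch of that idea in Python; `fetch_ddl` is a hypothetical stand-in for whatever call returns the `CREATE TABLE` text (e.g. the result of a `SHOW CREATE TABLE` query), not a function from this repo:

```python
# Hypothetical sketch: dump each table's DDL directly to a file so the
# full statement never passes through the size-capped command output.

def fetch_ddl(table: str) -> str:
    # Placeholder: in practice this would run `SHOW CREATE TABLE {table}`
    # against the cluster and return the full statement text.
    return f"CREATE TABLE {table} (id INT)\nUSING parquet"

def export_ddl(tables, out_path):
    # Stream each statement to disk, terminated with ";" so the dump
    # can later be split back into individual statements.
    with open(out_path, "w") as f:
        for t in tables:
            f.write(fetch_ddl(t))
            f.write(";\n")

export_ddl(["db.big_table"], "/tmp/ddl_dump.sql")
```

Because the DDL goes to the file in full, a 700-column table with comments is no longer subject to the notebook's output truncation.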

mrchristine commented 4 years ago

@arjun-hareendran thanks for reporting. Let me work on a batching operation and get back to you.

mrchristine commented 4 years ago

@arjun-hareendran I have the export working, but the import will take some more time to redesign. I'll need to stage the DDL in a tmp location first, then read from there to handle the larger DDL.

I can commit the export code now if that's more important, or I can commit both fixes once I'm done. Let me know your preference.

mrchristine commented 4 years ago

@arjun-hareendran I've added paging for DDL exports, so very large outputs should now be supported. I'll file a follow-up enhancement task to support the import. I'll close this one for now since exports are working.