databricks / spark-redshift

Redshift data source for Apache Spark
Apache License 2.0

Should spark-redshift handle ALTERs to create new columns? #322

Open tejasmanohar opened 7 years ago

tejasmanohar commented 7 years ago

From browsing the code (correct me if I'm missing something), it seems spark-redshift assumes the target table's column structure won't change. I would expect spark-redshift to automagically run ALTERs where possible (e.g. column additions). What do you think?

vnktsh commented 7 years ago

Yes, I think the connector should get this capability. We are doing some complex ETL on log files and the params keep changing from time to time. Right now we use external tools to run the ALTER TABLEs and then load.
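
For illustration, the "ALTER first, then load" workaround could look roughly like the sketch below. It is not part of spark-redshift. The helper name `add_column_statements`, the table name `events`, and the column-to-type dictionaries are all hypothetical. It only covers column additions, and it emits one statement per column because Redshift accepts a single added column per `ALTER TABLE`.

```python
from typing import Dict, List


def add_column_statements(
    table: str,
    existing: Dict[str, str],
    incoming: Dict[str, str],
) -> List[str]:
    """Return ALTER TABLE statements for columns in `incoming` that are not in `existing`.

    Hypothetical helper: both arguments map column name -> Redshift type string.
    Names are compared case-insensitively. Dropped columns and type changes are ignored.
    """
    existing_names = {name.lower() for name in existing}
    statements = []
    for name, sql_type in incoming.items():
        if name.lower() not in existing_names:
            # One ADD COLUMN per statement; names are quoted but not otherwise sanitized.
            statements.append(f'ALTER TABLE {table} ADD COLUMN "{name}" {sql_type};')
    return statements


if __name__ == "__main__":
    current = {"id": "BIGINT", "ts": "TIMESTAMP"}
    new_batch = {"id": "BIGINT", "ts": "TIMESTAMP", "referrer": "VARCHAR(256)"}
    for stmt in add_column_statements("events", current, new_batch):
        print(stmt)  # ALTER TABLE events ADD COLUMN "referrer" VARCHAR(256);
```

The generated statements would be run against Redshift before the write, by whatever client the external tooling already uses. How the existing and incoming column types are obtained is left out here.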