AliTajeldin closed this issue 7 years ago.
Increasing priority, as this seems to also be the only way to change Hive credentials (see #704).
Branch `i705_jdbc` now has support for both uses. For both, I have delegated to Spark's `DataFrameWriter`. Based on discussion with @AliTajeldin, when writing through JDBC I have set the `DataFrameWriter`'s save mode to `SaveMode.Append`, which means that DataFrames should be inserted into existing tables instead of overwriting them. However, this behavior doesn't seem entirely consistent: I was successful inserting into existing MySQL tables over JDBC, but when I tried the same with Derby tables, the failure implied that JDBC was trying to create the table. As the use case inspiring this feature is writing to Hive over JDBC, we should verify that insertion works as expected for Hive before I make a PR.
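For reference, a minimal sketch of the append write path being described (the URL, table name, and credentials are placeholders):

```scala
import java.util.Properties

import org.apache.spark.sql.{DataFrame, SaveMode}

// Append `df` to an existing table over JDBC rather than overwriting it.
// The URL, table name, and credentials are placeholders.
def writeOverJdbc(df: DataFrame): Unit = {
  val props = new Properties()
  props.setProperty("user", "smv_user")
  props.setProperty("password", "secret")

  df.write
    .mode(SaveMode.Append) // insert rows; do not drop and recreate the table
    .jdbc("jdbc:mysql://localhost:3306/mydb", "target_table", props)
}
```

One plausible explanation for the Derby failure, worth verifying: even in `SaveMode.Append`, Spark first probes whether the target table exists and falls back to `CREATE TABLE` when the probe fails, so a dialect quirk in the existence check could make it attempt table creation.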
Another feature we should consider adding is the option of a custom, user-specified query, similar to `SmvHiveTable`. I will investigate how we might accomplish this on Monday.
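Purely as a sketch of the interface shape (the class and parameter names below are assumptions for illustration, not SMV's actual API):

```scala
// Hypothetical: a JDBC table that, like SmvHiveTable, lets the user replace
// the default "read the whole table" behavior with a custom query.
class SmvJdbcTable(tableName: String, userQuery: Option[String] = None) {
  // Fall back to selecting everything when no custom query is supplied.
  def readQuery: String = userQuery.getOrElse(s"SELECT * FROM $tableName")
}
```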
While Sqoop is great at parallel export of HDFS files to databases, it would be convenient for users to have an SMV-level `--publish-jdbc` option or something similar. Note: we would require a config that determines the flavor of the actual DB used, as the DDL for SQL databases is not really standard.
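As a hedged sketch of how such a config might be consumed (the property key `smv.jdbc.driver` and the helper below are assumptions, not existing SMV settings):

```scala
import java.util.Properties

import org.apache.spark.SparkConf

// Hypothetical: derive JDBC connection properties from a config key so the
// write path can load the driver matching the target database's flavor.
def jdbcProps(conf: SparkConf): Properties = {
  val props = new Properties()
  // "smv.jdbc.driver" is an assumed key name for illustration only.
  props.setProperty("driver",
    conf.get("smv.jdbc.driver", "org.apache.derby.jdbc.EmbeddedDriver"))
  props
}
```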