Since the parquet IO strategy can now be used for SparkDf module persistence in addition to SmvCsv, we should add a prop that lets the user configure which one is used as the default (for cases where a user module does not specify a persistStrategy).
This issue will introduce a smv.sparkdf.defaultPersistFormat parameter with 2 possible values:
parquet_on_hdfs
smvcsv_on_hdfs
smvcsv_on_hdfs will remain the default for this issue.
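The lookup order described above (a module's own persistStrategy first, then the configured prop, then smvcsv_on_hdfs) could be sketched roughly as follows. This is only an illustration, not the actual SMV implementation: the function `resolve_persist_format`, its arguments, and the plain dict standing in for the loaded props are all hypothetical; only the prop name and its two values come from this issue.

```python
# Hypothetical sketch of the fallback logic; names other than the prop key
# and its two values are assumptions, not SMV APIs.
PROP_KEY = "smv.sparkdf.defaultPersistFormat"
VALID_FORMATS = ("smvcsv_on_hdfs", "parquet_on_hdfs")
BUILTIN_DEFAULT = "smvcsv_on_hdfs"


def resolve_persist_format(props, module_format=None):
    """Pick the persist format for a SparkDf module.

    props: dict of loaded configuration props (assumed representation).
    module_format: format chosen by the module itself, or None if the
        module did not specify a persistStrategy.
    """
    # A module-level choice always wins over the configured default.
    if module_format is not None:
        return module_format

    # Otherwise use the prop, falling back to the built-in default.
    fmt = props.get(PROP_KEY, BUILTIN_DEFAULT)
    if fmt not in VALID_FORMATS:
        raise ValueError(
            "%s must be one of %s, got %r" % (PROP_KEY, VALID_FORMATS, fmt)
        )
    return fmt
```

With no prop set, `resolve_persist_format({})` yields smvcsv_on_hdfs, so existing projects would keep their current behavior; setting the prop to parquet_on_hdfs changes the default only for modules that do not specify their own strategy.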