Open wush978 opened 8 years ago
Under dplyr v0.4.3, as.numeric might fail.
After executing the following R script:
dplyr::group_by(df, day) %>% dplyr::summarise(imp = count(adid), clk = mean(as.numeric(is_click))) %>% dplyr::collect()
R will raise an error from Spark:
(org.apache.spark.sql.AnalysisException: cannot recognize input near 'numeric' ')' ')' in primitive type specification;
I fixed this issue in https://github.com/bridgewell/dplyrSparkSQL/commit/781fba15b034686d9637e79378a90686434f63ef , following the Hive numeric types listed at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes . It seems that this package does not add such a customized translator (see https://github.com/RevolutionAnalytics/dplyr-spark/blob/e073c607970ce8a44088e3fa99c52a9cab7163e5/pkg/R/src-sparkSQL.R#L114).
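
To illustrate the idea, here is a minimal base-R sketch, not the code from the commit above. The error message suggests the generated SQL casts to a type named numeric, which the Spark SQL parser does not accept as a primitive type, so a translator would need to emit a Hive type such as DOUBLE instead. The names hive_cast_types and hive_cast are hypothetical and exist only for this example; a real fix would be registered through the package's dplyr translator rather than by building strings by hand.

```r
# Hypothetical lookup table: R coercion function -> Hive/Spark SQL type name.
# DOUBLE, INT and STRING are types listed in the Hive language manual.
hive_cast_types <- c(
  as.numeric   = "DOUBLE",
  as.integer   = "INT",
  as.character = "STRING"
)

# Build the SQL text for a cast. `r_fun` is the name of the R coercion
# function and `column` is the column to cast.
# Illustration only: the column name is not escaped or quoted.
hive_cast <- function(r_fun, column) {
  if (!r_fun %in% names(hive_cast_types)) {
    stop("no Hive type mapping for ", r_fun)
  }
  sprintf("CAST(%s AS %s)", column, hive_cast_types[[r_fun]])
}

print(hive_cast("as.numeric", "is_click"))
# [1] "CAST(is_click AS DOUBLE)"
```

With a mapping like this, mean(as.numeric(is_click)) would reach Spark as a cast to DOUBLE rather than to numeric.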
Hope this helps.