madhusss333 / Sparklyr


Sparklyr/Import file #1

Closed madhusss333 closed 6 years ago

madhusss333 commented 6 years ago

Hi All,

I am working on a server and using RStudio to connect to big data through Spark.

I have established the connection as follows:

install.packages("sparklyr")

library(sparklyr)

spark_install(version = "2.3.1")

sc <- spark_connect(master = "local")

After this I am trying to read a parquet file as well as a csv.gz file, but I am getting errors. Could any of you please look into it? I suspect something is missing in the path; I have tried both a double slash (//) and a triple slash (///).

table_par <- spark_read_parquet(sc, "par", "hdfs://tmp/ATTMX_UserPlane_imsi.parquet")

table_csvl <- spark_read_csv(sc, "csv", "hdfs://tmp/ATTMX_1000imsi/sub_kpi_daily/agg_d_sub_kpi_f_1000imsi.csv.gz")

Please advise. Any help is greatly appreciated.

Thanks

madhusss333 commented 6 years ago

I am attaching the errors produced by the code:

temp_par <- spark_read_parquet(sc, "par", "hdfs://tmp/ATTMX_UserPlane_imsi.parquet")

Error: java.lang.IllegalArgumentException: java.net.UnknownHostException: tmp
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)

spk_tbl <- spark_read_csv(sc, "abc", 'hdfs://tmp/ATTMX_1000imsi/sub_kpi_daily/agg_d_sub_kpi_f_1000imsi.csv.gz')

Error: java.lang.IllegalArgumentException: java.net.UnknownHostException: tmp
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
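
The `UnknownHostException: tmp` in both traces suggests that Hadoop reads the first path component after `hdfs://` as the NameNode host name, not as a directory. The following is a minimal sketch of that reading, not a confirmed fix. The helper `uri_authority` is hypothetical and uses base R only. The final read is attempted only if a connection `sc` exists, and it assumes the cluster's Hadoop configuration (in particular `fs.defaultFS`) is visible to Spark, which may not be true with `master = "local"`.

```r
# Hypothetical helper: extract the authority (host[:port]) part of a URI.
uri_authority <- function(uri) {
  sub("^[A-Za-z][A-Za-z0-9+.-]*://([^/]*)(/.*)?$", "\\1", uri)
}

# Double slash: "tmp" lands in the host position of the URI.
uri_authority("hdfs://tmp/ATTMX_UserPlane_imsi.parquet")   # "tmp"

# Triple slash: the host is empty and the path starts at /tmp.
uri_authority("hdfs:///tmp/ATTMX_UserPlane_imsi.parquet")  # ""

# Attempt the read only when a Spark connection `sc` is available.
# Assumption: fs.defaultFS points at the HDFS NameNode, so an empty host
# can be resolved from the Hadoop configuration.
if (exists("sc")) {
  table_par <- sparklyr::spark_read_parquet(
    sc, "par", "hdfs:///tmp/ATTMX_UserPlane_imsi.parquet"
  )
}
```

If the triple-slash form still fails, the default file system is probably not configured for this session. In that case the NameNode host and port would likely have to be written out explicitly, in the form `hdfs://<namenode-host>:<port>/tmp/...`, where both placeholders depend on the cluster.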
