ZSI-Bio / bdg-sparklyr-sequila

Create table as select does not work #10

Open mwiewior opened 5 years ago

mwiewior commented 5 years ago

Steps to reproduce:

ss <- sequila_connect(master="local[1]", driver_memory="2g", spark_home="/Users/marek/tools/spark-2.2.1-bin-hadoop2.7")

Create the database:

sequila_sql(ss, query="CREATE DATABASE sequila")
sequila_sql(ss, query="USE sequila")

Create a BAM data source with reads, then create a Parquet table from it with CREATE TABLE AS SELECT:

sequila_sql(ss, query="drop table reads")
sequila_sql(ss, 'reads', "CREATE TABLE reads USING org.biodatageeks.datasources.BAM.BAMDataSource OPTIONS(path '/Users/marek/Downloads/c1_10M.bam')")
sequila_sql(ss, 'read_10', 'CREATE TABLE reads_10 stored as parquet AS SELECT * FROM reads limit 10')

Result:

Error: org.apache.spark.sql.AnalysisException: Hive support is required to CREATE Hive TABLE (AS SELECT);;
'CreateTable reads_10, org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, ErrorIfExists
+- GlobalLimit 10
   +- LocalLimit 10
      +- Project [sampleId#32, contigName#33, start#34, end#35, cigar#36, mapq#37, baseq#38, reference#39, flags#40, materefind#41]
         +- SubqueryAlias reads
            +- Relation[sampleId#32,contigName#33,start#34,end#35,cigar#36,mapq#37,baseq#38,reference#39,flags#40,materefind#41] org.biodatageeks.datasources.BAM.BAMRelation@9f05f24

Possible cause: the Spark build used by sparklyr is not compiled with Hive support.
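
A possible workaround, as an untested sketch: `STORED AS parquet` is Hive table syntax, which is what triggers the Hive support check in the error above, whereas Spark's data source form `USING parquet` creates a non-Hive table and normally does not need Hive support. The call below reuses `ss`, the `reads` table, and the `sequila_sql(ss, name, query)` form from the steps above; whether the SeQuiLa session accepts this statement is an assumption.

```r
# Assumes `ss` and the `reads` table from the steps above already exist.
# USING parquet asks Spark for a data source table instead of a Hive table,
# so the statement should not go through the Hive SerDe path.
sequila_sql(ss, 'reads_10', 'CREATE TABLE reads_10 USING parquet AS SELECT * FROM reads LIMIT 10')
```

If this succeeds while the `STORED AS parquet` form still fails, that would support the missing Hive support explanation.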