tagomoris / presto-client-node

Distributed query engine Presto client library for node.js
MIT License

columns callback but no data callback #62

Closed. peebles closed this issue 2 years ago

peebles commented 2 years ago

What does it mean that I get the columns() callback with stuff that looks ok, but the data() callback is never called? The final "callback()" is called, with no error.

I can use hive-driver to do the same query on my hadoop server. I launch the presto server with a hive.properties in catalogs that looks like:

connector.name=hive-hadoop2
hive.metastore.uri=thrift://hadoop-master:9083
hive.s3.ssl.enabled=false
hive.s3.path-style-access=true
hive.s3.endpoint=http://moto-server:5000

My presto client:

client = new presto.Client({
  schema: 'default',
  catalog: 'hive',
  source: 'nodejs-client',
});

Do I need something else for "schema"?
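For anyone debugging the same symptom, here is a minimal sketch that records which callbacks actually fire. It assumes the `columns`, `data`, and `callback` options of `client.execute()` as described in the presto-client README. `runQuery` and `emptyResultClient` are hypothetical names introduced for illustration, and the stand-in client only imitates the behaviour described in this issue so the snippet runs without a Presto server.

```javascript
// Sketch: wrap client.execute() and collect whatever the callbacks deliver.
// `runQuery` and `done` are hypothetical names, not part of presto-client.
function runQuery(client, query, done) {
  const result = { columns: null, rows: [] };
  client.execute({
    query: query,
    // Fires once the result schema is known, even if no rows follow.
    columns: function (error, columns) {
      if (!error) result.columns = columns;
    },
    // Fires per batch of rows; may never fire when the query matches
    // no rows (the symptom described above).
    data: function (error, rows) {
      if (!error && rows) result.rows.push(...rows);
    },
    // Fires once when the query finishes, with or without an error.
    callback: function (error) {
      done(error || null, result);
    },
  });
}

// Stand-in for presto.Client (assumption): reports columns but no rows.
const emptyResultClient = {
  execute(opts) {
    opts.columns(null, [{ name: 'code', type: 'varchar' }]);
    opts.callback(null, {});
  },
};

runQuery(emptyResultClient, 'SELECT code FROM chunks', (error, result) => {
  if (error) throw error;
  if (result.columns && result.rows.length === 0) {
    console.log('Query succeeded with zero rows; check the table location and partitions.');
  }
});
```

With a real client, pass the `presto.Client` instance instead of the stand-in. Columns without rows and without an error suggests the query ran but found no readable data, which points at the table definition or storage layout rather than at the `schema` setting.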

My table in hive looks like

CREATE EXTERNAL TABLE `chunks`(
 ... )
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  'paths'='attributes,code,data,fwVersion,shoe,timestamp,ts,tz,user')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3a://zc-bigdata/'

peebles commented 2 years ago

I solved my own problems. First, I had to organize my S3 data into "partitions" using Hive's S3 key naming conventions, and declare those partitions on the EXTERNAL TABLE with PARTITIONED BY. At that point something was happening, but I still needed to add the openx JSON SerDe .jar files to presto/plugin/hive-* to get results.
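As a rough illustration of the key naming convention, the sketch below builds a Hive-style `key=value` prefix and a matching `ALTER TABLE ... ADD PARTITION` statement. The partition column `dt`, its value, and the helper names `partitionPrefix` and `addPartitionSql` are assumptions made for the example. The comment above does not say which partition columns were used or how the partitions were registered with the metastore.

```javascript
// Build the S3 key prefix for one partition using Hive's key=value naming.
// `partitionPrefix`, `addPartitionSql`, and the `dt` column are illustrative.
function partitionPrefix(baseLocation, partition) {
  const base = baseLocation.endsWith('/') ? baseLocation : baseLocation + '/';
  const path = Object.entries(partition)
    .map(([key, value]) => `${key}=${value}`)
    .join('/');
  return `${base}${path}/`;
}

// Build a HiveQL statement that registers the partition with the metastore.
// Values are inserted verbatim, so this is only safe for trusted input.
function addPartitionSql(table, baseLocation, partition) {
  const spec = Object.entries(partition)
    .map(([key, value]) => `${key}='${value}'`)
    .join(', ');
  return `ALTER TABLE ${table} ADD IF NOT EXISTS PARTITION (${spec}) ` +
    `LOCATION '${partitionPrefix(baseLocation, partition)}'`;
}

console.log(partitionPrefix('s3a://zc-bigdata/', { dt: '2020-01-01' }));
console.log(addPartitionSql('chunks', 's3a://zc-bigdata/', { dt: '2020-01-01' }));
```

Objects would then live under prefixes such as `s3a://zc-bigdata/dt=2020-01-01/`, and the table definition would need a matching `PARTITIONED BY` clause. Running `MSCK REPAIR TABLE` in Hive is another way to register partitions that already follow this layout.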

I'm ok now.