dremio-hub / dremio-flight-connector

Dremio Flight connector. Access Dremio using Arrow flight

Sample code on the readme doesn't work out #4

Closed murinata closed 5 years ago

murinata commented 5 years ago

We have a Dremio 3.3.1 cluster with dremio-flight-connector-0.8.0-shaded.jar copied to the jars folder. The driver was picked up when Dremio restarted, and port 47470 is listening and reachable (confirmed by telnet to the port).
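The telnet check can also be scripted. This is a minimal sketch using only the Python standard library; the host name in the usage comment is a placeholder. A successful TCP connect only shows that the port is open, not that the Flight service behind it is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example (hypothetical coordinator address):
# port_is_open("dremio-coordinator.example.com", 47470)
```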

We installed pyarrow and its dependencies (pyarrow-0.14.1-py27h8b68381_0), started jupyter-lab, and ran the sample code after changing the user account, the coordinator IP, and the SQL command (the same query returns data when run from DBeaver).

However, the Python code stops at:

---------------------------------------------------------------------------
ArrowIOError                              Traceback (most recent call last)
<ipython-input-1-4b75bab60fd9> in <module>
     30 flightDesc = flight.FlightDescriptor.for_command(sql)
     31 print(flightDesc.command)
---> 32 info = client.get_flight_info(flightDesc)
        global info = undefined
        global client.get_flight_info = <built-in method get_flight_info of pyarrow._flight.FlightClient object at 0x7f360b156120>
        global flightDesc = <FlightDescriptor type: <DescriptorType.CMD: 2>>
     33 
     34 reader = client.do_get(info.endpoints[0].ticket)

~/Apps/anaconda3/lib/python3.7/site-packages/pyarrow/_flight.pyx in pyarrow._flight.FlightClient.get_flight_info()

~/Apps/anaconda3/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowIOError: gRPC failed with error code 2 and message: 
> /home/tmurinata/Downloads/pyarrow/error.pxi(87)pyarrow.lib.check_status()

Within this framework, how do you suggest debugging the problem?

Thank you
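One client-side option, offered as a sketch rather than a confirmed fix: pyarrow's Flight client is built on gRPC, and the gRPC core reads the `GRPC_VERBOSITY` and `GRPC_TRACE` environment variables, so raising them may surface more detail than the empty message in the traceback. The helper name below is made up for illustration, and how much extra output appears depends on the gRPC version bundled with pyarrow.

```python
import os


def enable_grpc_debug(env=None, trace="http"):
    """Set gRPC's logging environment variables and return the mapping.

    Should run before pyarrow.flight is imported, because the gRPC core
    reads these variables when it initializes.
    """
    if env is None:
        env = os.environ
    env["GRPC_VERBOSITY"] = "DEBUG"  # log level of the gRPC core
    env["GRPC_TRACE"] = trace        # comma-separated tracer names, e.g. "http" or "all"
    return env


enable_grpc_debug()
# Import pyarrow.flight only after this point, then rerun the failing call.
```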

rymurr commented 5 years ago

hey @murinata taking a look now. Did the driver run a 'prepare' statement and did it succeed? You should be able to tell from the 'Jobs' pane on the UI.

Is there anything meaningful in server.log or server.out logfiles from the Dremio server?
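To narrow down what counts as meaningful, a small filter over the log can help. This is a minimal sketch that assumes each entry starts with a `timestamp [thread] LEVEL` prefix and that stack traces begin with a fully qualified Java exception name; the function name and the file path in the usage comment are placeholders.

```python
import re

# Assumed entry prefix: "YYYY-MM-DD HH:MM:SS,mmm [thread] LEVEL rest"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+(?P<rest>.*)$"
)
# A line that starts with a dotted Java class name ending in Exception or Error.
EXCEPTION_LINE = re.compile(r"^(?:[\w$]+\.)+[\w$]*(?:Exception|Error)\b")


def find_problems(lines, levels=("WARN", "ERROR")):
    """Return (kind, text) tuples for WARN/ERROR entries and exception lines."""
    hits = []
    for raw in lines:
        line = raw.rstrip("\n")
        m = LOG_LINE.match(line)
        if m and m.group("level") in levels:
            hits.append((m.group("level"), line))
        elif EXCEPTION_LINE.match(line):
            hits.append(("EXCEPTION", line))
    return hits


# Usage (placeholder path):
# with open("server.log") as f:
#     for kind, text in find_problems(f):
#         print(kind, text)
```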

rymurr commented 5 years ago

You may find https://github.com/rymurr/dremio_client has a better flight client (it abstracts away the flight-specific stuff). The most direct way to use it would be:

from dremio_client.flight import query
df = query(sql, hostname=hostname, port=port, username=username, password=password)

rymurr commented 5 years ago

I was not able to reproduce this on Dremio CE 3.3.1 with the connector built from https://github.com/dremio-hub/dremio-flight-connector/releases/download/v3.3.1/dremio-flight-connector-0.8.0-shaded.jar and the dremio_client referenced above, using Linux (I have not tested Windows or Mac).

murinata commented 5 years ago

The prepare statement and the query both completed in the UI.

In server.log we got this message:


2019-09-04 09:39:56,981 [pool-15-thread-4] INFO  com.dremio.flight.AuthValidator - authenticated tanaka
2019-09-04 09:39:56,999 [ForkJoinPool-1-worker-29] INFO  com.dremio.flight.Producer - called get flight info
2019-09-04 09:39:58,810 [22907992-cad9-ed9f-ed2f-ab0d8bd75e00/0:foreman-planning] WARN  c.d.e.e.s.d.p.ParquetFilterPushDownRule - Failure converting condition. [com.dremio.exec.store.parquet.ParquetFilterCondition$FilterProperties@42827bcc]
java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.calcite.util.NlsString
        at com.dremio.extra.exec.store.dfs.parquet.ParquetFilterPushDownRule.vec(ParquetFilterPushDownRule.java:337) ~[dremio-ce-sabot-kernel-3.3.1-201907291852280797-df23756.jar:3.3.1-201907291852280797-df23756]
        at com.dremio.extra.exec.store.dfs.parquet.ParquetFilterPushDownRule.onMatch(ParquetFilterPushDownRule.java:174) ~[dremio-ce-sabot-kernel-3.3.1-201907291852280797-df23756.jar:3.3.1-201907291852280797-df23756]
.....

java.lang.ClassCastException: java.math.BigDecimal cannot be cast to org.apache.calcite.util.NlsString
        at com.dremio.extra.exec.store.dfs.parquet.ParquetFilterPushDownRule.vec(ParquetFilterPushDownRule.java:337) ~[dremio-ce-sabot-kernel-3.3.1-201907291852280797-df23756.jar:3.3.1-201907291852280797-df23756]
        at com.dremio.extra.exec.store.dfs.parquet.ParquetFilterPushDownRule.onMatch(ParquetFilterPushDownRule.java:174) ~[dremio-ce-sabot-kernel-3.3.1-201907291852280797-df23756.jar:3.3.1-201907291852280797-df23756]

However, it works for a query against another data set.

Will try dremio-client and check for the result.

Thank you

rymurr commented 5 years ago

Does the query that fails in Flight work with the ODBC/JDBC driver or in the UI? I guess so, since you said it works in DBeaver. I will test the Flight connector's handling of BigDecimal. Can you share the schema of the table that is failing?

murinata commented 5 years ago

The query works in the Dremio UI and via JDBC, but not via Flight. We tried some other data and found that the Flight connector only works on tables, not on views (VDS). The query that worked was against a table; every view we tested failed.
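A comparison like this can be made repeatable with a small probe. The sketch below is not part of the connector: `probe_datasets` and the dataset paths are hypothetical, and `run_query` stands for whatever client call is in use, for example a wrapper around the `query` function from dremio_client.

```python
def probe_datasets(run_query, dataset_paths, limit=1):
    """Run a tiny SELECT against each dataset and record whether it succeeded.

    run_query is any callable that takes a SQL string and raises on failure.
    """
    results = {}
    for path in dataset_paths:
        sql = "SELECT * FROM {} LIMIT {}".format(path, int(limit))
        try:
            run_query(sql)
            results[path] = "ok"
        except Exception as exc:  # keep going so every dataset gets tested
            results[path] = "failed: {}: {}".format(type(exc).__name__, exc)
    return results


# Usage (hypothetical paths; run_query wraps the Flight client of your choice):
# probe_datasets(run_query, ["myspace.some_table", "myspace.some_view"])
```

Running the same probe through JDBC and through Flight would show whether the failures follow the dataset type or the transport.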

This is the schema: schema

Thank you very much

rymurr commented 5 years ago

Hi @murinata, I have added Decimal support to the Flight connector. If you download the binary from the releases page and give it another try, it should work.