apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine
https://datafusion.apache.org/ballista
Apache License 2.0

Execution error: Unable to find factory for PARQUET #839

Open dadepo opened 1 year ago

dadepo commented 1 year ago

Describe the bug

Accessing Ballista through the flight_sql_jdbc_driver does not seem to work. Specifically, running a create external table statement fails.

To Reproduce

I am using the flight_sql_jdbc_driver, which can be downloaded from https://mvnrepository.com/artifact/org.apache.arrow/flight-sql-jdbc-driver. I have tried versions 10, 11, and 12 and get the same result each time.

I confirmed I can connect, but when I run the query to register a parquet file

Statement st = conn.createStatement();
st.executeQuery("create external table tripdata stored as PARQUET location '/Users/me/testdata/yellow_tripdata_2022-01.parquet'");

I get the error

Error while executing SQL "create external table tripdata stored as PARQUET location '/Users/me/testdata/yellow_tripdata_2022-01.parquet'": Error building plan: Execution error: Unable to find factory for PARQUET
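For anyone who wants to try this, the steps above can be put into one small class. This is only a sketch: the class name BallistaRepro, the helper createExternalTableSql, and the default connection URL are assumptions for illustration and are not taken from my actual setup. In particular, the URL scheme and the useEncryption parameter may differ between driver versions, and the port assumes the scheduler is listening on 50050. The SQL text and the file path are the ones from the report.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

class BallistaRepro {
    // Assumed connection URL; adjust scheme, host, port, and parameters
    // to match your driver version and scheduler configuration.
    static final String DEFAULT_URL =
            "jdbc:arrow-flight-sql://localhost:50050/?useEncryption=false";

    // Hypothetical helper that builds the statement used in the report.
    static String createExternalTableSql(String table, String format, String location) {
        return String.format(
                "create external table %s stored as %s location '%s'",
                table, format, location);
    }

    public static void main(String[] args) {
        // An alternative URL can be passed as the first argument.
        String url = args.length > 0 ? args[0] : DEFAULT_URL;
        String sql = createExternalTableSql(
                "tripdata", "PARQUET",
                "/Users/me/testdata/yellow_tripdata_2022-01.parquet");

        // try-with-resources closes the statement and connection on exit.
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement()) {
            st.executeQuery(sql);
            System.out.println("Table registered");
        } catch (SQLException e) {
            // Connection failures and the planning error both surface here.
            System.out.println("Failed: " + e.getMessage());
        }
    }
}
```

The flight-sql-jdbc-driver jar has to be on the classpath when running this; without it, DriverManager reports that no suitable driver was found, and that message is printed instead of the Ballista error.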

Are there any other libs I need to have in my class path for this to work?

Expected behavior

I should be able to connect and execute queries.

Additional context

ballista-executor --version
Ballista version: 0.11.0

ballista-scheduler --version
Ballista version: 0.11.0