Open Yomanz opened 3 months ago
Does the SELECT * FROM table query work if you fully specify the catalog and schema?
No. If I select the catalog and schema and run SELECT * FROM table LIMIT 100; I get:
java.lang.NullPointerException: string is null
at java.base/java.util.Objects.requireNonNull(Objects.java:246)
at io.airlift.slice.Slices.copiedBuffer(Slices.java:291)
at io.airlift.slice.Slices.utf8Slice(Slices.java:299)
at com.facebook.presto.druid.DruidBrokerPageSource.getNextPage(DruidBrokerPageSource.java:154)
at com.facebook.presto.operator.TableScanOperator.getOutput(TableScanOperator.java:266)
at com.facebook.presto.operator.Driver.processInternal(Driver.java:441)
at com.facebook.presto.operator.Driver.lambda$processFor$10(Driver.java:324)
at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:750)
at com.facebook.presto.operator.Driver.processFor(Driver.java:317)
at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1079)
at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:165)
at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:621)
at com.facebook.presto.$gen.Presto_0_288_15f14bb____20240818_134447_1.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
However, I can query the information_schema TABLES table just fine.
@Yomanz I think these are probably simple fixes. Unfortunately, the Druid connector doesn't have end-to-end test queries in our CI. This is because, at the time the connector was added, the Druid community did not provide an embedded instance of Druid that we could load up to run the end-to-end tests. (See https://github.com/prestodb/presto/pull/14042#issuecomment-582218710, CC @zhenxiao). Before we fix these issues, I think we should get a sensible baseline in CI so that the fixes don't regress later. So we should look into whether the Druid community can now embed a Druid instance inside a running JVM. If so, I think we should add end-to-end tests.
This might sound like a lot of work, but the tests themselves are not that hard to write, as we provide a framework that connectors can use to test queries across a whole range of TPC-H data.
@Yomanz would you like to take this issue on? Otherwise, we can see if the community can pick this up.
If I can figure out a way to get it running easily I'll take a crack, if not will have to revert to the community 🤞
Your Environment
Expected Behavior
These queries should return the correct data from Apache Druid
Current Behavior
Running the query:
via the JDBC driver.
Or running:
SELECT * FROM table
on the SQL client also gives the same error.
Error:
Possible Solution
I believe it might have something to do with the Apache Druid data having some rows with empty/nulled columns
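To illustrate the suspicion above, here is a minimal sketch of how a null cell can trigger exactly this message and how a null guard avoids it. This is not the connector's actual code: the class `NullSafeColumnSketch` and its methods `strictUtf8`, `utf8OrNull`, and `encodeRow` are hypothetical stand-ins that use only the JDK. `strictUtf8` assumes the conversion at the top of the stack trace rejects null via `Objects.requireNonNull`, which is what the trace suggests.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for illustration; not the Presto Druid connector's code.
final class NullSafeColumnSketch {
    // Strict conversion that rejects null, assumed to behave like the
    // Objects.requireNonNull call seen at the top of the stack trace.
    static byte[] strictUtf8(String string) {
        Objects.requireNonNull(string, "string is null");
        return string.getBytes(StandardCharsets.UTF_8);
    }

    // Guarded variant: a null cell stays null instead of throwing.
    static byte[] utf8OrNull(String value) {
        return value == null ? null : strictUtf8(value);
    }

    // Converts one row of string cells, preserving nulls.
    static List<byte[]> encodeRow(List<String> row) {
        List<byte[]> out = new ArrayList<>();
        for (String cell : row) {
            out.add(utf8OrNull(cell));
        }
        return out;
    }

    public static void main(String[] args) {
        // A row with a null column value, as Druid might return it.
        List<String> row = Arrays.asList("a", null, "c");
        try {
            for (String cell : row) {
                strictUtf8(cell);
            }
        } catch (NullPointerException e) {
            System.out.println("strict: " + e.getMessage());
        }
        long nulls = encodeRow(row).stream().filter(Objects::isNull).count();
        System.out.println("guarded nulls: " + nulls);
    }
}
```

If this is the cause, the real fix would presumably need to write a SQL NULL into the output page for such cells rather than a Java null, but that part depends on the connector's internals and is not shown here.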
Steps to Reproduce
Screenshots (if appropriate)
N/A
Context
Unable to use PrestoDB with tools such as Hex.tech, as they can't get the schema.