Expected:
Using the go client to send a message with a schema I should be able to use Pulsar SQL to read the message
Actual:
When I try this:
presto> select * from pulsar."public/default".simple;
Query 20211022_073804_00008_5x42n, FAILED, 1 node
http://pulsar-broker:8081/ui/query.html?20211022_073804_00008_5x42n
Splits: 18 total, 1 done (5.56%)
CPU Time: 0.0s total, 0 rows/s, 0B/s, 2% active
Per Node: 0.0 parallelism, 0 rows/s, 0B/s
Parallelism: 0.0
Peak Memory: 0B
0:00 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20211022_073804_00008_5x42n failed: Internal error
java.lang.NullPointerException
at org.apache.pulsar.sql.presto.PulsarDispatchingRowDecoderFactory.createDecoderFactory(PulsarDispatchingRowDecoderFactory.java:69)
at org.apache.pulsar.sql.presto.PulsarDispatchingRowDecoderFactory.createRowDecoder(PulsarDispatchingRowDecoderFactory.java:58)
at org.apache.pulsar.sql.presto.PulsarRecordCursor.advanceNextPosition(PulsarRecordCursor.java:546)
at io.prestosql.spi.connector.RecordPageSource.getNextPage(RecordPageSource.java:90)
at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:302)
at io.prestosql.operator.Driver.processInternal(Driver.java:379)
at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
at io.prestosql.operator.Driver.processFor(Driver.java:276)
at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
at io.prestosql.$gen.Presto_332__testversion____20211022_052918_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
I built pulsar master head and 2.8.1 both have this problem. I am using the go client version 0.6.0.
The problem is in org.apache.pulsar.sql.presto.PulsarRecordCursor and appears to originate in getBytesSchemaInfo()
called here:
482: SchemaInfo schemaInfo = getBytesSchemaInfo(pulsarSplit.getSchemaType(), pulsarSplit.getSchemaName());
I added a log message and can see:
pulsarSplit.getSchemaType() returns JSON
pulsarSplit.getSchemaName() returns "public/default"
getBytesSchemaInfo() doesn't appear to handle a schema type of JSON - it returns null as schemaInfo.
This is then dererenced at line 493 and hence the NullPointerException.
This looks like a simple oversight but it is not clear to me how it should be fixed.
This is not a problem with 2.7.2 but I moved to 2.8.1 in that hope that avro schemas with arrays are finally supported. Unfortunately things are worse.
My go code reads a very simple schema from a file and creates the producer like this:
Send one message {"str":"abcdef", "num":123} and another long term bug is apparent: an sql select of the topic returns 0 rows.
Send another message and sql select gives the NullPointerException.
Original Issue: apache/pulsar#12465
Expected: Using the go client to send a message with a schema I should be able to use Pulsar SQL to read the message
Actual: When I try this:
I built pulsar master head and 2.8.1 both have this problem. I am using the go client version 0.6.0.
The problem is in org.apache.pulsar.sql.presto.PulsarRecordCursor and appears to originate in getBytesSchemaInfo() called here:
482: SchemaInfo schemaInfo = getBytesSchemaInfo(pulsarSplit.getSchemaType(), pulsarSplit.getSchemaName());
I added a log message and can see: pulsarSplit.getSchemaType() returns JSON pulsarSplit.getSchemaName() returns "public/default"getBytesSchemaInfo() doesn't appear to handle a schema type of JSON - it returns null as schemaInfo. This is then dererenced at line 493 and hence the NullPointerException.
This looks like a simple oversight but it is not clear to me how it should be fixed.
This is not a problem with 2.7.2 but I moved to 2.8.1 in that hope that avro schemas with arrays are finally supported. Unfortunately things are worse.
My go code reads a very simple schema from a file and creates the producer like this:
schema is:
Send one message {"str":"abcdef", "num":123} and another long term bug is apparent: an sql select of the topic returns 0 rows. Send another message and sql select gives the NullPointerException.