Deephaven Web Client UI

Partition-selection drop-downs don't work properly for non-String columns #1441

Closed: rcaudy closed this issue 1 year ago

rcaudy commented 1 year ago

Description

When using non-String partitioning columns (long in my example), the partition-selection drop-down incorrectly tries to filter as if the values are strings. It wraps them in quotes, causing MatchFilter parsing to reject the literal.
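
As a rough Python analogue of the failure (illustration only; the quoted value and the column type are taken from the stack trace below, not from the actual client code):

# Hypothetical sketch: the drop-down effectively hands the server a quoted
# literal, so the numeric parse sees '"100"' instead of '100'.
value_from_dropdown = '"100"'   # quoted, as shown in the stack trace below

try:
    int(value_from_dropdown)    # fails, analogous to Long.parseLong server-side
except ValueError as err:
    print(err)                  # invalid literal for int() with base 10: '"100"'

print(int("100"))               # 100 -- what a long partitioning column needs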

Steps to reproduce

Use a command like:

t = io.deephaven.parquet.table.ParquetTools.readTable(path)

to load a Parquet table whose partitioning column types come from metadata files, a TableDefinition, or inference from key=value directory name pairs. If any of the partitioning column types is not String, you should see this error or a similar one.

Note that the automatic first-partition retrieval fails, as does selecting a partition to filter to. Calling coalesce() or where() on the table to skip this step serves as a workaround.
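
For example, a minimal sketch of the workaround (assuming the /tmp/pt-test layout from the reproducers below and the Python read()/coalesce()/where() wrappers):

from deephaven.parquet import read

pt = read("/tmp/pt-test")

# Workaround 1: coalesce the source table up front, so the UI never issues the
# failing partition-selection filter.
pt_coalesced = pt.coalesce()

# Workaround 2: filter to a partition in the query instead of via the drop-down.
pt_filtered = pt.where("intCol == 0")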

Expected results

The table renders properly.

Actual results

A table widget loads with an error. See the screenshot below.

Additional details and attachments

Here's the exception stack trace from the server-side, as displayed in the console:

Scheduler-Concurrent-3 | i.d.s.s.SessionService    | Internal Error 'ef498c1c-f351-4b5a-9f2b-da5f85b0e713' java.lang.IllegalArgumentException: Failed to convert literal value <"100"> for column "cnum" of type long
    at io.deephaven.engine.table.impl.select.MatchFilter.init(MatchFilter.java:155)
    at io.deephaven.engine.table.impl.PartitionAwareSourceTable.whereImpl(PartitionAwareSourceTable.java:288)
    at io.deephaven.engine.table.impl.PartitionAwareSourceTable.where(PartitionAwareSourceTable.java:272)
    at io.deephaven.engine.table.impl.PartitionAwareSourceTable.where(PartitionAwareSourceTable.java:35)
    at io.deephaven.server.table.ops.FilterTableGrpcImpl.create(FilterTableGrpcImpl.java:57)
    at io.deephaven.server.table.ops.FilterTableGrpcImpl.create(FilterTableGrpcImpl.java:30)
    at io.deephaven.server.table.ops.TableServiceGrpcImpl$BatchExportBuilder.doExport(TableServiceGrpcImpl.java:689)
    at io.deephaven.server.table.ops.TableServiceGrpcImpl.lambda$batch$5(TableServiceGrpcImpl.java:541)
    at io.deephaven.server.session.SessionState$ExportObject.doExport(SessionState.java:921)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at io.deephaven.server.runner.scheduler.SchedulerModule$ThreadFactory.lambda$newThread$0(SchedulerModule.java:78)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NumberFormatException: For input string: ""100""
    at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.base/java.lang.Long.parseLong(Long.java:678)
    at java.base/java.lang.Long.parseLong(Long.java:817)
    at io.deephaven.engine.table.impl.select.MatchFilter$ColumnTypeConvertorFactory$4.convertStringLiteral(MatchFilter.java:229)
    at io.deephaven.engine.table.impl.select.MatchFilter.init(MatchFilter.java:152)
    ... 15 more

Here's a screenshot of the issue (attachment: Screenshot 2023-08-08 at 9:43:32 AM).

Versions

Version 0.27.0

vbabich commented 1 year ago

@rcaudy Do you have a sample file to make it easier to reproduce?

nbauernfeind commented 1 year ago

@vbabich here is a reproducer I wrote:

import io.deephaven.engine.util.TableTools
import io.deephaven.parquet.table.ParquetTools

// write a partition table with an int partition column
def part = TableTools.emptyTable(4).update("II = ii")
ParquetTools.writeTable(part, "/tmp/pt-test/intCol=0/part.parquet")
ParquetTools.writeTable(part, "/tmp/pt-test/intCol=1/part.parquet")

// load the partition table
partition_table = ParquetTools.readTable("/tmp/pt-test")

// dump the metadata to the console to see that it is indeed an `intCol`
TableTools.show(partition_table.meta())

mofojed commented 1 year ago

Same snippet in Python instead:

from deephaven import empty_table
from deephaven.parquet import write, read

# write a partitioned table with an int partition column
part = empty_table(4).update("II=ii")
write(part, "/tmp/pt-test/intCol=0/part.parquet")
write(part, "/tmp/pt-test/intCol=1/part.parquet")

# load the partitioned table
partition_table = read("/tmp/pt-test")
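
To mirror the meta() check from the Groovy version (assuming the meta_table property and to_string() on the Python Table wrapper):

# Confirm that intCol was inferred as an int (not String) partitioning column.
print(partition_table.meta_table.to_string())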