finos / perspective

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
https://perspective.finos.org/
Apache License 2.0

Arrow decimal type getting read as integer #2582

Open droher opened 3 months ago

droher commented 3 months ago

Bug Report

Steps to Reproduce:

Run this code in JupyterLab (tested with duckdb 0.10.1, but any recent version should reproduce it):

import perspective
import duckdb
import pyarrow as pa
import io

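# DuckDB's default DECIMAL is DECIMAL(18, 3).
table = duckdb.sql("SELECT 3.14::DECIMAL AS decimal_col, 3.14::FLOAT AS float_col").arrow()
buffer = io.BytesIO()

# Serialize the table to the Arrow IPC file format in memory.
with pa.ipc.new_file(buffer, table.schema) as writer:
    writer.write_table(table)

bytes_object = buffer.getvalue()

# Load the Arrow bytes into Perspective and render the widget.
perspective_table = perspective.Table(bytes_object)
perspective.PerspectiveWidget(perspective_table)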

Expected Result:

Values should be the same in both columns.

Actual Result:

The decimal value is deserialized as Arrow's internal representation of a decimal type (an unscaled integer) rather than as the decimal value itself.
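To illustrate what "internal representation" means here, the following is a minimal stdlib-only sketch, not Perspective or Arrow code. It assumes the column is DECIMAL(18, 3) (DuckDB's default), and the helper names `unscaled_integer` and `from_unscaled` are hypothetical. A fixed-scale decimal is stored as an integer equal to the value multiplied by 10^scale, so reading that integer without applying the scale gives a number that is off by a factor of 1000 in this case.

```python
from decimal import Decimal


def unscaled_integer(value: Decimal, scale: int) -> int:
    """Integer that a fixed-scale decimal column would store for `value` (hypothetical helper)."""
    return int(value.scaleb(scale))


def from_unscaled(unscaled: int, scale: int) -> Decimal:
    """Recover the decimal value by re-applying the scale (hypothetical helper)."""
    return Decimal(unscaled).scaleb(-scale)


SCALE = 3  # assumption: DECIMAL(18, 3), DuckDB's default for a bare DECIMAL

stored = unscaled_integer(Decimal("3.14"), SCALE)
print(stored)                        # 3140
print(from_unscaled(stored, SCALE))  # 3.140
```

A reader that ignores the scale would surface `stored` directly, which matches the symptom described above.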

Environment:

perspective-python 2.10.0 running on Python 3.11, JupyterLab 3.6.5, macOS 14.4.1.

Additional Context:

I first noticed this issue in the JS package, but it was easier for me to create a reproducible example in Python.

timkpaine commented 2 months ago

This is explicit, though I'm unsure whether it's intentional: https://github.com/finos/perspective/blob/055d5b3beb4599dda8f007f73923e8d451de4098/cpp/perspective/src/cpp/arrow_loader.cpp#L123

texodus commented 2 months ago

Perspective doesn't have internal support for decimal yet. We can change this behavior to cast to float instead, which is probably more useful (for the purposes of data visualization at least).