finos / perspective

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
https://perspective.finos.org/
Apache License 2.0

Support extract Apache Arrow Custom Metadata from schema #2637

Closed brettsaunders21 closed 2 weeks ago

brettsaunders21 commented 2 weeks ago

Feature Request

Description of Problem:

When using Apache Arrow, you can store custom metadata on the schema. We use this to store extra options and config for our table, which we want to extract and use to apply custom rendering and styling.

Other formats built on Apache Arrow can also use this to store extra custom metadata. For example, geospatial libraries use it to list columns with certain details.

Currently we have no way to access this metadata, because our Apache Arrow table is sent with LZ4 compression and consumed by the @finos/perspective package. No JavaScript implementation of these compression algorithms exists for Apache Arrow, so we can't easily get access to this metadata ourselves.

If we used WASM to do this, our bundle size and the amount of code needed would go up a lot, leading to slower load times for our app.

Potential Solutions:

The "Table" class in the @finos/perspective package could have a method to get the custom metadata. For CSV and JSON this can just return an empty object.

The custom metadata can be extracted from the Apache Arrow schema: https://github.com/apache/arrow/blob/7dd1d34074af176d9e861a360e135ae57b21cf96/js/src/schema.ts#L24
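To make the request concrete, here is a minimal sketch of the behaviour described above. It is not Perspective's API: `getCustomMetadata`, `SchemaLike` and `SourceFormat` are hypothetical names used only for illustration. The one assumption taken from Arrow JS is that the schema exposes its custom metadata as a `Map<string, string>`, as in the linked `schema.ts`.

```typescript
// Hypothetical sketch: none of these names exist in @finos/perspective.
// SchemaLike models only the `metadata` property of an Arrow JS Schema
// (a Map<string, string>), so the example needs no dependencies.
interface SchemaLike {
    metadata: Map<string, string>;
}

// Hypothetical tag for the format the table was loaded from.
type SourceFormat = "arrow" | "csv" | "json";

// Return the schema-level custom metadata as a plain object.
// CSV and JSON carry no such metadata, so they yield an empty object.
function getCustomMetadata(
    format: SourceFormat,
    schema?: SchemaLike,
): Record<string, string> {
    const result: Record<string, string> = {};
    if (format !== "arrow" || schema === undefined) {
        return result;
    }
    schema.metadata.forEach((value, key) => {
        result[key] = value;
    });
    return result;
}

// Usage with a stand-in for a decoded Arrow schema; the key and value
// here are made up for the example.
const exampleMetadataMap = new Map<string, string>();
exampleMetadataMap.set("renderer", "heatmap");
const exampleMetadata = getCustomMetadata("arrow", { metadata: exampleMetadataMap });
console.log(exampleMetadata);
```

Returning a plain object rather than the `Map` itself keeps the result easy to serialize, which would matter if the table lived in a worker or on a server.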

timkpaine commented 2 weeks ago

Perspective schemas are not Apache Arrow schemas.

texodus commented 2 weeks ago

"Perspective schemas are not Apache Arrow schemas."

Not sure this is an adequate answer here, nor that this is precisely what was asked, so I'd like to elaborate a bit. Perspective is not currently intended to be a general-purpose Arrow processing library for other libraries to call into; it is instead stripped down for visualization specifically. We omit a lot of Arrow code from our build for the exact reason you specify: including it would inflate Perspective's asset size a lot (there are also some historical reasons related to Arrow's wasm compatibility that I believe are no longer relevant).