Open teh-cmc opened 11 months ago
`re_arrow2` has an `arrow` feature, with glue for converting data between `arrow` and `re_arrow2`: https://docs.rs/re_arrow2/0.17.4/re_arrow2/array/trait.Arrow2Arrow.html
Using that, we can start this migration piece-wise. It would double the dependencies for a transitional period, leading to longer compilation times and a bigger .wasm binary, but I think that is an OK tradeoff.
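To make the piece-wise idea concrete, here is a minimal sketch of how a new `arrow` code path could be derived from an existing `arrow2` one through the glue. Everything in it is an assumption made for illustration: `ArrowArray` and `Arrow2Array` are stand-ins for the real array types, the `From` impls play the role the glue would play, and `ArrowRoundTrip` and `Radii` are hypothetical names. It does not use the actual `re_arrow2` or `arrow` APIs.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an array from the `arrow` crate.
#[derive(Debug, Clone)]
pub struct ArrowArray {
    pub values: Arc<[f32]>,
}

/// Hypothetical stand-in for an array from `re_arrow2`.
#[derive(Debug, Clone)]
pub struct Arrow2Array {
    pub values: Arc<[f32]>,
}

// Hypothetical glue between the two stand-ins.
// Moving the `Arc` hands over the same allocation; no values are copied.
impl From<Arrow2Array> for ArrowArray {
    fn from(array: Arrow2Array) -> Self {
        ArrowArray { values: array.values }
    }
}

impl From<ArrowArray> for Arrow2Array {
    fn from(array: ArrowArray) -> Self {
        Arrow2Array { values: array.values }
    }
}

/// Hypothetical serialization trait during the transition period.
pub trait ArrowRoundTrip: Sized {
    /// Existing code path, still written against the arrow2 stand-in.
    fn to_arrow2(&self) -> Arrow2Array;
    fn from_arrow2(array: &Arrow2Array) -> Self;

    /// New code path, derived from the old one through the glue.
    fn to_arrow(&self) -> ArrowArray {
        self.to_arrow2().into()
    }

    fn from_arrow(array: &ArrowArray) -> Self {
        // Cloning the array only bumps the `Arc` reference count.
        let array2: Arrow2Array = array.clone().into();
        Self::from_arrow2(&array2)
    }
}

/// Hypothetical component type used only for this example.
#[derive(Debug, Clone, PartialEq)]
pub struct Radii(pub Vec<f32>);

impl ArrowRoundTrip for Radii {
    fn to_arrow2(&self) -> Arrow2Array {
        Arrow2Array { values: Arc::from(self.0.clone()) }
    }

    fn from_arrow2(array: &Arrow2Array) -> Self {
        Radii(array.values.to_vec())
    }
}
```

With this shape, callers can move to `to_arrow`/`from_arrow` one at a time while the `arrow2` implementations stay untouched, and `Arc::ptr_eq` on the two `values` fields is one way to check that a conversion did not copy the buffer.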
Potential roadmap:
- `Arrow2Arrow` is zero-copy
- `SizeBytes` to own crate, with separate `arrow` and `arrow2` feature flags
- `to_arrow`/`from_arrow`/… to `to_arrow2`/`from_arrow2`/…
- `to_arrow`/`from_arrow` using the glue
After de-chunkification:
As of 2024-07-08, there are only around 300 lines of Rust referencing the string `arrow2` directly, when one ignores generated code.
arrow
Blocked on:
New blocker:
Currently blocked on:
Multiple end-goals:
- 1809
- (`RERUN:component_name`) 3360
- `half` for `f16`
TODO (split into sub-issues as needed):
- `arrow2` (codegen, data{cell,row,table}, `ArrowBuffer`, etc)
- `arrow1`
- `RERUN:component_name` (#3360)
- `DataCell::component_name`
On the way there we might hit a few bumps because we have a lot of redundant ad-hoc code that integrates with `polars` (which is built on top of `arrow2`).
The solution to this is to make sure we only integrate with `polars` in one single place: the `Data{Cell,Row,Table}` layer (https://github.com/rerun-io/rerun/issues/1692). Once that's done, we can remove all ad-hoc polars code everywhere and just build a `Data{Row,Cell,Table}` anytime we want a `polars::Series`/`polars::DataFrame` (#1759).
Internally, the conversion from `DataTable` to `polars::DataFrame` will require a zero-copy tri-stage conversion from `arrow1` -> `arrow2` -> `polars`.
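As a rough illustration of the single integration point and the tri-stage conversion, the sketch below models each stage with a stand-in type that shares one reference-counted buffer. All names here (`Arrow1Column`, `Arrow2Column`, `Series`, `DataFrame`, and the `to_dataframe` method) are assumptions invented for this example; none of them are the real `arrow`, `arrow2`, `polars`, or Rerun APIs.

```rust
use std::sync::Arc;

/// Shared, immutable buffer used by every stage of this sketch.
type Buffer = Arc<[f64]>;

/// Hypothetical stand-in for a column stored as an `arrow1` array.
pub struct Arrow1Column {
    pub name: String,
    pub values: Buffer,
}

/// Hypothetical stand-in for the same column as an `arrow2` array.
pub struct Arrow2Column {
    pub name: String,
    pub values: Buffer,
}

/// Hypothetical stand-in for `polars::Series`.
pub struct Series {
    pub name: String,
    pub values: Buffer,
}

/// Hypothetical stand-in for `polars::DataFrame`.
pub struct DataFrame {
    pub columns: Vec<Series>,
}

/// Hypothetical stand-in for `DataTable`.
pub struct DataTable {
    pub columns: Vec<Arrow1Column>,
}

/// Stage 1 -> 2: only the `Arc` is cloned, so the values stay in place.
fn arrow1_to_arrow2(col: &Arrow1Column) -> Arrow2Column {
    Arrow2Column { name: col.name.clone(), values: Arc::clone(&col.values) }
}

/// Stage 2 -> 3: same idea, the buffer is shared rather than copied.
fn arrow2_to_series(col: &Arrow2Column) -> Series {
    Series { name: col.name.clone(), values: Arc::clone(&col.values) }
}

impl DataTable {
    /// The only place in this sketch that touches the polars stand-ins.
    pub fn to_dataframe(&self) -> DataFrame {
        let columns = self
            .columns
            .iter()
            .map(|col| arrow2_to_series(&arrow1_to_arrow2(col)))
            .collect();
        DataFrame { columns }
    }
}
```

Keeping the conversion behind one method means any other code that wants a dataframe builds a table first and calls that method, which is the consolidation the paragraphs above argue for. In this sketch, only the column names are copied along the way; the value buffers are shared.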