Open madsbk opened 1 week ago
The main challenge here would be with children. If I call from_column_view(cv, arbitrary) and cv has children, what should their owners be? The current logic is quite strict in assuming that the owning column owns the corresponding buffer for that child's data. How should this work in general? If it shouldn't, should it be forbidden? Should we assume that, if we are ingesting views containing nested types, we must be coming from a Column?
In the cudf-classic model, if we have an arbitrary object that we can't decompose, that becomes the owner for every child.
In the cudf::unpack case I think we just have to follow that model if we want to avoid a copy. Every column (and null mask) that comes out of cudf::unpack is backed by the same single allocation, so the owner of every child is the same object.
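To make the lifetime argument concrete, here is a rough pure-Python sketch of the "one allocation, one owner" model. It does not use cudf or pylibcudf at all: PackedSketch, SliceSketch and unpack_sketch are hypothetical stand-ins invented for illustration, and a bytearray plays the role of the packed device allocation.

```python
import weakref


class PackedSketch:
    """Hypothetical stand-in for a packed allocation: one contiguous buffer."""

    def __init__(self, nbytes: int) -> None:
        self.data = bytearray(nbytes)


class SliceSketch:
    """Hypothetical zero-copy column: an offset and length into its owner's buffer."""

    def __init__(self, owner: PackedSketch, offset: int, length: int) -> None:
        self.owner = owner  # this reference keeps the whole allocation alive
        self.offset = offset
        self.length = length

    def read(self) -> bytes:
        # Copies only when read; the column itself never duplicates the buffer.
        return bytes(self.owner.data[self.offset:self.offset + self.length])


def unpack_sketch(packed: PackedSketch, lengths: list[int]) -> list[SliceSketch]:
    """Carve consecutive slices out of one allocation; all share the same owner."""
    columns, offset = [], 0
    for length in lengths:
        columns.append(SliceSketch(packed, offset, length))
        offset += length
    return columns


packed = PackedSketch(12)
alive = weakref.ref(packed)
columns = unpack_sketch(packed, [4, 8])

del packed  # drop the caller's handle; the columns still hold the owner
assert alive() is not None
assert all(col.owner is alive() for col in columns)
```

The point of the sketch is only that no per-child owner exists to find: every slice has to hold the same object, otherwise the shared buffer could be freed while some column still points into it.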
I agree that's probably the best that we can do. Perhaps a fused type of Column | object such that we take the smarter child path for Column and the more naive "all children are owned by owner" path for everything else.
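A minimal sketch of what that two-path dispatch could look like, written in plain Python rather than Cython and without touching pylibcudf. ViewSketch, ColumnSketch and from_view_sketch are hypothetical names made up for this example; the real from_column_view signature and its child handling may well differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ViewSketch:
    """Hypothetical stand-in for a non-owning column_view with child views."""

    name: str
    children: list[ViewSketch] = field(default_factory=list)


@dataclass
class ColumnSketch:
    """Hypothetical stand-in for a Column: records which object keeps its data alive."""

    name: str
    owner: object
    children: list[ColumnSketch] = field(default_factory=list)


def from_view_sketch(view: ViewSketch, owner: object) -> ColumnSketch:
    if isinstance(owner, ColumnSketch):
        # Smarter path: pair each child view with the owner's matching child.
        if len(owner.children) != len(view.children):
            raise ValueError("owner and view disagree on the number of children")
        children = [
            from_view_sketch(child_view, child_owner)
            for child_view, child_owner in zip(view.children, owner.children)
        ]
    else:
        # Naive path: an opaque owner keeps every child alive.
        children = [from_view_sketch(child_view, owner) for child_view in view.children]
    return ColumnSketch(view.name, owner, children)


view = ViewSketch("list", [ViewSketch("offsets"), ViewSketch("values")])

opaque = object()  # plays the role of something we cannot decompose
col = from_view_sketch(view, opaque)
assert all(child.owner is opaque for child in col.children)

nested = from_view_sketch(view, col)  # owner is itself a column
assert nested.children[0].owner is col.children[0]
```

The isinstance check is the runtime analogue of the fused type: a Column owner gets matched child by child, and anything else is treated as a single opaque owner for the whole tree.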
In https://github.com/rapidsai/cudf/pull/17012, we have a table_view but no owning Table, thus it would be useful to be able to create a pylibcudf.Table from a table_view and an arbitrary owning object (in this case a PackedColumns instance).
Similarly, it should be possible to create a pylibcudf.Column from a column_view and an arbitrary owning object.
cc. @wence-