Operations like splitobs, shuffleobs, and many more return ObsViews that one has to call getobs on in order to materialize.
I think this is unexpected for users coming from scikit-learn and mildly annoying in most scenarios.
As a default, operations on materialized objects should return materialized objects (e.g. arrays and dataframes).
Users will be able to opt in to the lazy behavior by wrapping data in an ObsView. Operations on an ObsView will produce other ObsViews, which need to be materialized only at the end of the pipeline.
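
To make the proposed semantics concrete, here is a toy model in Python. It is not the package's code: the ObsView class and the bodies of getobs and splitobs below are hypothetical stand-ins that only borrow the names, and the `at` argument is an assumed way to give the split fraction. The point is just the rule "materialized in, materialized out; lazy in, lazy out".

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass(frozen=True)
class ObsView:
    """Hypothetical lazy view: keeps the parent data and a set of indices."""
    data: Sequence[Any]
    indices: Sequence[int]

    def __len__(self) -> int:
        return len(self.indices)


def getobs(x):
    """Materialize a view; already-materialized data is returned unchanged."""
    if isinstance(x, ObsView):
        return [x.data[i] for i in x.indices]
    return x


def splitobs(x, at: float):
    """Split the observations of `x` at the fraction `at` (assumed signature)."""
    k = round(at * len(x))
    if isinstance(x, ObsView):
        # Lazy in, lazy out: only the indices are sliced, no data is copied.
        return ObsView(x.data, x.indices[:k]), ObsView(x.data, x.indices[k:])
    # Materialized in, materialized out: return plain lists right away.
    return list(x[:k]), list(x[k:])


data = list(range(10, 20))

# Default path: no getobs call needed.
train, test = splitobs(data, at=0.7)

# Opt-in lazy path: wrap once, materialize at the end of the pipeline.
lazy_train, lazy_test = splitobs(ObsView(data, range(len(data))), at=0.7)
train_again = getobs(lazy_train)
```

In this sketch the default call hands back plain lists, while the wrapped call stays lazy until getobs is called, and both paths end up with the same observations.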
Makes a lot of sense to me. Maybe we should rename ObsView to LazyView to indicate that it is both a view (subset) of the observations and lazy.