Remove support for nullable components

emilk commented 3 months ago

At the moment we don’t ever make use of nullable components. For instance: we don’t allow setting the radius of only some points in a point-cloud to null.

Still, we have full support for it in our Loggable trait and in our codegen. We have a to_arrow_opt taking an iterator of optional values, and a from_arrow_opt that returns a Vec<Option<Self>>, yet these are never called.

This unused support has several downsides:

Extra complexity in the codegen
A lot of extra generated code (we generate both from_arrow_opt and from_arrow)
Runtime overhead (we codegen to_arrow_opt, but only ever call to_arrow which wraps each element in Some(…))

Removing support for it would definitely make the transition away from arrow2 easier. And we can always add it back in the future.

It seems silly to support a feature we may want in the future (YAGNI / ease into complexity).

jleibs commented 3 months ago

Just to clarify a bit further, this would make it so that the inner values of arrow batches are never null. However the outer list-arrays itself would still have nullable arrays if an optional value wasn't provided in a row.

Maybe this should be articulated as: "Remove support for individually nullable items within a batch".

Logically I'm totally onboard, though there may be lurking consequences when it comes to how some range-operations are processed. This is probably fresher in @teh-cmc 's mind.

It could be worth thinking some more about both this and https://github.com/rerun-io/rerun/issues/6602 together in terms of simplifying the things we support.

Also very peripherally related: I remember nullabillity (along with unions) were one of the annoying things that we ran up against when we were previously using polars. A simplification that made it easier to guarantee Rerun data frame results can be loaded into polars without issues would also be a huge win.

emilk commented 3 months ago

Just to clarify a bit further, this would make it so that the inner values of arrow batches are never null. However the outer list-arrays itself would still have nullable arrays if an optional value wasn't provided in a row.

Good clarification. Just to be extra clear:

We currently support: Option<Vec<Option<C>>>
The proposal is to only support Option<Vec<C>>

teh-cmc commented 3 months ago

There are two distinct issues here: one is the presence of field-level nullability, another is dealing with that nullability efficiently.

We do make use of instance-level nullability, just not at the top layer (because of limitations in the flatbuffers IDL which have thankfully prevented us to do so).

If you look at any datatype that has optional fields, like e.g. the Material datatype:

/// Material properties of a mesh, e.g. its color multiplier.
struct Material (
  "attr.rust.derive": "Copy, PartialEq, Eq, Hash"
) {
  /// Optional color multiplier.
  albedo_factor: rerun.datatypes.Rgba32 (nullable, order: 100);
}

this will need to go through the *_opt paths at runtime since albedo_factor will end up being an array of optional Rgba32s (Vec<Option<Rgba32>>). (Well in that specific case, it will take yet another nullability-aware specialized path because Rgba32 is a native type, but the point stands: there is nullability to account for.)

I would love to get rid of nullability entirely, but I don't think that's realistic at the moment: we need the extra expressiveness in many instances, and relying on default values would be a massive waste of space and compute.

There might be a path to ultimately get there by making careful use of component conversions and redesigning our schemas to rely on more, smaller components instead of nullable fields, but I'm not sure how practical that would be in practice.

E.g. TranslationAndMat3x3 is an obvious candidate for component conversions instead of nullable:

table TranslationAndMat3x3 (
  "attr.rust.derive": "Copy, PartialEq"
) {
  translation: rerun.datatypes.Vec3D (nullable, order: 100);
  mat3x3: Mat3x3 (nullable, order: 200);
  from_parent: bool = false (order: 300);
}

E.g. TensorDimension is one of these examples where you'd need to split name in its own TensorDimensionName type instead of using a nullable field:

table TensorDimension (
  "attr.rust.derive_only": "Clone, Default, Eq, PartialEq"
) {
  size: ulong (order: 100);
  name: string (order: 200, nullable);
}

Now when it comes to dealing with this efficiently and reducing implementation complexity, I maintain that the problem is that we still have deserialization code that checks instead of assuming:

https://github.com/rerun-io/rerun/issues/5305

emilk commented 3 months ago

After some discussion, we all agree that we want to move away from inner nullability of components and datatypes.

The road to there requires changes in a few places:

rerun-io / rerun

Remove support for nullable components #6819

6829

6830