spiraldb / vortex

An extensible, state-of-the-art columnar file format
https://vortex.dev
Apache License 2.0

LayoutWriter doesn't respect validity of the top level Struct array #710

Open robert3005 opened 1 month ago

robert3005 commented 1 month ago

Right now, if a StructArray has top-level validity other than AllValid or NonNullable, that validity is discarded on write.

doki23 commented 3 weeks ago

Could you please provide more details? I'm interested in this issue and would like to attempt to resolve it.

robert3005 commented 2 weeks ago

This is likely not a good first issue. The solution is likely either to forbid writing top-level structs with validity other than NonNullable and provide a function that pushes the struct's validity down to the children, i.e. runs a filter, or to teach the column layout about one extra validity array, which shouldn't be bad but is likely unnecessary.
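
To make the first option concrete, here is a minimal sketch of what "pushing validity to the children" could look like. It is one reading of the comment above, not Vortex code: `Column`, `StructColumn` and `push_validity_to_children` are hypothetical stand-ins, validity is modelled as a plain `Option<Vec<bool>>` where `None` means every row is valid, and all masks are assumed to have the same length as the values.

```rust
/// Hypothetical, simplified stand-in for a child array (not a Vortex type).
#[derive(Debug, Clone, PartialEq)]
struct Column {
    values: Vec<i64>,
    /// `None` means every row is valid.
    validity: Option<Vec<bool>>,
}

/// Hypothetical stand-in for a struct array with its own top-level validity.
#[derive(Debug, Clone, PartialEq)]
struct StructColumn {
    fields: Vec<(String, Column)>,
    /// `None` means every row is valid.
    validity: Option<Vec<bool>>,
}

/// AND the struct's validity into every child, then clear it on the struct,
/// so the result could be written without a top-level validity array.
fn push_validity_to_children(s: StructColumn) -> StructColumn {
    let StructColumn { fields, validity } = s;

    // Nothing to push down if the struct has no null rows.
    let parent = match validity {
        Some(mask) => mask,
        None => return StructColumn { fields, validity: None },
    };

    let fields = fields
        .into_iter()
        .map(|(name, col)| {
            // A row stays valid in the child only if it is valid in both
            // the child and the parent struct.
            let merged: Vec<bool> = match &col.validity {
                Some(child) => child.iter().zip(&parent).map(|(c, p)| *c && *p).collect(),
                None => parent.clone(),
            };
            (name, Column { values: col.values, validity: Some(merged) })
        })
        .collect();

    StructColumn { fields, validity: None }
}
```

Note that in this sketch a null struct row becomes indistinguishable from a row whose fields are all null; whether that loss is acceptable is exactly the trade-off against storing an extra validity array in the column layout.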