[python] Utilize Arrow schema `pa.field` nullabilities in `DataFrame.create`

Context

Split out from #2858.

This issue is for TileDB-SOMA Python only. The R situation will be triaged separately, and tasked out (from #2858) if necessary.

Tracking

In tiledbsoma.DataFrame.create, and likewise tiledbsoma.Experiment.add_new_dataframe, the user brings their own Arrow schema. Our task is to respect that as much as possible, and translate that into a TileDB core schema. One of the things to be mapped across is attribute-level nullability.

Unfortunately, there are two different ways for attributes to be marked nullable:

(1) Flags on the attribute
(2) Metadata for the attribute

Python example:

import pyarrow as pa

schema1 = pa.schema(
    [
        pa.field("a", pa.int32()),
        pa.field("b", pa.int32(), nullable=False),
        pa.field("c", pa.int32(), nullable=True),
    ]
)
print("SCHEMA1")
print(schema1)

schema2 = pa.schema(
    [
        pa.field("d", pa.int32()),
        pa.field("e", pa.int32(), nullable=False),
        pa.field("f", pa.int32(), nullable=True),
    ],
    metadata={"d": "nullable", "e": "nullable", "f": "nullable"}
)
print()
print("SCHEMA2")
print(schema2)

Output:

$ python arrow-schema-examples.py
SCHEMA1
a: int32
b: int32 not null
c: int32

SCHEMA2
d: int32
e: int32 not null
f: int32
-- schema metadata --
d: 'nullable'
e: 'nullable'
f: 'nullable'

Note that pa.field defaults to nullable: here, fields a and c are both nullable; only b is not. This is indicated by b: int32 not null.

Bug

In our current implementation we make attributes nullable only if the metadata option is set. The nullability flags on the Arrow fields themselves are not consulted.

Fix

Arrow fields are nullable by default, so we must make an attribute non-nullable only when the schema flags say so explicitly.
If there is no metadata, we simply respect the user's schema including its per-attribute nullability flags.
If there is set-nullable metadata for an attribute, it must override the nullability flags for that attribute.
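The rules above can be sketched as a small pure-Python function. This is an illustration of the intended precedence, not the TileDB-SOMA implementation. The helper name `resolve_nullabilities` is hypothetical. It assumes the metadata is keyed by the UTF-8 encoded attribute name with the value `b"nullable"`, as in the SCHEMA2 example. For a pyarrow schema, its inputs could be built from `schema.field(name).nullable` and `schema.metadata`.

```python
from typing import Dict, Mapping, Optional

# Assumption: the metadata value that marks an attribute as nullable,
# matching the SCHEMA2 example (pyarrow stores metadata as bytes).
NULLABLE_MARKER = b"nullable"


def resolve_nullabilities(
    field_flags: Mapping[str, bool],
    schema_metadata: Optional[Mapping[bytes, bytes]],
) -> Dict[str, bool]:
    """Hypothetical helper: decide nullability for each attribute.

    field_flags maps attribute name to the Arrow field's nullable flag.
    schema_metadata is the schema-level metadata, or None if absent.
    """
    metadata = schema_metadata or {}
    resolved = {}
    for name, flag in field_flags.items():
        if metadata.get(name.encode("utf-8")) == NULLABLE_MARKER:
            # Set-nullable metadata overrides the field flag.
            resolved[name] = True
        else:
            # No metadata for this attribute: respect the field flag.
            resolved[name] = flag
    return resolved


# Inputs mirroring SCHEMA1 (no metadata) and SCHEMA2 (all marked nullable).
schema1_result = resolve_nullabilities({"a": True, "b": False, "c": True}, None)
schema2_result = resolve_nullabilities(
    {"d": True, "e": False, "f": True},
    {b"d": b"nullable", b"e": b"nullable", b"f": b"nullable"},
)
print(schema1_result)  # {'a': True, 'b': False, 'c': True}
print(schema2_result)  # {'d': True, 'e': True, 'f': True}
```

Under these rules only `b` ends up non-nullable: it is explicitly flagged and has no overriding metadata, whereas `e` is flagged non-nullable but is overridden by its metadata entry.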

single-cell-data / TileDB-SOMA

[python] Utilize Arrow schema `pa.field` nullabilities in `DataFrame.create` #2869
