single-cell-data / TileDB-SOMA

Python and R SOMA APIs using TileDB’s cloud-native format. Ideal for single-cell data at any scale.
https://tiledbsoma.readthedocs.io
MIT License

[python] Core dump while testing 1.5.0rc0 #1704

Closed by bkmartinjr 1 year ago

bkmartinjr commented 1 year ago

This is likely related to #1703, as it triggers the same underlying misbehavior.

The following code reliably dumps core on exit, which suggests heap corruption. Beyond #1703, I have no evidence that the root cause is in TileDB-SOMA rather than, say, AnnData or some other package.

However, the crash occurs only when the bug from #1703 is triggered.

import tiledbsoma as soma

def main():
    ctx = soma.SOMATileDBContext(
        tiledb_config={
            "vfs.s3.region": "us-west-2",
            "soma.init_buffer_bytes": 4 * 1024**3,
        }
    )

    with soma.open(
        "s3://cellxgene-data-public/cell-census/2023-09-14/soma/", context=ctx
    ) as census:
        mouse_experiment: soma.Experiment = census["census_data"]["mus_musculus"]

        with mouse_experiment.axis_query(
            measurement_name="RNA",
            obs_query=soma.AxisQuery(
                coords=(slice(0, 100),),
                value_filter='tissue=="aorta"',
            ),
        ) as query:
            ad = query.to_anndata(X_name="raw")

    print(ad)

if __name__ == "__main__":
    main()

When I run it, I see:

$ python core.py
AnnData object with n_obs × n_vars = 0 × 52417
    obs: 'soma_joinid', 'dataset_id', 'assay', 'assay_ontology_term_id', 'cell_type', 'cell_type_ontology_term_id', 'development_stage', 'development_stage_ontology_term_id', 'disease', 'disease_ontology_term_id', 'donor_id', 'is_primary_data', 'self_reported_ethnicity', 'self_reported_ethnicity_ontology_term_id', 'sex', 'sex_ontology_term_id', 'suspension_type', 'tissue', 'tissue_ontology_term_id', 'tissue_general', 'tissue_general_ontology_term_id', 'raw_sum', 'nnz', 'raw_mean_nnz', 'raw_variance_nnz', 'n_measured_vars'
    var: 'soma_joinid', 'feature_id', 'feature_name', 'feature_length', 'nnz', 'n_measured_obs'
free(): invalid next size (fast)
Aborted (core dumped)
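
For anyone who wants to check this from a test harness rather than by eye: glibc calls abort() after printing the `free(): invalid next size` message, so the process dies with SIGABRT. That is why the shell reports "Aborted (core dumped)". The sketch below is not part of the original report. It is one possible way to detect that kind of exit from Python, and it assumes a POSIX system. The helper name `run_and_describe` is hypothetical, and the child process here is a stand-in that calls `os.abort()` rather than the actual reproduction script.

```python
import signal
import subprocess
import sys


def run_and_describe(argv):
    """Run a command and describe how it exited (POSIX semantics assumed)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        sig = signal.Signals(-proc.returncode)
        return f"killed by {sig.name}"
    return f"exited with status {proc.returncode}"


# Stand-in child (an assumption for illustration): os.abort() raises SIGABRT,
# the same signal glibc raises after a failed heap consistency check.
print(run_and_describe([sys.executable, "-c", "import os; os.abort()"]))
```

To apply it to this report, the stand-in command could be swapped for `[sys.executable, "core.py"]`. The captured stderr (`proc.stderr` inside the helper) would then hold the glibc message if it needs to be inspected as well.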

johnkerl commented 1 year ago

I've verified that this will be fixed by #1703.