Open Ngort opened 1 year ago
The problem is that AnnData cannot deal with None values in obs.
A minimal reprex is:
import anndata
import pandas as pd
import numpy as np
adata = anndata.AnnData(X=None, obs=pd.DataFrame().assign(test=np.array([1, 2, None, 3])))
adata.write_h5ad("test.h5ad")
In principle, AnnData supports nullable Integers and Booleans, but not Strings (see https://github.com/scverse/anndata/issues/679, https://github.com/scverse/anndata/issues/504). However, nullable here means a pandas BooleanArray or IntegerArray, not an object dtype with Nones.
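To make the distinction concrete, here is a small sketch (pandas and numpy only, no AnnData involved) contrasting the object-dtype column from the reprex with the same values stored as a nullable integer array. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Object-dtype column holding None, built the same way as in the reprex.
as_object = pd.Series(np.array([1, 2, None, 3]))

# The same values as a pandas nullable integer array (IntegerArray).
as_nullable = pd.Series(pd.array([1, 2, None, 3], dtype="Int64"))

print(as_object.dtype)    # generic Python objects, None stored as-is
print(as_nullable.dtype)  # nullable integer dtype, missing value masked
print(as_nullable.isna().tolist())
```

Only the second form is what "nullable" refers to in this context.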
As a workaround, the offending columns can be converted to a pandas array, e.g.
mdata['tcr'].obs["VJ_1_consensus_count"] = pd.array(mdata['tcr'].obs["VJ_1_consensus_count"].values)
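If many columns are affected, the same workaround could be applied in a loop. The following is only a sketch: `convert_none_columns` is a hypothetical helper, not part of scirpy or anndata, and it deliberately leaves string columns untouched because nullable strings are not covered by the support described above.

```python
import pandas as pd


def convert_none_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: re-infer object columns that contain missing values."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if col.dtype == object and col.isna().any():
            # Let pandas infer a nullable extension array from the raw values.
            inferred = pd.array(col.to_numpy())
            # Keep the result only for integer/boolean data; anything else
            # (e.g. strings) is left as it was.
            if pd.api.types.is_integer_dtype(inferred.dtype) or pd.api.types.is_bool_dtype(
                inferred.dtype
            ):
                out[name] = inferred
    return out


# Made-up example data: one integer-like column and one string column.
df = pd.DataFrame(
    {
        "n": pd.Series([1, 2, None, 3], dtype=object),
        "s": pd.Series(["a", None, "c", "d"], dtype=object),
    }
)
fixed = convert_none_columns(df)
print(fixed.dtypes)
```

This does not help with string columns containing None, which would still need separate handling.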
We obviously need a better solution than this. I'll check if this should be solved on the AnnData side, e.g. by an automatic conversion. Otherwise the scirpy.get.airr function could deal with that.
Some progress on the anndata side: https://github.com/scverse/anndata/pull/1558
Still need to check if this can be closed now.
Unfortunately not.
Describe the bug
Can't save h5mu from Scirpy-processed gex+bcr+tcr data if I copy airr into obs (i.e. tdata.obs = tdata.obs.join(ir.get.airr(tdata, tdata.obsm['airr'].fields))). Unlike in #427, I am on 0.13 and still suffer from the bug. (It does this with many other columns, including all _call and _cigar columns.)
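For narrowing down which columns are affected, a quick check along these lines may help. It is a sketch only: `find_none_columns` is a hypothetical helper, and the column names and values below are made up for illustration.

```python
import pandas as pd


def find_none_columns(obs: pd.DataFrame) -> list:
    """Hypothetical helper: names of object-dtype columns with missing entries."""
    return [
        name
        for name in obs.columns
        if obs[name].dtype == object and obs[name].isna().any()
    ]


# Made-up stand-in for an obs table after joining AIRR fields.
obs = pd.DataFrame(
    {
        "v_call": pd.Series(["TRBV1", None, "TRBV2"], dtype=object),
        "count": [1, 2, 3],
    }
)
offending = find_none_columns(obs)
print(offending)
```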
To Reproduce
What else I've tried
Changing columns to categoricals
Expected behaviour
Save the file without problems
System
OS: Linux
Python version: 3.9.16
Versions of libraries involved: Muon 0.1.5, Scirpy 0.13.0, Scanpy 1.9.3