Closed avilella closed 3 years ago
Hi @avilella,
If I understood correctly, all you need to do is create a dummy anndata object whose .obs index matches the cell_ids in your airr file. If Dandelion doesn't initialise because cell_ids or another required column is missing, you can just make it up, e.g. the cell_ids can be a unique barcode for each contig (like the sequence_id) if you are dealing with bulk-level data.
import dandelion as ddl
import pandas as pd
import scanpy as sc
import scipy.sparse
# assuming your filtered dandelion object is called vdj
obs = pd.DataFrame(index = vdj.metadata.index)
n = vdj.metadata.shape[0]
# just create a random sparse matrix; CSR is requested explicitly because the default is COO
adata = sc.AnnData(X = scipy.sparse.random(n, 100, format = "csr"), obs = obs)
# this is just to populate the neighbors slot
sc.pp.neighbors(adata)
# then transfer
ddl.tl.transfer(adata, vdj)
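For the bulk-level case mentioned above, where the airr file has no cell identifier column at all, the "make it up" step could look roughly like the sketch below. It is only an illustration using the standard library: the helper name `add_dummy_cell_id` and the toy rows are hypothetical, and the column name `cell_id` is assumed from the AIRR rearrangement schema, so adjust it if your file uses something else.

```python
import csv
import io

# Made-up stand-in for a bulk airr_rearrangement.tsv that lacks a cell_id column.
RAW_TSV = (
    "sequence_id\tlocus\tproductive\n"
    "contig_1\tIGH\tT\n"
    "contig_2\tIGK\tT\n"
    "contig_3\tIGH\tT\n"
)


def add_dummy_cell_id(tsv_text):
    """Hypothetical helper: give each contig its own cell_id, copied from sequence_id."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    fieldnames = list(rows[0].keys()) if rows else []
    # Only fill the column in when it is missing, so real cell_ids are never overwritten.
    if "cell_id" not in fieldnames:
        fieldnames.append("cell_id")
        for row in rows:
            row["cell_id"] = row["sequence_id"]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


patched = add_dummy_cell_id(RAW_TSV)
print(patched)
```

The patched table can then be written back to disk and loaded as usual, after which the dummy anndata approach above applies unchanged.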
Would it be possible to run the clustering and 3D network plotting from the airr_rearrangement.tsv file alone, without supplying a filtered h5 file?
If one only has data for the VDJ library and not the 5'GEX, this filtered h5 file wouldn't exist, yet the airr_rearrangement.tsv file for the VDJ data is available. See below the steps that currently work for both: