TileDB-Inc / tiledbsc

Single-cell data structures in TileDB
https://tiledb-inc.github.io/tiledbsc/

`Error in SummarizedExperiment::SummarizedExperiment(assays = assay_mats, : the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the SummarizedExperiment object (or derivative) to construct` #96

Open multimeric opened 1 year ago

multimeric commented 1 year ago

I have a SOMA object that I created from AnnData. It seems (?) to be valid:

> soma$obs$to_dataframe() |> rownames() |> head()
Reading AnnotationDataframe into memory from 'file:///vast/scratch/users/milton.m/soma/obs'
[1] "exp1-human-102.TAATCAGCTTTCGGCCTTAC" "exp1-human-103.CGATTCGCTCGGCGTAACT"
[3] "exp1-human-105.TGTCCTTATTACCTCTATCT" "exp1-human-107.ATCAGTCATTGCTAACTTGC"
[5] "exp1-human-110.TTCTGGCCTGGCTCTCTAT"  "exp1-human-113.GTAGCGATTTGAGTCTGGC"
> soma$var$to_dataframe() |> rownames() |> head()
Reading AnnotationDataframe into memory from 'file:///vast/scratch/users/milton.m/soma/var'
[1] "5S_rRNA_ENSG00000276861" "5S_rRNA_ENSG00000277411"
[3] "5S_rRNA_ENSG00000277488" "5S_rRNA_ENSG00000285609"
[5] "5S_rRNA_ENSG00000285626" "5S_rRNA_ENSG00000285674"
> soma$X$get_member("data")$to_matrix() |> dim()
Reading AssayMatrix into memory from 'file:///vast/scratch/users/milton.m/soma/X/data'
[1]  1535 17832
> soma$X$get_member("data")$to_matrix() |> rownames() |> head()
Reading AssayMatrix into memory from 'file:///vast/scratch/users/milton.m/soma/X/data'
[1] "exp1-human-102.TAATCAGCTTTCGGCCTTAC" "exp1-human-103.CGATTCGCTCGGCGTAACT"
[3] "exp1-human-105.TGTCCTTATTACCTCTATCT" "exp1-human-107.ATCAGTCATTGCTAACTTGC"
[5] "exp1-human-110.TTCTGGCCTGGCTCTCTAT"  "exp1-human-113.GTAGCGATTTGAGTCTGGC"
> soma$X$get_member("data")$to_matrix() |> colnames() |> head()
Reading AssayMatrix into memory from 'file:///vast/scratch/users/milton.m/soma/X/data'
[1] "ABHD2"  "ABL1"   "ADAM10" "ADAM12" "ADAM19" "ADAM9"

However, converting to SingleCellExperiment (or SummarizedExperiment) fails:

> sce = soma$to_single_cell_experiment()
Reading AssayMatrix into memory from 'file:///vast/scratch/users/milton.m/soma/X/data'
Reading AnnotationDataframe into memory from 'file:///vast/scratch/users/milton.m/soma/obs'
Reading AnnotationDataframe into memory from 'file:///vast/scratch/users/milton.m/soma/var'
Error in SummarizedExperiment::SummarizedExperiment(assays = assay_mats,  :
  the rownames and colnames of the supplied assay(s) must be NULL or
  identical to those of the SummarizedExperiment object (or derivative)
  to construct
> se = soma$to_summarized_experiment()
Reading AssayMatrix into memory from 'file:///vast/scratch/users/milton.m/soma/X/data'
Reading AnnotationDataframe into memory from 'file:///vast/scratch/users/milton.m/soma/obs'
Reading AnnotationDataframe into memory from 'file:///vast/scratch/users/milton.m/soma/var'
Error in SummarizedExperiment::SummarizedExperiment(assays = assay_mats,  :
  the rownames and colnames of the supplied assay(s) must be NULL or
  identical to those of the SummarizedExperiment object (or derivative)
  to construct

Any ideas what is happening here? My feeling is there is just a small bug in the to_summarized_experiment() method.

aaronwolen commented 1 year ago

I believe this error indicates that the order of either the sample or feature metadata table doesn't align with the assay matrix. Does your dataset contain multiple assays (i.e., is there more than 1 array within /vast/scratch/users/milton.m/soma/X)?

Would it be possible to share the data you used so we can attempt to reproduce?

multimeric commented 1 year ago

It contains only one assay, but conceivably the order of the labels differs between the two sources. If that is the case, then the Python anndata -> SOMA conversion must be messing with the order somehow. What command should I run to diagnose this? I think sending the data would be difficult at this point.
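
One way to narrow this down is to compare the label vectors directly and see whether they differ in content or only in order. The following is a minimal sketch in base R, not part of tiledbsc: `compare_labels` is a hypothetical helper, and the toy vectors stand in for the real objects. Note that in the output pasted above, the first `var` rownames look like `5S_rRNA_ENSG00000276861` while the first matrix colnames look like `ABHD2`, so the feature labels in particular seem worth checking; this may be more than an ordering difference, but only the comparison can tell.

```r
# Hypothetical helper (not part of tiledbsc): compare two label vectors and
# report whether they differ in content or only in order.
compare_labels <- function(expected, observed) {
  list(
    identical     = identical(expected, observed),   # same labels, same order
    same_set      = setequal(expected, observed),    # same labels, any order
    only_expected = setdiff(expected, observed),     # labels missing from observed
    only_observed = setdiff(observed, expected),     # labels missing from expected
    n_duplicated  = sum(duplicated(expected)) + sum(duplicated(observed))
  )
}

# Toy data standing in for the real objects (assumption: cells in rows,
# features in columns, as in the dim() output shown earlier).
obs_names <- c("cell1", "cell2", "cell3")
var_names <- c("geneA", "geneB")
mat <- matrix(
  0, nrow = 3, ncol = 2,
  dimnames = list(c("cell1", "cell3", "cell2"), c("geneA", "geneB"))
)

cells <- compare_labels(obs_names, rownames(mat))
genes <- compare_labels(var_names, colnames(mat))
str(cells)
str(genes)

# With the real data, the inputs would come from the calls already used above:
# obs_names <- rownames(soma$obs$to_dataframe())
# var_names <- rownames(soma$var$to_dataframe())
# mat       <- soma$X$get_member("data")$to_matrix()
```

If `same_set` is `TRUE` but `identical` is `FALSE`, the labels only need reordering; non-empty `only_expected` or `only_observed` entries would instead mean the metadata and the matrix do not describe the same cells or features.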