theislab / zellkonverter

Conversion between scRNA-seq objects
https://theislab.github.io/zellkonverter/

Error in if (ncol(m)) { : argument is of length zero #113

Closed spatts14 closed 1 month ago

spatts14 commented 5 months ago

Hi! I commented on an enhancement in issues as I originally thought it might be an issue of my spatial experiment, however I'm wondering if this is a separate issue.

I tried to convert the spe into an sce object, which worked. However, when I then tried to convert it into an AnnData object, I got the error below. This also happened when I tried to go directly from spe --> AnnData object. I tried both writeH5AD and SCE2AnnData, and got the same error with both.

It doesn't seem like anything is wrong with my spe object, as I can visualize a variety of outputs in R. Apologies in advance if this is a silly question - I am still very new to CS/DS/bioinformatics etc.

# Import data
spe <- readRDS("path/spe_backup.rds")

# Look at spe (SpatialExperiment)
class(spe)
str(spe)

# Convert spe to sce
sce <- as(spe, "SingleCellExperiment")

# Approach 1: write the sce to an .h5ad file
writeH5AD(sce, file = out_path)

# Approach 2: convert the sce to an AnnData object
adata <- SCE2AnnData(sce)

Both approaches gave the following error:

ℹ Using the 'counts' assay as the X matrix
Error in if (ncol(m)) { : argument is of length zero

I also tried looking in the source code of both functions to try to see where things were going wrong but I could not figure it out.

Thanks for your help!

lazappi commented 5 months ago

Hi @spatts14

That does seem like something that should be checked. Can you confirm which dataset you are trying here? I wasn't able to work it out from the links you sent in #61. Even better would be if you could share the .rds file you have here if it isn't too big.

Could you also please check what is returned by counts(sce)? It's hard to say without access to the data, but my guess is that it is empty for some reason.
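As a rough illustration of the check requested above, the sketch below inspects the counts assay of a small made-up object. The toy `sce` is purely an assumption for demonstration; in the real case `sce` would be the object produced by as(spe, "SingleCellExperiment").

```r
library(SingleCellExperiment)

# Toy stand-in for the real object (assumption): 2 genes x 3 cells
sce <- SingleCellExperiment(assays = list(counts = matrix(1:6, nrow = 2)))

# Which assays are present?
assayNames(sce)

# Dimensions of the counts matrix: both should be non-zero
counts_dim <- dim(counts(sce))
counts_dim

# Storage class of the counts matrix (dense matrix, sparse matrix, ...)
class(counts(sce))
```

If either dimension is zero, or `counts(sce)` itself fails, the problem lies in the assay rather than in the conversion step.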

spatts14 commented 5 months ago

Hi @lazappi,

I was trying a dataset in the group, but also tried the dataset that I had shared in #61. I've checked both the dataset from our group as well as the Bodenmiller dataset. Both are non-empty.

I have also tried to add the .RDS files, but it says it does not support the file type. I am happy to send them via email or some other method that suits you best.

sce <- as(spe, "SingleCellExperiment")

lazappi commented 5 months ago

If you could put them on some kind of file sharing system (Google Drive, OneDrive, Dropbox etc.) and put a link here that would be great.

Or if you could share the minimal code required to go from the raw Bodenmiller data to the object you are testing that would also be helpful (as long as it's not too complex).

spatts14 commented 5 months ago

Hi @lazappi

Please let me know if you can't access it: https://drive.google.com/drive/folders/1CuXBWQUPI9hvPxOcl2gmBgHwWRiWo01y?usp=drive_link

lazappi commented 5 months ago

I wasn't able to access it currently. I sent an access request but it might be easier to make it public if there is nothing confidential in the data.

spatts14 commented 5 months ago

Apologies, I thought it was public. It should be public now - please let me know if there are any issues.

lazappi commented 5 months ago

Thanks for the data! That was super helpful for working out what the issue was.

I have tracked it down to the colPairs() function in the {SingleCellExperiment} package and have opened an issue there https://github.com/drisso/SingleCellExperiment/issues/73. Depending on how they decide to handle it I may make changes to {zellkonverter}.

In the meantime you can avoid the error by doing something like:

mcols(colPairs(sce)[[1]])$value <- 1

NOTE: This will give you an obsp graph with all the edge weights set to 1 in the .h5ad file.
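To see which stored graphs carry metadata columns such as `value`, something along these lines may help. It is only a sketch on a made-up object: the toy `sce`, the graph name "neighbours" and the `pair_columns` variable are all assumptions for illustration, and the graph here is created with a `value` column already attached, mirroring what the workaround produces.

```r
library(SingleCellExperiment)
library(S4Vectors)

# Toy object (assumption): 2 genes x 4 cells
sce <- SingleCellExperiment(assays = list(counts = matrix(1:8, nrow = 2)))

# A small cell-cell graph with an explicit `value` column (edge weights of 1)
hits <- SelfHits(
    from = c(1L, 2L, 3L),
    to = c(2L, 3L, 4L),
    nnode = ncol(sce),
    value = rep(1, 3)
)
colPair(sce, "neighbours") <- hits

# For each stored graph, list the names of its metadata columns
pair_columns <- lapply(
    colPairNames(sce),
    function(nm) colnames(mcols(colPair(sce, nm)))
)
names(pair_columns) <- colPairNames(sce)
pair_columns
```

A graph that shows no columns here is a candidate for the workaround above.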

spatts14 commented 5 months ago

Thanks so much! That worked for us!

Thank you for making the note about the change in edge weights being set to 1.

spatts14 commented 1 month ago

Hi,

We tried again with another dataset we recently generated and hit a similar, possibly the same, issue. We also re-ran it with the original test dataset and got a different error. We followed the linked SingleCellExperiment issue and updated the package to the new release, but we are still having problems. Could you please advise?

The new dataset (the one we actually care about converting)

> writeH5AD(sce, file = "/path/speM7.rds")
ℹ Using the 'counts' assay as the X matrix
Error in if (ncol(m)) { : argument is of length zero
In addition: Warning message:
The following colData columns are not atomic and will be stored in metadata(sce)$.colData before conversion: "ROI"
and "aggregatedNeighbors" 

Original dataset issue

> writeH5AD(sce, file = "/Users/path/spe_test.h5ad")
ℹ Using the 'counts' assay as the X matrix
Error in py_call_impl(callable, call_args$unnamed, call_args$named) :
TypeError: Can't implicitly convert non-string objects to strings

Above error raised while writing key 'layers' of <class 'h5py._hl.group.Group'> to /
Run `reticulate::py_last_error()` for details.

Here is the code I am using for both:

### Load libraries

library(zellkonverter)
library(SpatialExperiment)
library(SingleCellExperiment)

# Import data
spe <-readRDS( "/path/speM7.rds")

# look at spe
class(spe) # SpatialExperiment
str(spe)

# Convert spe to sce
sce <- as(spe, "SingleCellExperiment")

# Workaround - ** we have tried with and without this workaround **
mcols(colPairs(sce)[[1]])$value <- 1

# Write the sce to file as AnnData
writeH5AD(sce, file = "/path/speM7.rds")

lazappi commented 1 month ago

Hi @spatts14

I think there could be two issues here but let's look at the original one first as I thought I had solved that already. It looks like the data is gone from your Google Drive, could you please reshare it? A small subset would be fine as long as it causes the same error.

Could you also please share the package versions you are using (from sessionInfo() or similar)?

spatts14 commented 1 month ago

Hi @lazappi

Thanks for getting back to us!

Here is the sessionInfo:

> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS 14.4.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SpatialExperiment_1.10.0    SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.2
 [4] Biobase_2.60.0              GenomicRanges_1.52.1        GenomeInfoDb_1.36.4        
 [7] IRanges_2.34.1              S4Vectors_0.38.2            BiocGenerics_0.46.0        
[10] MatrixGenerics_1.12.3       matrixStats_1.3.0           zellkonverter_1.10.1       

loaded via a namespace (and not attached):
 [1] dqrng_0.4.1               scuttle_1.10.3            bitops_1.0-7              lattice_0.22-6           
 [5] magrittr_2.0.3            sparseMatrixStats_1.12.2  grid_4.3.0                R.oo_1.26.0              
 [9] jsonlite_1.8.8            Matrix_1.6-5              R.utils_2.12.3            limma_3.56.2             
[13] BiocManager_1.30.23       HDF5Array_1.28.1          codetools_0.2-20          abind_1.4-5              
[17] cli_3.6.3                 rlang_1.1.4               crayon_1.5.3              XVector_0.40.0           
[21] basilisk.utils_1.12.1     R.methodsS3_1.8.2         DelayedArray_0.26.7       S4Arrays_1.0.6           
[25] tools_4.3.0               beachmat_2.16.0           dir.expiry_1.8.0          parallel_4.3.0           
[29] BiocParallel_1.34.2       Rhdf5lib_1.22.1           DropletUtils_1.20.0       locfit_1.5-9.10          
[33] GenomeInfoDbData_1.2.10   filelock_1.0.3            basilisk_1.12.1           reticulate_1.37.0        
[37] png_0.1-8                 magick_2.8.3              rhdf5_2.44.0              zlibbioc_1.46.0          
[41] edgeR_3.42.4              Rcpp_1.0.12               rstudioapi_0.16.0         rhdf5filters_1.12.1      
[45] rjson_0.2.21              DelayedMatrixStats_1.22.6 compiler_4.3.0            RCurl_1.98-1.16          
> 

I have reshared the data on Google Drive from a different email address. Here is the link: https://drive.google.com/drive/folders/1ZrOViCQIVZFL8mIr4SbJ_OsvMZamlG_I?usp=drive_link

It does work on this publicly available test data, but when we try it on our data we hit the error listed above.

spatts14 commented 1 month ago

Solved the problem. You can only have 1 item in colPairs, and we had 4 items in there because we ran several different interaction graph algorithms.
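For anyone hitting the same thing, a minimal sketch of reducing colPairs to a single graph before conversion might look like the following. The toy object and the graph names ("knn_graph", "expansion_graph") are made up for illustration; only the subsetting step reflects the fix described above.

```r
library(SingleCellExperiment)
library(S4Vectors)

# Toy object (assumption): 2 genes x 4 cells
sce <- SingleCellExperiment(assays = list(counts = matrix(1:8, nrow = 2)))

# Two made-up interaction graphs, standing in for the several we had
g1 <- SelfHits(from = c(1L, 2L), to = c(2L, 3L), nnode = ncol(sce), value = c(1, 1))
g2 <- SelfHits(from = c(3L, 4L), to = c(4L, 1L), nnode = ncol(sce), value = c(1, 1))
colPair(sce, "knn_graph") <- g1
colPair(sce, "expansion_graph") <- g2

# Check how many graphs are stored
colPairNames(sce)

# Keep only the graph to export; the original object is left untouched
keep <- "knn_graph"
sce_single <- sce
colPairs(sce_single) <- colPairs(sce_single)[keep]
colPairNames(sce_single)

# sce_single can then be passed to writeH5AD() or SCE2AnnData() as before
```

Working on a copy means the other graphs remain available in `sce` if they need to be exported separately.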