kieranrcampbell / ouija

Descriptive probabilistic marker gene approach to single-cell pseudotime inference
http://kieranrcampbell.github.io/ouija
28 stars 3 forks source link

Can't load sce data #11

Open JihedC opened 5 years ago

JihedC commented 5 years ago

Hi, I am new to Ouija but quite excite about using it. I have got my Seurat object transformed into a singleCellExperiment object. Unfortunately Ouija doesn't seem to see like this. I get this :

oui <- ouija(sample1.sce)
Error in t.default(x@assays[[single_cell_experiment_assay]]) : 
  argument is not a matrix

Am I doing something wrong?

kieranrcampbell commented 4 years ago

Hi Jihed

Did you get this working? I’m currently trying to debug but there are issues using Stan on MacOS Catalina that I’m looking into

On Fri, Nov 29, 2019 at 09:36 Jihed Chouaref notifications@github.com wrote:

Closed #11 https://github.com/kieranrcampbell/ouija/issues/11.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/kieranrcampbell/ouija/issues/11?email_source=notifications&email_token=AAPR5QIVNDMKRL7L3GB2NP3QWDPAZA5CNFSM4JR2WIB2YY3PNVWWK3TUL52HS4DFWZEXG43VMVCXMZLOORHG65DJMZUWGYLUNFXW5KTDN5WW2ZLOORPWSZGOVFJ7MYQ#event-2840852066, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAPR5QPLLJ6KDWK65OUFLMLQWDPAZANCNFSM4JR2WIBQ .

JihedC commented 4 years ago

Hi Kieran! Thanks for your reply! No I didn't manage to load the singlecellexperiment I created via Seurat. I tried to work around it by importing a matrix from my Seurat object.

expression <- as.matrix(sample@assays$RNA@data)
test <- expression[1:10,1:100]
transposed<-t(test)
test <- transposed +1
normalized_test <- log(test)

oui <- ouija(test)
plot_expression(oui)

But I have no clue if this is okay to do so. I have got this plot:

Capture d’écran 2019-11-29 à 11 53 00
kieranrcampbell commented 4 years ago

Hmm looks like the sampler hasn't run at all. Let me debug over the next few days and see where I get. I suspect various dependencies have been updated that have broken the main package.

JihedC commented 4 years ago

ok thanks! I have also tried to run your example with Singlecellexperiment:

library(ouija)
data(synth_gex)
single_cell_set <- SingleCellExperiment(assays = list(exprs = t(example_gex)))
ouija(single_cell_set)

Which also gives the same output :

Error in t.default(x@assays[[single_cell_experiment_assay]]) : 
  argument is not a matrix
kieranrcampbell commented 4 years ago

Ok that error should now be fixed in c77e652b625cc4f385fc866c01b93cd9a8a4ca65 -- can you reinstall from github and try again? Thanks

JihedC commented 4 years ago

Yep I'll have a look

JihedC commented 4 years ago

Ok that error should now be fixed in c77e652 -- can you reinstall from github and try again? Thanks

Nope I still get the same error message and the command :

data(synth_gex)

doesn't work anymore :(

kieranrcampbell commented 4 years ago

Okay at some point synth_gex became example_gex -- can you try that? I've updated doc in latest commit

JihedC commented 4 years ago

This is what I've got. Sorry it's in French, it means that the argument is not a matrix :(

> data(example_gex)
> single_cell_set <- SingleCellExperiment(assays = list(exprs = t(example_gex)))
> ouija(single_cell_set)
Error in t.default(x@assays[[single_cell_experiment_assay]]) : 
  l'argument n'est pas une matrice
> 
kieranrcampbell commented 4 years ago

Ah the assay should be "logcounts" or the single_cell_experiment_assay changed to "logcounts" -- I've updated the vignette to reflect this, my fault for not changing sooner. Also it looks like you haven't reinstalled from github since I just changed the code to remove the line x@assays... -- could you reinstall using devtools::install_github and try again? Thanks

JihedC commented 4 years ago

I reinstalled earlier using this

devtools::install_github("kieranrcampbell/ouija")

I'll try again

JihedC commented 4 years ago

Ah the assay should be "logcounts" or the single_cell_experiment_assay changed to "logcounts" -- I've updated the vignette to reflect this, my fault for not changing sooner. Also it looks like you haven't reinstalled from github since I just changed the code to remove the line x@assays... -- could you reinstall using devtools::install_github and try again? Thanks

I am afraid I am not understanding what you mean. Which command should I use to load the ice assay?

By the way these are my session info (concerning the installation):

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS  10.14.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /anaconda3/envs/seurat/lib/R/lib/libRlapack.dylib

locale:
[1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ouija_0.99.0                sctransform_0.2.0           rstan_2.19.2                ggplot2_3.2.1              
 [5] StanHeaders_2.19.0          Rcpp_1.0.3                  Seurat_3.1.1                knitr_1.26                 
 [9] SingleCellExperiment_1.4.1  SummarizedExperiment_1.12.0 DelayedArray_0.8.0          BiocParallel_1.16.6        
[13] matrixStats_0.55.0          Biobase_2.42.0              GenomicRanges_1.34.0        GenomeInfoDb_1.18.2        
[17] IRanges_2.16.0              S4Vectors_0.20.1            BiocGenerics_0.28.0   
kieranrcampbell commented 4 years ago
single_cell_set <- SingleCellExperiment(assays = list(logcounts = t(example_gex)))
ouija(single_cell_set)

should work ?

I've also bumped the version number so if you get the same error as before (Error in t.default(x@assays[[single_cell_experiment_assay]])) can you reinstall and ensure version number is 0.99.1 ? Thanks

JihedC commented 4 years ago

I am sorry for bringing bad news to you today :(

R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS  10.14.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /anaconda3/envs/seurat/lib/R/lib/libRlapack.dylib

locale:
[1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ouija_0.99.1                sctransform_0.2.0           rstan_2.19.2                ggplot2_3.2.1              
 [5] StanHeaders_2.19.0          Rcpp_1.0.3                  Seurat_3.1.1                knitr_1.26                 
 [9] SingleCellExperiment_1.4.1  SummarizedExperiment_1.12.0 DelayedArray_0.8.0          BiocParallel_1.16.6        
[13] matrixStats_0.55.0          Biobase_2.42.0              GenomicRanges_1.34.0        GenomeInfoDb_1.18.2        
[17] IRanges_2.16.0              S4Vectors_0.20.1            BiocGenerics_0.28.0        

But there is another error now :(

> ouija(single_cell_set)
Erreur : lazy-load database '/anaconda3/envs/seurat/lib/R/library/ouija/R/ouija.rdb' is corrupt
De plus : Warning message:
internal error -3 in R_decompress1 
kieranrcampbell commented 4 years ago

No problem! Can you restart R (or terminate session in Rstudio) and try again? That usually resolves this error.

JihedC commented 4 years ago

Great it works!

I have one more question, if you can help me, it would be great!

I have managed to reproduce your example using the data(example_gex) but I still get the error when I load my own SCE object I have got from Seurat with as.SingleCellExperiment. If I load my rds file in R and then run ouija I still get the error :

Error in t.default(assay(x, single_cell_experiment_assay)) : 
  l'argument n'est pas une matrice 

This is what I tried with my own SCE object :

test <- readRDS(file = "SCE_thymo_cells_cycle_regressed.rds")
test_oui <- ouija(test)
Error in t.default(assay(x, single_cell_experiment_assay)) : 
  l'argument n'est pas une matrice
> str(test)
Formal class 'SingleCellExperiment' [package "SingleCellExperiment"] with 10 slots
  ..@ int_elementMetadata:Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
  .. .. ..@ rownames       : NULL
  .. .. ..@ nrows          : int 17535
  .. .. ..@ listData       : Named list()
  .. .. ..@ elementType    : chr "ANY"
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ int_colData        :Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
  .. .. ..@ rownames       : NULL
  .. .. ..@ nrows          : int 8981
  .. .. ..@ listData       : Named list()
  .. .. ..@ elementType    : chr "ANY"
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ int_metadata       :List of 3
  .. ..$ version          :Classes 'package_version', 'numeric_version'  hidden list of 1
  .. .. ..$ : int [1:3] 1 4 1
  .. ..$ spike_names      : chr(0) 
  .. ..$ size_factor_names: chr(0) 
  ..@ reducedDims        :Formal class 'SimpleList' [package "S4Vectors"] with 4 slots
  .. .. ..@ listData       :List of 3
  .. .. .. ..$ PCA : num [1:8981, 1:50] -11.04 -3.48 -10.32 -10.87 -4.52 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:8981] "AAAAATTCTTAA" "GTAATTTAACGA" "TACAAGTTGATG" "TCAGCAATGACN" ...
  .. .. .. .. .. ..$ : chr [1:50] "PC_1" "PC_2" "PC_3" "PC_4" ...
  .. .. .. ..$ UMAP: num [1:8981, 1:2] -1.338 -2.712 2.348 -1.547 -0.965 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:8981] "AAAAATTCTTAA" "GTAATTTAACGA" "TACAAGTTGATG" "TCAGCAATGACN" ...
  .. .. .. .. .. ..$ : chr [1:2] "UMAP_1" "UMAP_2"
  .. .. .. ..$ TSNE: num [1:8981, 1:2] 29.4 28.8 33.8 29.7 24.1 ...
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:8981] "AAAAATTCTTAA" "GTAATTTAACGA" "TACAAGTTGATG" "TCAGCAATGACN" ...
  .. .. .. .. .. ..$ : chr [1:2] "tSNE_1" "tSNE_2"
  .. .. ..@ elementType    : chr "ANY"
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ rowRanges          :Formal class 'CompressedGRangesList' [package "GenomicRanges"] with 5 slots
  .. .. ..@ unlistData     :Formal class 'GRanges' [package "GenomicRanges"] with 7 slots
  .. .. .. .. ..@ seqnames       :Formal class 'Rle' [package "S4Vectors"] with 4 slots
  .. .. .. .. .. .. ..@ values         : Factor w/ 0 levels: 
  .. .. .. .. .. .. ..@ lengths        : int(0) 
  .. .. .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. .. .. ..@ metadata       : list()
  .. .. .. .. ..@ ranges         :Formal class 'IRanges' [package "IRanges"] with 6 slots
  .. .. .. .. .. .. ..@ start          : int(0) 
  .. .. .. .. .. .. ..@ width          : int(0) 
  .. .. .. .. .. .. ..@ NAMES          : NULL
  .. .. .. .. .. .. ..@ elementType    : chr "ANY"
  .. .. .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. .. .. ..@ metadata       : list()
  .. .. .. .. ..@ strand         :Formal class 'Rle' [package "S4Vectors"] with 4 slots
  .. .. .. .. .. .. ..@ values         : Factor w/ 3 levels "+","-","*": 
  .. .. .. .. .. .. ..@ lengths        : int(0) 
  .. .. .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. .. .. ..@ metadata       : list()
  .. .. .. .. ..@ seqinfo        :Formal class 'Seqinfo' [package "GenomeInfoDb"] with 4 slots
  .. .. .. .. .. .. ..@ seqnames   : chr(0) 
  .. .. .. .. .. .. ..@ seqlengths : int(0) 
  .. .. .. .. .. .. ..@ is_circular: logi(0) 
  .. .. .. .. .. .. ..@ genome     : chr(0) 
  .. .. .. .. ..@ elementMetadata:Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
  .. .. .. .. .. .. ..@ rownames       : NULL
  .. .. .. .. .. .. ..@ nrows          : int 0
  .. .. .. .. .. .. ..@ listData       : Named list()
  .. .. .. .. .. .. ..@ elementType    : chr "ANY"
  .. .. .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. .. .. ..@ metadata       : list()
  .. .. .. .. ..@ elementType    : chr "ANY"
  .. .. .. .. ..@ metadata       : list()
  .. .. ..@ elementMetadata:Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
  .. .. .. .. ..@ rownames       : NULL
  .. .. .. .. ..@ nrows          : int 17535
  .. .. .. .. ..@ listData       :List of 5
  .. .. .. .. .. ..$ vst.mean                 : num [1:17535] 0.477 0.259 0.02 0.124 0.286 ...
  .. .. .. .. .. ..$ vst.variance             : num [1:17535] 0.6827 0.328 0.0224 0.1403 0.4018 ...
  .. .. .. .. .. ..$ vst.variance.expected    : num [1:17535] 0.697 0.3385 0.0242 0.1497 0.3788 ...
  .. .. .. .. .. ..$ vst.variance.standardized: num [1:17535] 0.979 0.969 0.924 0.937 1.061 ...
  .. .. .. .. .. ..$ vst.variable             : logi [1:17535] FALSE FALSE FALSE FALSE FALSE FALSE ...
  .. .. .. .. ..@ elementType    : chr "ANY"
  .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. ..@ metadata       : list()
  .. .. ..@ elementType    : chr "GRanges"
  .. .. ..@ metadata       : list()
  .. .. ..@ partitioning   :Formal class 'PartitioningByEnd' [package "IRanges"] with 5 slots
  .. .. .. .. ..@ end            : int [1:17535] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. .. .. ..@ NAMES          : chr [1:17535] "0610007P14Rik" "0610009B22Rik" "0610009L18Rik" "0610009O20Rik" ...
  .. .. .. .. ..@ elementType    : chr "ANY"
  .. .. .. .. ..@ elementMetadata: NULL
  .. .. .. .. ..@ metadata       : list()
  ..@ colData            :Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
  .. .. ..@ rownames       : chr [1:8981] "AAAAATTCTTAA" "GTAATTTAACGA" "TACAAGTTGATG" "TCAGCAATGACN" ...
  .. .. ..@ nrows          : int 8981
  .. .. ..@ listData       :List of 14
  .. .. .. ..$ orig.ident      : Factor w/ 1 level "full_atlas": 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..$ nCount_RNA      : num [1:8981] 3825 1861 3117 9331 2494 ...
  .. .. .. ..$ nFeature_RNA    : int [1:8981] 1946 1177 1331 3505 1524 2830 2844 3404 1953 1447 ...
  .. .. .. ..$ percent.mt      : num [1:8981] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. .. ..$ RNA_snn_res.0.5 : Factor w/ 8 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..$ seurat_clusters : Factor w/ 8 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..$ percent.rp      : num [1:8981] 9.961 12.52 0.738 8.606 8.621 ...
  .. .. .. ..$ RNA_snn_res.0.25: Factor w/ 7 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..$ S.Score         : num [1:8981] -0.01811 -0.11624 0.00729 0.08427 -0.04092 ...
  .. .. .. ..$ G2M.Score       : num [1:8981] 0.18012 -0.000609 0.048107 0.054149 -0.026457 ...
  .. .. .. ..$ Phase           : Factor w/ 3 levels "G1","G2M","S": 2 1 2 3 1 3 2 1 3 2 ...
  .. .. .. ..$ old.ident       : Factor w/ 7 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..$ label           : Factor w/ 7 levels "Tconv1","Tconv3",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..$ ident           : Factor w/ 5 levels "Tconv1","Tconv3",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..@ elementType    : chr "ANY"
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ assays             :Reference class 'ShallowSimpleListAssays' [package "SummarizedExperiment"] with 1 field
  .. ..$ data: NULL
  .. ..and 14 methods.
  ..@ NAMES              : NULL
  ..@ elementMetadata    :Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
  .. .. ..@ rownames       : NULL
  .. .. ..@ nrows          : int 17535
  .. .. ..@ listData       : Named list()
  .. .. ..@ elementType    : chr "ANY"
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ metadata           : list()

Thanks in advance!

kieranrcampbell commented 4 years ago

Aha! So if you print assays(test) what does it show? Ouija looks for an assay called logcounts (though you can change this with the single_cell_experiment_assay argument to the ouija function), so by default the SingleCellExperiment you pass must have an assay called logcounts

JihedC commented 4 years ago

Sorry for not answering earlier. assay(test) returns this :

Capture d’écran 2019-12-02 à 09 49 16 I am not so familiar with SCE object but this looks like a count matrix. Is there a problem with the conversion with Seurat ? I don't see any log counts in this data frame (I am just may be missing it )

kieranrcampbell commented 4 years ago

Aha! So just double check that names(assays(test)) returns counts only?

If so, you should be able to call

library(scran)
library(scater)
test <- computeSumFactors(test) # this computes size factors for each cell
test <- normalize(test) # compute log normalized counts

and double check that name(assays(test)) now has a logcounts

JihedC commented 4 years ago

Hi Kieran! Thanks for such a quick reply!

names(assays(test))
[1] "counts"    "logcounts"

The SCE contained already the log counts apparently.

I did run the command you suggested anyway and tried to use ouija like this:

library(ouija)
test_oui <- ouija(test, single_cell_experiment_assay = "logcounts")

Error in t.default(assay(x, single_cell_experiment_assay)) : l'argument n'est pas une matrice

But it still tells me that the argument is not a matrix :(

kieranrcampbell commented 4 years ago

Ohhh I think it's because it's saved as a sparse matrix....Ouija was written baack in the days when these weren't a thing. Does

logcounts(test) <- as.matrix(logcounts(test))

fix it?

JihedC commented 4 years ago

ah ok! So I tried :

logcounts(test) <- as.matrix(logcounts(test))
test_oui <- ouija(test, single_cell_experiment_assay = "logcounts")

and I've got this :

Error in svd(x, nu = 0, nv = k) : infinite or missing values in 'x'
kieranrcampbell commented 4 years ago

What does summary(colSums(logcounts(test))) look like?

JihedC commented 4 years ago
summary(colSums(logcounts(test)))

Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2041    2662    2916    2914    3161    3974 
kieranrcampbell commented 4 years ago

Ok I'm stumped -- would you be able to saveRDS(test,...) and email it to me and I'll take a look?

JihedC commented 4 years ago

yes sure! What is the issue with it ?

JihedC commented 4 years ago

I have sent it to you via dropbox on the gmail address I found on your cv (I don't want to display it here)

JihedC commented 4 years ago

I have tried to work around it a bit by extracting the log count matrix of my genes of interest and adding 0.000001. I have also selected randomly 500 cells among the almost 10000 cells I have in my matrix. With this matrix the function ouija() worked but I am puzzled by the result I have got :

Capture d’écran 2019-12-02 à 12 10 19

I don't understand why I am getting such a high values on the y-axis, on your example the values are between 0 and 1.5. Besides in terms of biology, the pseudo-time line should be reverted Bcl11b is a gene that I am expecting expressed "late" in the pseudo time.

kieranrcampbell commented 4 years ago

Hi @JihedC

Do you have a list of marker genes? I've had a look at the test object but it appears transcriptome wide

JihedC commented 4 years ago

Hi @kieranrcampbell , Here is the list of genes:

genes = c("Flt3", "Kit", "Spi1", "Gata3", "Runx1", "Tcf7", "Bcl11b", "Rag1", "Cd4", "Cd8b1", "Cd8a")