drieslab / Giotto

Spatial omics analysis toolbox
https://drieslab.github.io/Giotto_website/

Error while running runDWLSDeconv #170

Open PerrineLacour opened 2 years ago

PerrineLacour commented 2 years ago

Hello,

When running runDWLSDeconv, I get the following error

Error in enrich_matrix[, ct_spot_k[m]] : subscript out of bounds
In addition: Warning messages:
1: In sparseMatrix(i = indices[] + 1, p = indptr[], x = as.numeric(x = counts[]),  :
  'giveCsparse' has been deprecated; setting 'repr = "T"' for you
2: In cbind(allCounts_DWLS, solDWLS) :
  number of rows of result is not a multiple of vector length (arg 2)

I created the object following the visium mouse brain vignette, and the reference matrix was created with makeSignMatrixDWLS. I am using Giotto_1.0.4. The same reference works for other datasets with similar size and the deconvolution works for this dataset if I use another reference or change the clusters. I saw that another issue reported the same problem but it was closed without a solution.

Do you know how to fix that problem?

Best,
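For readers hitting the same message, here is a minimal base-R sketch of how `subscript out of bounds` arises when a matrix is indexed by a column name it does not contain. The names `enrich_matrix` and `ct_spot_k` are taken from the error text; the toy values and the idea that a cell type is missing from the matrix columns are assumptions, not a confirmed diagnosis of this issue.

```r
# Toy enrichment matrix: rows are spots, columns are cell types (made-up values).
enrich_matrix <- matrix(
  c(0.9, 0.1, 0.2, 0.8),
  nrow = 2,
  dimnames = list(c("spot_1", "spot_2"), c("astrocytes", "microglia"))
)

# Hypothetical cell types requested for one cluster; "OPC" has no column above.
ct_spot_k <- c("astrocytes", "OPC")

# Indexing by a column name that is absent raises the error; capture its message.
res <- tryCatch(
  enrich_matrix[, ct_spot_k[2]],
  error = function(e) conditionMessage(e)
)
print(res)

# A defensive check that lists the offending cell types before indexing.
missing_ct <- setdiff(ct_spot_k, colnames(enrich_matrix))
print(missing_ct)
```

If a check like `missing_ct` returns anything, the reference and the cluster annotation disagree on cell type names, which would be consistent with the report that changing the reference or the clusters makes the error disappear.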

moutazhelal commented 2 years ago

Hi @PerrineLacour ,

I also had an error running the same function but my error was

Error in quadprog::solve.QP(D/sc, d/sc, A, bzero) : NA/NaN/Inf in foreign function call (arg 1)

after running with these parameters

> library(quadprog)
> Giotto_NDLN = runDWLSDeconv(
+   Giotto_NDLN,
+   expression_values = c("normalized"),
+   logbase = 2,
+   cluster_column = "leiden_clus",
+   sign_matrix= signature_matrix,
+   n_cell = 10,
+   cutoff = 2,
+   name = NULL,
+   return_gobject = TRUE)

Did you find a solution for this?

Best Moutaz
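The `NA/NaN/Inf in foreign function call` message means a non-finite value reached the solver. A quick way to narrow this down is to scan the inputs before calling `runDWLSDeconv`. The sketch below uses base R only; `check_finite_matrix` is a hypothetical helper and the toy `signature_matrix` stands in for the real one.

```r
# Hypothetical helper: report where a numeric matrix holds NA, NaN, Inf or -Inf.
check_finite_matrix <- function(m) {
  bad <- !is.finite(m)  # TRUE for NA, NaN, Inf and -Inf
  list(
    n_bad    = sum(bad),
    bad_rows = rownames(m)[rowSums(bad) > 0],
    bad_cols = colnames(m)[colSums(bad) > 0]
  )
}

# Toy signature matrix (genes x cell types) with two deliberately bad entries.
signature_matrix <- matrix(
  c(1.2, 0, 3.4, NaN, 0.5, Inf),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("ct1", "ct2"))
)

report <- check_finite_matrix(signature_matrix)
print(report)
```

Running the same check on the normalized expression matrix of the spatial object is equally worthwhile, since either input could carry the non-finite values.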

RubD commented 2 years ago

I thought we had figured this out, but I'm including @gcyuan to the conversation as his team might be able to help out.

PerrineLacour commented 2 years ago

Hi,

No unfortunately I could not find a solution, I am working around it for now (by combining information from other samples).

Thank you! Let me know if you need more information on the issue.

Best,

RubD commented 2 years ago

Hi @PerrineLacour ,

I'm sorry to hear that this issue was not yet resolved.

@PerrineLacour or @moutazhelal Would it be possible to share - ideally a small - reproducible example that leads to this error? The information can also be de-identified and shared via email if that helps.

Thanks

PerrineLacour commented 2 years ago

Hi @RubD ,

I can share some de-identified data for a reproducible example. Could you share an email address I can send this data to?

Best,

RubD commented 2 years ago

Can you share it with rdries@bu.edu and joselynchavezf@gmail.com?

Thanks, Ruben

gcyuan commented 2 years ago

Could you let us know which version of Giotto you used that had this problem?

On Feb 17, 2022, at 9:17 AM, Ruben Dries @.***> wrote:

 I thought we had figured this out, but I'm including @gcyuan to the conversation as his team might be able to help out.


weifangliu commented 2 years ago

Did you figure out what caused the problem? I encountered similar errors when running DWLS. My input ST data only has a few dozen genes. Would that be a problem? I'm using Giotto suite 2.0.0.997.

Error in quadprog::solve.QP(Dmat = D, dvec = d, Amat = A, bvec = bzero) :
matrix D in quadratic function is not positive definite!
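On the question of whether a few dozen genes can be a problem: a quadratic program of least-squares type needs at least as many informative genes as cell types. The sketch below assumes the quadratic term has the form `t(S) %*% S` for a genes x cell types matrix `S`, which is the usual least-squares setup; how Giotto builds `D` internally is not shown in this thread.

```r
set.seed(1)
n_cell_types <- 5

# Case 1: fewer usable signature genes than cell types (3 genes, 5 cell types).
S_small <- matrix(runif(3 * n_cell_types), nrow = 3)
D_small <- crossprod(S_small)  # t(S) %*% S, a 5 x 5 matrix
rank_small <- qr(D_small)$rank

# Case 2: many more genes than cell types (40 genes, 5 cell types).
S_large <- matrix(runif(40 * n_cell_types), nrow = 40)
D_large <- crossprod(S_large)
rank_large <- qr(D_large)$rank

# A rank below the number of cell types means D is singular,
# so it cannot be positive definite.
cat("rank with 3 genes: ", rank_small, "of", n_cell_types, "\n")
cat("rank with 40 genes:", rank_large, "of", n_cell_types, "\n")
```

The number that matters is the count of signature genes that are also present in the spatial data, so a panel of a few dozen genes can fall below the number of cell types after intersecting.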


leihouyeung commented 2 years ago

I also get the same issue. My Giotto version is 1.1.0.


XuanCao-CX commented 2 years ago

Hi @moutazhelal and @leihouyeung,

To fix the problem you encountered, I can check the function 'runDWLSDeconv' if you guys would like to share your data including giotto object ‘Giotto_NDLN’ and ‘signature_matrix’.

Thanks, Xuan

leihouyeung commented 2 years ago

@XuanCao-CX Thanks for your help. The code is uploaded to Google Drive. Actually, I follow the code from here, because I did not find any tutorial on deconvolution in the Giotto documents. Could you please supply a de novo pipeline for deconvolution of spatial transcriptomics? Thanks a lot.

XuanCao-CX commented 2 years ago

Hi @leihouyeung,

Here is the pipeline to run deconvolution of spatial transcriptomics. To address your problem, I used the seqFISH+ dataset as an example: run Parts 1-9 of our online seqFISH+ pipeline (https://rubd.github.io/Giotto_site/articles/mouse_seqFISH_cortex_200914.html) and follow this example to check the data format of sign_list and dwls_signature_matrix. Please keep in mind that single-cell resolution data doesn't need deconvolution; this pipeline is only meant to help users understand the DWLS steps in Giotto.

Please let me know if there are any other problems.

Best, Xuan

`
# Cell-type Enrichment

# Step 1: PAGE

# makePAGEsigMatrix

# general cell types
clusters_cell_types_cortex = c('L6 eNeuron', 'L4 eNeuron', 'L2/3 eNeuron', 'L5 eNeuron', 'Lhx6 iNeuron', 'Adarb2 iNeuron', 'endothelial', 'mural', 'OPC','Olig', 'astrocytes', 'microglia')

names(clusters_cell_types_cortex) = c(1.1, 2.1, 3.1, 4.1, 5.1, 5.2, 6.1, 6.2, 7.1, 7.2, 8.1, 9.1)

# create sign_list
cts = data.table::data.table(cluster = names(clusters_cell_types_cortex), cell_types = clusters_cell_types_cortex)
scrna_markers_subclusters = data.table::merge.data.table(gini_markers_subclusters, cts, by = 'cluster')

scrna_markers = scrna_markers_subclusters[which(scrna_markers_subclusters$comb_rank <= 100)]
clusters = unique(scrna_markers$cell_types)
sign_list = lapply(1:length(clusters), FUN = function(x) {
  c1 = clusters[x]
  genes = scrna_markers[which(scrna_markers$cell_types == c1)]$genes
  return(genes)
})

# runPAGEEnrich
PAGEsignMatrix <- makeSignMatrixPAGE(sign_names = clusters, sign_list = sign_list)
SS_seqfish <- runPAGEEnrich(gobject = SS_seqfish, sign_matrix = PAGEsignMatrix)

# heatmap of enrichment versus annotation (e.g. clustering result)
cell_types = colnames(PAGEsignMatrix)
plotMetaDataCellsHeatmap(gobject = SS_seqfish, metadata_cols = 'leiden_clus', value_cols = cell_types, spat_enr_names = 'PAGE', x_text_size = 8, y_text_size = 8, save_param = list(save_name = "11_metaheatmap"))

cell_types_subset = colnames(PAGEsignMatrix)
spatCellPlot(gobject = SS_seqfish, spat_enr_names = 'PAGE', cell_annotation_values = cell_types_subset, cow_n_col = 3, coord_fix_ratio = NULL, point_size = 0.75, save_param = list(save_name = "11_spatcellplot"))

# Step 2: Spatial Cell-type Deconvolution
Sig_scrna <- unique(scrna_markers_subclusters$genes[which(scrna_markers_subclusters$comb_rank <= 100)])
sign_list <- Sig_scrna

dwls_signature_matrix <- makeSignMatrixDWLS(gobject = SS_seqfish, sign_gene = sign_list, cell_type_vector = SS_seqfish@cell_metadata$cell_types, expression_values = 'normalized')

SS_seqfish <- runDWLSDeconv(gobject = SS_seqfish, sign_matrix = dwls_signature_matrix)
`
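Before the final `runDWLSDeconv` call it can help to confirm that the signature matrix and the spatial data share enough genes. The following is a base-R sketch on toy matrices; `spatial_expr` and `sign_matrix` are stand-ins for the real expression and signature matrices, and the threshold of one shared gene per cell type is an illustrative assumption rather than a documented Giotto requirement.

```r
set.seed(7)

# Toy spatial expression matrix: genes in rows, spots in columns.
spatial_expr <- matrix(
  1:12, nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("spot", 1:3))
)

# Toy signature matrix: genes in rows, cell types in columns.
sign_matrix <- matrix(
  runif(10), nrow = 5,
  dimnames = list(paste0("gene", 3:7), c("ct1", "ct2"))
)

# Genes usable for deconvolution are those present in both inputs.
shared_genes <- intersect(rownames(spatial_expr), rownames(sign_matrix))
print(shared_genes)

# Illustrative sanity check: at least as many shared genes as cell types.
enough_genes <- length(shared_genes) >= ncol(sign_matrix)
print(enough_genes)
```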

leihouyeung commented 2 years ago

Thanks so much for your reply, but do you have a more general pipeline for deconvolution? The seqFISH+ deconvolution looks like a special case built for simulation, and a typical deconvolution pipeline may not need to be this long. For example, I only have the scRNA-seq expression profile, the spatial transcriptomics expression profile (with spot coordinates), and the scRNA-seq annotation.
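For the setup described here (an scRNA-seq expression matrix plus a cell type annotation), the central object is a signature matrix of average expression per cell type. The sketch below builds one by hand in base R to show the expected shape. It is a stand-in written for illustration, not the code of `makeSignMatrixDWLS`, and `sc_expr`, `cell_types` and `marker_genes` are made-up inputs.

```r
set.seed(42)

# Toy scRNA-seq matrix: genes in rows, cells in columns.
sc_expr <- matrix(
  rpois(4 * 6, lambda = 5), nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6))
)

# One annotation label per cell, in column order.
cell_types <- c("A", "A", "B", "B", "B", "C")

# Hypothetical marker genes selected beforehand.
marker_genes <- c("gene1", "gene3")

# Average the marker genes over the cells of each type.
# drop = FALSE keeps the matrix shape even for a type with a single cell.
sig <- sapply(split(seq_along(cell_types), cell_types), function(idx) {
  rowMeans(sc_expr[marker_genes, idx, drop = FALSE])
})

print(dim(sig))  # marker genes x cell types
print(sig)
```

The result has marker genes in rows and cell types in columns, which is the layout passed as `sign_matrix` in the example above.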

leihouyeung commented 2 years ago

A kind reminder. @XuanCao-CX

XuanCao-CX commented 2 years ago

Hi @leihouyeung,

I have already run the "runDWLSDeconv" using the data you provided both in issues #170 and #228 and I got the same error. We are currently fixing this bug. I will test your data again when this bug is fixed.

Best,

Xuan

leihouyeung commented 2 years ago

thanks so much for your help. If you have any updates, please tell me.

leihouyeung commented 2 years ago

@XuanCao-CX Hello, any update?

XuanCao-CX commented 2 years ago

Hi @leihouyeung,

The fixed 'runDWLSDeconv' is now available on the master branch and runs on your dataset (PDAC_GSM4100721.rds). Attached is my notebook running the Giotto pipeline on your dataset.

Best, Xuan

giotto_issue_228_spatialDWLS.nb.html.zip

leihouyeung commented 2 years ago

@XuanCao-CX Thanks for your update. I have tried your code and I have the following issues:

  1. When you add the cell type annotation, the following code raises an error:

     anno = data.table::data.table(cell_ID = sc_obj@cell_ID, cell_type = data$cell_type)
     sc_obj@cell_metadata = data.table::merge.data.table(sc_obj@cell_metadata, anno, by ='cell_ID')

     The error is:

     Error in data.table::merge.data.table(sc_obj@cell_metadata, anno, by = "cell_ID"): Elements listed in `by` must be valid column names in x and y

     I fixed it with:

     anno = data.table::data.table(cell_ID = sc_obj@cell_ID$cell, cell_type = data$cell_type)
     sc_obj@cell_metadata = data.table::merge.data.table(sc_obj@cell_metadata$cell$rna, anno, by ='cell_ID')

     But when I run the Leiden clustering and then the deconvolution, I get:

     Error in runDWLSDeconv(gobject = st_obj, cluster_column = "leiden_clus", : cluster column not found

     So I would like to know your Giotto version. Maybe the version is the main issue.

  2. I found a new function at the end called 'runDWLSDeconv_V1', which is loaded from your own source code. Could you share it with me?
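Both errors in this comment come down to a column that is expected but absent, so a check on column names before merging or deconvolving can save a round trip. The sketch below uses base-R data frames as stand-ins for the metadata tables; `has_column` is a hypothetical helper, and the workaround quoted above suggests that in newer versions `sc_obj@cell_metadata` is a nested list rather than a single table, which is an inference from this thread rather than a documented fact.

```r
# Toy stand-ins for the cell metadata and the annotation table.
cell_metadata <- data.frame(
  cell_ID  = c("c1", "c2", "c3"),
  nr_genes = c(10, 12, 9)
)
anno <- data.frame(
  cell_ID   = c("c1", "c2", "c3"),
  cell_type = c("A", "B", "A")
)

# Hypothetical helper: is the column present in the table?
has_column <- function(tbl, col) col %in% colnames(tbl)

# The merge key must exist on both sides, otherwise the merge fails.
stopifnot(has_column(cell_metadata, "cell_ID"), has_column(anno, "cell_ID"))
merged <- merge(cell_metadata, anno, by = "cell_ID")

# The cluster column must exist before it is passed as cluster_column.
cluster_column <- "leiden_clus"
cluster_ok <- has_column(merged, cluster_column)
print(cluster_ok)  # FALSE here: no clustering result was written to this table
```

A `FALSE` for the cluster column indicates that the clustering result was stored somewhere other than the table being inspected, or was never attached to the object.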

Thanks so much for your help!

leihouyeung commented 2 years ago

@XuanCao-CX Kind reminder. Could you please share the new R file for the runDWLSDeconv_V1 function with me? Thanks.

XuanCao-CX commented 2 years ago

Hi @leihouyeung,

My Giotto version is from the master branch, and we have already uploaded a fixed "runDWLSDeconv" to the master branch that resolves your issue. Please use runDWLSDeconv instead of "runDWLSDeconv_V1" with the latest Giotto master branch.

Best, Xuan

leihouyeung commented 2 years ago

@XuanCao-CX Thanks so much for your update. It works on some of my datasets, but on others runDWLSDeconv fails with a different error:

Error in 1:ceiling(log2(max(wsScaledMinusInf))) : NA/NaN argument
Calls: newrunDWLSDeconv ... optimize_deconvolute_dwls -> find_dampening_constant
In addition: Warning messages:
1: In max(wsScaledMinusInf) : no non-missing arguments to max; returning -Inf
2: In find_dampening_constant(S, all_exp[Genes], solution_all_exp) : NaNs produced

Maybe it is related to the "spatial_enrichment.R" file (line 1572), where max(wsScaledMinusInf) appears to be 0 in my dataset. Could you please fix this?
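The warning text points to an empty vector rather than a zero: `max()` of a zero-length vector returns `-Inf`, `log2(-Inf)` is `NaN`, and `1:NaN` raises `NA/NaN argument`. The base-R sketch below reproduces that chain and adds a guard. The variable name comes from the traceback; `safe_upper` is a hypothetical helper, not the fix used in Giotto.

```r
# Hypothetical situation: no weights survived filtering, so the vector is empty.
wsScaledMinusInf <- numeric(0)

# Reproduce the chain from the traceback, silencing the expected warnings.
top     <- suppressWarnings(max(wsScaledMinusInf))  # -Inf for an empty vector
n_steps <- suppressWarnings(ceiling(log2(top)))     # NaN
msg     <- tryCatch(1:n_steps, error = function(e) conditionMessage(e))
print(msg)

# Hypothetical guard: return NA instead of failing when nothing finite is left.
safe_upper <- function(w) {
  w <- w[is.finite(w)]
  if (length(w) == 0) return(NA_integer_)
  as.integer(ceiling(log2(max(w))))
}

print(safe_upper(numeric(0)))
print(safe_upper(c(1, 8, Inf)))
```

An empty vector at this step would be consistent with no signature genes being usable for the spot in question, so checking gene overlap is a sensible next step.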

leihouyeung commented 2 years ago

@XuanCao-CX Kind reminder.

XuanCao-CX commented 2 years ago

Hi @leihouyeung ,

Since the code works for some of your datasets but fails on others, please check the input spatial data and the sign_matrix you pass to runDWLSDeconv. If you have many clusters and only a small number of genes in sign_matrix, some clusters may not be assigned any genes from sign_matrix, and this may cause the error you got.
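The condition described here can be checked directly by counting, for each cell type, how many signature genes are both expressed in the signature and present in the spatial data. This is a base-R sketch with toy inputs; `sign_matrix` and `spatial_genes` are stand-ins, and treating a count of zero as the failure case is an assumption drawn from the explanation above.

```r
# Toy signature matrix: genes in rows, cell types in columns.
sign_matrix <- matrix(
  c(5, 0, 0, 0,   # ct1 is marked by gene1
    0, 3, 0, 0,   # ct2 is marked by gene2
    0, 0, 0, 4),  # ct3 is marked by gene4
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("ct1", "ct2", "ct3"))
)

# Genes measured in the spatial dataset; gene4 is missing.
spatial_genes <- c("gene1", "gene2", "gene3")

# Keep only signature genes that the spatial data can actually provide.
usable <- sign_matrix[rownames(sign_matrix) %in% spatial_genes, , drop = FALSE]

# Count informative (non-zero) genes per cell type.
genes_per_ct <- colSums(usable > 0)
print(genes_per_ct)

# Cell types left without any informative gene.
unsupported <- names(genes_per_ct)[genes_per_ct == 0]
print(unsupported)
```

Any cell type listed in `unsupported` has no gene to anchor its proportion, so it is a candidate for merging with a related type or for an extended marker list.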

Best, Xuan

leihouyeung commented 2 years ago

@XuanCao-CX Is there a quantitative minimum for how many expressed genes each cell type needs? I think this is a general problem for anyone who wants to deconvolve data with a large number of cell types.

For reference, here is the file for sign_matrix. I do not think the sign_matrix has a small number of genes. Please have a look, thanks.

leihouyeung commented 2 years ago

@XuanCao-CX I have tried to debug your code and tune the hyperparameters to deconvolve our Visium datasets, which other methods can deconvolve, but unfortunately without success. Could you help me debug your SpatialDWLS code on our Visium datasets? The datasets are here. Thanks.

leihouyeung commented 2 years ago

@XuanCao-CX Kind reminder.