Closed parkjooyoung99 closed 12 months ago
Hi @parkjooyoung99 ,
There have been similar issues caused by a non-full-rank matrix being passed to optim() (simply google the error message). What are the dimensions of your gene expression matrix? It should be pretty rare for a typical gene expression matrix not to be full rank.
Thanks for your comment! I checked the input dimensions and found that I only had the filtered count data. I reran spaceranger from the fastqs and the problem is solved :)
Maybe you have some spots with 0 counts? We have seen some spots like that in some samples.
@parkjooyoung99 glad you figured it out - yes, we need the unfiltered data (tissue+background spots) as input. I will close this issue unless you have further questions.
@lcolladotor That's a possible reason, though I believe I implemented some feature to remove empty genes/spots.
Hi @zijianni,
Cool, good to know!
As a suggestion, you could use tryCatch() to catch this error and provide a more informative error message to users, reminding them to use the raw unfiltered data.
Best, Leo
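A minimal sketch of that suggestion in base R. `fit_contamination()` and `run_decontamination()` are hypothetical names used only for illustration and are not SpotClean functions. The stand-in fit simply raises the same message as the optim() failure reported in this thread, so the pattern can be tried without the package.

```r
# Hypothetical stand-in for the model fit; it only raises the same
# message as the optim() failure reported in this thread.
fit_contamination <- function(slide_obj) {
  stop("non-finite value supplied by optim")
}

# Hypothetical wrapper: catch the low-level error and re-throw it with
# a hint about the expected input.
run_decontamination <- function(slide_obj) {
  tryCatch(
    fit_contamination(slide_obj),
    error = function(e) {
      stop(
        "Contamination estimation failed: ", conditionMessage(e), ". ",
        "Please check that the input is the raw, unfiltered count matrix ",
        "(tissue and background spots).",
        call. = FALSE
      )
    }
  )
}

# Capture the re-thrown error so its message can be inspected
res <- tryCatch(run_decontamination(NULL), error = identity)
message(conditionMessage(res))
```

The original error text is kept inside the new message, so nothing is lost for debugging while the user still gets a pointer to the likely cause.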
Thanks @lcolladotor , I'm excited to learn about this function!
Hi @zijianni,
You can see some examples at https://github.com/lcolladotor/derfinder/search?q=trycatch or even the official Bioconductor one at https://contributions.bioconductor.org/querying-web-resources.html
Best, Leo
Hi, The same issue occurred on my data. I am using the raw count.
> decont_obj <- tryCatch(spotclean(m.obj), error = identity)
2023-11-01 14:24:03 Start.
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Kept 4948 highly expressed or highly variable genes.
2023-11-01 14:24:09 Estimating contamination parameters...
| | 0%>
>
>
>
> decont_obj
<simpleError in optim(x_init, .fn_optim, .gr_optim, method = "L-BFGS-B", obs_exp = obs_exp, ts_idx = ts_idx, nonzero_pos = nonzero_pos, n_spots = n_spots, W_yy = W_yy, WtW = WtW, Wyy_tWyy = Wyy_tWyy, I_yy = I_yy, I1_yy = I1_yy, WtZ = WtZ, I1tZ = I1tZ, lower = lower_bounds, upper = upper_bounds, control = list(maxit = 100)): non-finite value supplied by optim>
Many genes have no expression at some spots, but every gene has a value in at least one spot. Some spots have very few counts.
> sum(rowSums(m.obj@assays@data$raw) > 0)
[1] 13174
> sum(colSums(m.obj@assays@data$raw) > 0)
[1] 4992
> sum(colSums(m.obj@assays@data$raw) > 10)
[1] 4992
> sum(colSums(m.obj@assays@data$raw) > 100)
[1] 4988
> sum(colSums(m.obj@assays@data$raw) > 1000)
[1] 4845
Is there any suggested solution for this?
Hi @yeswzc, I see that all your spots have non-zero counts when you are using all genes. Can you try checking the UMI counts when you only keep the variable genes, i.e. the output of keepHighGene?
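A rough sketch of that check on a toy matrix, in base R. The counts and the `genes_kept` vector are made up for illustration. In practice `genes_kept` would be whatever is used to subset the matrix after gene filtering, for example the result of keepHighGene.

```r
# Toy count matrix: genes in rows, spots in columns (made-up values)
counts <- matrix(
  c(5, 0, 2,
    0, 0, 1,
    3, 4, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("spot1", "spot2", "spot3"))
)

# Hypothetical result of a gene filter (assumed to be gene names here)
genes_kept <- c("geneA", "geneB")

# Per-spot UMI totals using only the retained genes
spot_totals <- colSums(counts[genes_kept, , drop = FALSE])

# Spots that end up with zero counts after filtering
empty_spots <- names(spot_totals)[spot_totals == 0]
print(empty_spots)
```

Here `spot2` has counts when all genes are used, but none once only the retained genes are kept. That is the situation this check is meant to reveal.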
Hi, I tried using the filtered genes but still get the same error:
gene.to.keep <- keepHighGene(m.raw)
m.raw <- m.raw[gene.to.keep,]
> m.obj <- createSlide(count_mat = m.raw, slide_info = m.slideInfo)
> decont_obj <- spotclean(m.obj)
2023-11-03 13:42:12 Start.
Kept 4911 highly expressed or highly variable genes.
2023-11-03 13:42:19 Estimating contamination parameters...
| | 0%Error in optim(x_init, .fn_optim, .gr_optim, method = "L-BFGS-B", obs_exp = obs_exp, :
non-finite value supplied by optim
I am also attaching an image to see if some more information can help us figure out how to solve this.
Ah, thanks for sharing the slide image - looks like you have a very big tissue slice that covers the whole slide. Can you help validate if you have any non-tissue spots?
If the answer is no (all spots are tissue spots), then SpotClean is unable to perform the decontamination. SpotClean relies on UMI counts in non-tissue spots to learn the extent and distribution of spot swapping contamination. If there are no non-tissue spots, the swapped UMI counts are fully confounded with the original UMI counts in each spot.
Even if you have a few (<100 maybe) non-tissue spots, the model may still not be able to properly learn the distribution of spot swapping contamination. We've noted in our package vignette that SpotClean works better when there are more than 25% non-tissue spots.
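One way to check this before running the decontamination, sketched in base R on toy data. The `slide_info` data frame and its `tissue` column (1 = tissue, 0 = background) are assumptions made for this example, so adapt the column name to your own slide metadata. The 25% cutoff is the figure mentioned above.

```r
# Toy slide metadata; the `tissue` column (1 = tissue, 0 = background)
# is an assumed layout for this sketch
slide_info <- data.frame(
  barcode = paste0("spot", 1:8),
  tissue  = c(1, 1, 1, 1, 1, 1, 1, 0)
)

# Number and fraction of background (non-tissue) spots
n_background <- sum(slide_info$tissue == 0)
frac_background <- n_background / nrow(slide_info)

if (n_background == 0) {
  status <- "no background spots: contamination cannot be estimated"
} else if (frac_background < 0.25) {
  status <- "few background spots: estimates may be unreliable"
} else {
  status <- "enough background spots"
}

message(sprintf(
  "%d of %d spots are background (%.1f%%): %s",
  n_background, nrow(slide_info), 100 * frac_background, status
))
```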
Thank you! I can confirm that there are no non-tissue spots. I am sorry to hear that SpotClean cannot work on this data.
There are alternative approaches you can explore, e.g. Zhang et al. (though you would still have to check whether they work without non-tissue spots). It's also still worth making a judgment call: if you observe marker genes expressed in nearby regions where they are not supposed to be expressed, that is possibly due to the spot swapping effect, even if you cannot computationally validate and correct for it.
Thank you for the recommended ref. Unfortunately, it also does not work with data without non-tissue spots. I guess maybe I will not perform correction in my analysis.
$ bleeding_correction --adata dataset_filtered.h5ad --adata-output dataset_filtered_corrected.h5ad --bleed-out bleed_correction_results.h5
Traceback (most recent call last):
  File "/home/wuz6/.local/bin/bleeding_correction", line 8, in <module>
    sys.exit(main())
  File "/home/wuz6/.local/lib/python3.10/site-packages/bayestme/cli/bleeding_correction.py", line 51, in main
    (cleaned_dataset, bleed_correction_result) = bleeding_correction.clean_bleed(
  File "/home/wuz6/.local/lib/python3.10/site-packages/bayestme/bleeding_correction.py", line 769, in clean_bleed
    raise RuntimeError("Cannot run clean bleed without non-tissue spots.")
RuntimeError: Cannot run clean bleed without non-tissue spots.
Hello,
I am using the 'spotclean' function to correct the bleeding effect. However, I am getting an error message.
If anyone has any idea about it, please let me know.
Thank you!