wguo-research / scCancer

A package for automated processing of single cell RNA-seq data in cancer

Two questions: (1) no plot for the ambient RNA contamination fraction estimation when running the tutorial sample data (KC data); (2) warning messages after running the scAnnotation module #44

Open SaleenaYounus opened 3 months ago

SaleenaYounus commented 3 months ago

There are 6 cell type definitions
There are 6 cell type definitions
root
[ 2024-04-11 14:30:39 ] -----: Fibroblast subtype annotation finished
[ 2024-04-11 14:30:39 ] -----: generation of similarity maps
[ 2024-04-11 14:30:39 ] -----: malignant cells identification with inferCNV
[ 2024-04-11 14:36:21 ] -----: malignant cells identification with XGBoost model
[ 2024-04-11 14:36:24 ] -----: cell cycle score estimation
[ 2024-04-11 14:36:24 ] -----: stemness score calculation
[ 2024-04-11 14:36:30 ] -----: gene set signatures analysis
[ 2024-04-11 14:36:44 ] -----: expression programs analysis
[ 2024-04-11 14:40:05 ] -----: cell interaction analysis
[ 2024-04-11 14:40:37 ] -----: report generating
[ 2024-04-11 14:41:32 ] END: Finish scAnnotation

Warning messages:
1: In scale_x_log10() : log-10 transformation introduced infinite values.
2: In asMethod(object) : sparse->dense coercion: allocating vector of size 1.0 GiB
3: In asMethod(object) : sparse->dense coercion: allocating vector of size 1.2 GiB
4: In asMethod(object) : sparse->dense coercion: allocating vector of size 1.2 GiB
5: In get_plot_component(plot, "guide-box") : Multiple components found; returning the first one. To return all, use return_all = TRUE.
6: In get_plot_component(plot, "guide-box") : Multiple components found; returning the first one. To return all, use return_all = TRUE.

czythu commented 3 months ago

Sorry for the late reply. For the first issue, set the parameter bool.runSoupx = T. For the second issue, I'm sorry that we are unable to reproduce your results, but there is currently no evidence that these warnings affect the analysis results. For example, "log-10 transformation introduced infinite values" means that the data contain some zero values (which become infinite on the log scale), which is expected.
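The point about zeros can be checked in plain R. This is only a minimal sketch with made-up values (the `counts` vector is hypothetical, not taken from the KC data); it shows that a zero becomes infinite under `log10()`, which is the situation the `scale_x_log10()` warning reports.

```r
# Hypothetical values for illustration only; not taken from the KC data.
counts <- c(0, 1, 10, 100)

# log10(0) evaluates to -Inf in R, while positive values stay finite.
logCounts <- log10(counts)
print(logCounts)

# Count how many values became infinite (one per zero in the input).
nInfinite <- sum(is.infinite(logCounts))
print(nInfinite)
```

A log-scaled plot axis cannot place such infinite values, hence the warning when zeros are present in the plotted data.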

SaleenaYounus commented 3 months ago

Thank you so much for the reply. Following your answer, I set bool.runSoupx = T in scStatistics, but it shows the warning message below. I followed the tutorial completely.

In normalizePath(path.expand(path), winslash, mustWork) : path[1]="C:/Users/Sal001/Documents/KC-example/analysis/clustering/graphclust/clusters.csv": The system cannot find the path specified

This is the code I used in scStatistics:

dataPath <- "./KC-example"
savePath <- "./KCnew/KC-example"
sampleName <- "KC-example"
authorName <- "S-Lab@LU"

# Run scStatistics
stat.results <- runScStatistics(
    dataPath = dataPath,
    savePath = savePath,
    sampleName = sampleName,
    authorName = authorName,
    bool.runSoupx = T
)

This is the full output information:

[ 2024-04-15 10:21:49 ] START: RUN scStatistics
[ 2024-04-15 10:21:49 ] -----: data preparation
[ 2024-04-15 10:22:08 ] -----: cell calling
[ 2024-04-15 10:22:18 ] -----: nUMI & nGene distribution plot
[ 2024-04-15 10:22:19 ] -----: mito & ribo & diss distribution plot
[ 2024-04-15 10:22:20 ] -----: gene statistics
[ 2024-04-15 10:22:22 ] -----: gene proportion plot
[ 2024-04-15 10:22:30 ] -----: ambient genes (SoupX)

Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="C:/Users/Sal001/Documents/KC-example/analysis/clustering/graphclust/clusters.csv": The system cannot find the path specified

czythu commented 3 months ago

As for the first question, the R package SoupX is used to estimate the contamination fraction. See https://github.com/constantAmateur/SoupX for more details if you are interested in this module. Running SoupX depends on the cluster information generated by 10X CellRanger. I'm afraid you will need to download another dataset that includes this information, or simply skip this step.

As for the second question, almost all modules in scCancer2 (except the module mentioned above) work when only filtered_bc_matrix is provided. You can directly run scStatistics and scAnnotation in https://github.com/czythu/scCancer/blob/master/vignettes/scCancer2.Rmd and scCombination in https://github.com/czythu/scCancer/blob/master/vignettes/scCancer.Rmd with any of our datasets or your own.
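One way to avoid the missing-file warning is to check for the CellRanger clustering file before enabling SoupX. The sketch below is an assumption-laden illustration, not part of scCancer: the helper `hasSoupxClusters` is hypothetical, and the relative location `analysis/clustering/graphclust/clusters.csv` is inferred only from the path printed in the warning earlier in this thread.

```r
# Hypothetical helper (not part of scCancer): returns TRUE if the CellRanger
# clustering file seems to be present under dataPath. The relative location
# is an assumption taken from the path shown in the warning message.
hasSoupxClusters <- function(dataPath) {
  clusterFile <- file.path(dataPath, "analysis", "clustering",
                           "graphclust", "clusters.csv")
  file.exists(clusterFile)
}

dataPath <- "./KC-example"
runSoupx <- hasSoupxClusters(dataPath)

if (!runSoupx) {
  message("No CellRanger clusters.csv found under ", dataPath,
          "; skipping the SoupX step.")
}

# The result could then be passed on, for example:
# stat.results <- runScStatistics(
#     dataPath = dataPath,
#     savePath = savePath,
#     sampleName = sampleName,
#     authorName = authorName,
#     bool.runSoupx = runSoupx
# )
```

With only the filtered matrix available, `runSoupx` would be FALSE and the ambient RNA step would simply be skipped.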

Good luck!