Open 39652its opened 1 year ago
Hi @39652its ,
Could you also share the residual expression figure? The HMM results are based on the residual expression so it is important to compare both to verify if anything wrong might have happened. Looking at the signal in your references it could be that there are 2 populations of cells that should be separated, which can be done with the "num_ref_groups" option.
I would also leave the option cluster_references=TRUE at its default so that references are ordered by the clustering.
With the 2nd run and set of options, the results do look much better as the subclustering is not oversplit. To make sure the profiles are good though, I would check, as in 1., by comparing them against the residual expression and trying to identify why some signal remains in the references.
When you modify an option such as leiden_resolution, infercnv should take care, on its own, of picking the analysis back up from the last step before the changed option has an effect. The backup objects generated after each step contain the options that were used to generate them, so when infercnv tries to reload a backup, your current run's options are compared to the stored ones. If any option that affects results up to that step is found to be different, infercnv ignores that backup and checks the one from the previous step, until the "most recent common ancestor" is found.
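To make the fallback described above more concrete, here is a rough sketch of the idea. It is not infercnv's actual code (infercnv is an R package); the step layout, the function name find_resume_step, and the mapping of options to steps are all hypothetical and only illustrate the "walk back until the options match" logic.

```python
# Illustrative sketch only, not infercnv's implementation.
# The three steps and the option-to-step mapping are hypothetical.

def find_resume_step(backups, current_options, affects_up_to):
    """Return the index of the latest reusable backup, or -1 if none is reusable.

    backups: one dict per completed step (index 0 = first step), holding the
             options that were used to produce that backup.
    current_options: dict of options for the current run.
    affects_up_to: affects_up_to[i] is the set of option names that influence
                   results at any step up to and including step i.
    """
    # Check the most recent backup first, then fall back one step at a time.
    for step in range(len(backups) - 1, -1, -1):
        saved = backups[step]
        if all(saved.get(name) == current_options.get(name)
               for name in affects_up_to[step]):
            return step
    return -1


# Hypothetical previous run with three steps: normalize, subcluster, HMM.
previous = {"cutoff": 0.1, "leiden_resolution": 0.05, "HMM_type": "i6"}
backups = [dict(previous), dict(previous), dict(previous)]
affects_up_to = [
    {"cutoff"},
    {"cutoff", "leiden_resolution"},
    {"cutoff", "leiden_resolution", "HMM_type"},
]

# Changing only leiden_resolution invalidates steps 1 and 2, so step 0 is reused.
rerun = dict(previous, leiden_resolution=0.01)
print(find_resume_step(backups, rerun, affects_up_to))  # prints 0
```

In this toy layout, changing only HMM_type would resume from step 1, while changing cutoff would leave no reusable backup and the analysis would start from scratch.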
Regards, Christophe.
Dear developer,
Thank you for your kind words and appreciation for this powerful tool. As an infercnv beginner, I have several doubts and areas where I lack understanding. I would like to seek your advice and guidance on these matters to gain more expertise.
In my research, I primarily utilize 10x scRNA-seq data. My goal is to employ infercnv to infer the copy number profile of each cell and subsequently conduct further analysis. As a result, I have tried the following code and obtained the following results:
However, I encountered a situation where each individual cell forms its own cluster, and there are also copy number variations observed in the reference cells (which were determined to be normal cells in a public dataset). In addition, a similar situation arises in the observation cells, where after the Seurat pipeline and SingleR annotation, normal cells (immune cells, Stellate) also exhibit the same condition.
After going through related issues and your responses, I attempted the analysis again, adjusting the leiden_resolution, tumor_subcluster_pval, and some data output settings. Since I aim to obtain the copy number profile for each cell, here is the code for my second attempt and the resulting outcomes:
However, there is still a situation where copy number variation occurs in both the reference cells and normal immune cells.
Furthermore, the program has been running for more than a week without finishing, which may pose difficulties in trying multiple parameter sets. Therefore, I would like to seek professional advice.
If I want to change a parameter such as leiden_resolution, how should I proceed? For example, should I reload the data previously output by infercnv, or should I clear certain data to ensure smooth execution? Or does it depend on which parameters I want to change? I came across discussions about fine-tuning plot adjustments or changing reference cells, but I'm still not clear about modifying the parameters. I would greatly appreciate your assistance in answering my questions and providing professional advice.
Thank you.