broadinstitute / infercnv

Inferring CNV from Single-Cell RNA-Seq

Error in transferring CNV to seurat object with add_to_seurat #220

Closed shaofeng2020 closed 2 years ago

shaofeng2020 commented 4 years ago

I created an infercnv object:

infercnv_obj = CreateInfercnvObject(raw_counts_matrix=counts_matrix,
    annotations_file="/home/shaofeng/HM/CNV/tumor.sample.txt",
    delim="\t",
    gene_order_file="/home/shaofeng/HM/CNV/chr.txt",
    ref_group_names=NULL)
infercnv_obj = infercnv::run(infercnv_obj, cutoff=0.1, out_dir="./result3", cluster_by_groups=TRUE, denoise=TRUE, HMM=TRUE)

All fine, got the graphs. Now I'm trying to transfer the CNV metadata to the Seurat object:

seurat_obj = infercnv::add_to_seurat(infercnv_output_path="/home/shaofeng/HM/data/result3",
    seurat_obj=liver, # optional
    top_n=10)

Error in seq_len(nrow(sorted_regions)) :

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

Which version of infercnv are you using?

If you look at the HMM results (figures), are there any predicted CNVs?

Regards, Christophe.
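
A quick way to answer the version question from inside R is sketched below. The helper name `installed_version` is hypothetical (it is not part of infercnv); it only wraps base R's `requireNamespace()` and `packageVersion()`.

```r
# Hypothetical helper (not part of infercnv): return a package's installed
# version as a string, or NA if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

installed_version("infercnv")  # NA when infercnv is not installed
```

Reporting this string together with the sessionInfo() output makes it easier to tell whether a fix on the master branch is actually the code being run.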

shaofeng2020 commented 4 years ago

Hi GeorgescuC, I can't find the reason. I tried adding the parameter analysis_mode = "subclusters" and re-running. However, the program ran for two days and finally crashed.

infercnv_obj = infercnv::run(infercnv_obj, analysis_mode = "subclusters",
    cutoff=0.1, # cutoff=1 works well for Smart-seq2, and cutoff=0.1 works well for 10x Genomics
    out_dir="./result", cluster_by_groups=TRUE, denoise=TRUE, num_threads = 6, HMM=TRUE)

R version 3.6.0 (2019-04-26)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
[1] LC_CTYPE=zh_CN.UTF-8 LC_NUMERIC=C LC_TIME=zh_CN.UTF-8
[4] LC_COLLATE=zh_CN.UTF-8 LC_MONETARY=zh_CN.UTF-8 LC_MESSAGES=zh_CN.UTF-8
[7] LC_PAPER=zh_CN.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] infercnv_1.2.2 cowplot_1.0.0 ggplot2_3.3.0 dplyr_0.8.5 RColorBrewer_1.1-2
[6] Seurat_3.1.4 data.table_1.12.8

STEP 17: HMM-based CNV prediction

INFO [2020-03-25 07:58:07] predict_CNV_via_HMM_on_tumor_subclusters
INFO [2020-03-25 07:58:11] -done predicting CNV based on initial tumor subclusters
INFO [2020-03-25 09:13:45] get_predicted_CNV_regions(subcluster)
INFO [2020-03-25 09:13:45] -processing cell_group_name: tumor.tumor.1.1.1.1, size: 1480
INFO [2020-03-25 09:13:54] -processing cell_group_name: tumor.tumor.1.1.1.2, size: 4202
INFO [2020-03-25 09:14:13] -processing cell_group_name: tumor.tumor.1.1.2.1, size: 595
INFO [2020-03-25 09:14:19] -processing cell_group_name: tumor.tumor.1.1.2.2, size: 892
INFO [2020-03-25 09:14:27] -processing cell_group_name: tumor.tumor.1.2.1.1, size: 2327
INFO [2020-03-25 09:14:39] -processing cell_group_name: tumor.tumor.1.2.1.2, size: 1573
INFO [2020-03-25 09:14:49] -processing cell_group_name: tumor.tumor.1.2.2.1, size: 1066
INFO [2020-03-25 09:14:57] -processing cell_group_name: tumor.tumor.1.2.2.2, size: 277
INFO [2020-03-25 09:15:02] -writing cell clusters file: ./result/17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.cell_groupings
INFO [2020-03-25 09:15:02] -writing cnv regions file: ./result/17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.pred_cnv_regions.dat
INFO [2020-03-25 09:15:02] -writing per-gene cnv report: ./result/17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.pred_cnv_genes.dat
INFO [2020-03-25 09:15:02] -writing gene ordering info: ./result/17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.genes_used.dat
INFO [2020-03-25 09:15:03] ::plot_cnv:Start
INFO [2020-03-25 09:15:03] ::plot_cnv:Current data dimensions (r,c)=4748,12412 Total=178863151 Min=2 Max=6.
INFO [2020-03-25 09:15:03] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2020-03-25 09:16:01] plot_cnv_observation:Start
INFO [2020-03-25 09:16:01] Observation data size: Cells= 12412 Genes= 4748
INFO [2020-03-25 09:16:01] clustering observations via method: ward.D
error: C stack usage 7972772 is too close to the limit

01_incoming_data.infercnv_obj
02_reduced_by_cutoff.infercnv_obj
03_normalized_by_depthHMMi6.infercnv_obj
04_logtransformedHMMi6.infercnv_obj
07_tumor_subclustersHMMi6.rand_trees.random_trees.infercnv_obj
08_remove_ref_avg_from_obs_logFCHMMi6.rand_trees.infercnv_obj
09_apply_max_centered_expr_thresholdHMMi6.rand_trees.infercnv_obj
10_smoothed_by_chrHMMi6.rand_trees.infercnv_obj
11_recentered_cells_by_chrHMMi6.rand_trees.infercnv_obj
12_remove_ref_avg_from_obs_adjustHMMi6.rand_trees.infercnv_obj
14_invert_log_transformHMMi6.rand_trees.infercnv_obj
17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.cell_groupings
17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.genes_used.dat
17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.infercnv_obj
17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.pred_cnv_genes.dat
17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.pred_cnv_regions.dat
expr.infercnv.17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.dat
expr.infercnv.preliminary.dat
infercnv.preliminary.heatmap_thresholds.txt
infercnv.preliminary.observation_groupings.txt
infercnv.preliminary.observations_dendrogram.txt
infercnv.preliminary.observations.txt
infercnv.preliminary.png
preliminary.infercnv_obj

shaofeng2020 commented 4 years ago

Hi GeorgescuC, I ran the program again without the parameter analysis_mode = "subclusters". However, I failed to transfer the CNV metadata to the Seurat object.

infercnv_obj = CreateInfercnvObject(raw_counts_matrix=counts_matrix,
    annotations_file="/home/shaofeng/HM/CNV/tumor.sample.txt",
    delim="\t",
    gene_order_file="/home/shaofeng/HM/CNV/chr.txt",
    ref_group_names=NULL)
infercnv_obj = infercnv::run(infercnv_obj,
    cutoff=0.1, # cutoff=1 works well for Smart-seq2, and cutoff=0.1 works well for 10x Genomics
    out_dir="./result", cluster_by_groups=TRUE, denoise=TRUE, num_threads = 6, HMM=TRUE)

seurat_obj = infercnv::add_to_seurat(infercnv_output_path="/home/shaofeng/HM/data/result",
    seurat_obj=liver.integrated, # optional
    top_n=10)

Error in seq_len(nrow(sorted_regions)) :

01_incoming_data.infercnv_obj
02_reduced_by_cutoff.infercnv_obj
03_normalized_by_depthHMMi6.infercnv_obj
04_logtransformedHMMi6.infercnv_obj
08_remove_ref_avg_from_obs_logFCHMMi6.infercnv_obj
09_apply_max_centered_expr_thresholdHMMi6.infercnv_obj
10_smoothed_by_chrHMMi6.infercnv_obj
11_recentered_cells_by_chrHMMi6.infercnv_obj
12_remove_ref_avg_from_obs_adjustHMMi6.infercnv_obj
14_invert_log_transformHMMi6.infercnv_obj
15_no_subclusteringHMMi6.infercnv_obj
17_HMM_predHMMi6.hmm_mode-samples.cell_groupings
17_HMM_predHMMi6.hmm_mode-samples.genes_used.dat
17_HMM_predHMMi6.hmm_mode-samples.infercnv_obj
17_HMM_predHMMi6.hmm_mode-samples.pred_cnv_genes.dat
17_HMM_predHMMi6.hmm_mode-samples.pred_cnv_regions.dat
19_HMM_pred.repr_intensitiesHMMi6.hmm_mode-samples.Pnorm_0.5.infercnv_obj
21_denoiseHMMi6.NF_NA.SD_1.5.NL_FALSE.infercnv_obj
expr.infercnv.17_HMM_predHMMi6.hmm_mode-samples.dat
expr.infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.dat
expr.infercnv.21_denoised.dat
expr.infercnv.dat
expr.infercnv.preliminary.dat
infercnv.17_HMM_predHMMi6.hmm_mode-samples.heatmap_thresholds.txt
infercnv.17_HMM_predHMMi6.hmm_mode-samples.observation_groupings.txt
infercnv.17_HMM_predHMMi6.hmm_mode-samples.observations_dendrogram.txt
infercnv.17_HMM_predHMMi6.hmm_mode-samples.observations.txt
infercnv.17_HMM_predHMMi6.hmm_mode-samples.png
infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.heatmap_thresholds.txt
infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.observation_groupings.txt
infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.observations_dendrogram.txt
infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.observations.txt
infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.png
infercnv.21_denoised.heatmap_thresholds.txt
infercnv.21_denoised.observation_groupings.txt
infercnv.21_denoised.observations_dendrogram.txt
infercnv.21_denoised.observations.txt
infercnv.21_denoised.png
infercnv.heatmap_thresholds.txt
infercnv.observation_groupings.txt
infercnv.observations_dendrogram.txt
infercnv.observations.txt
infercnv.png
infercnv.preliminary.heatmap_thresholds.txt
infercnv.preliminary.observation_groupings.txt
infercnv.preliminary.observations_dendrogram.txt
infercnv.preliminary.observations.txt
infercnv.preliminary.png
preliminary.infercnv_obj
run.fin

shaofeng2020 commented 4 years ago

Hi Christophe: I found that the file "17_HMM_predHMMi6.hmm_mode-samples.pred_cnv_regions.dat" contains only headers. There is no cnv_regions data in the file. I think this may be the reason. But why did this happen? I am confused. Thank you for your help.
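
The header-only check described above can be scripted before calling add_to_seurat. The sketch below is only an illustration: `count_data_rows` is a hypothetical helper, and the column names written to the temporary file are assumed for the demo rather than taken from an actual infercnv output.

```r
# Hypothetical helper: count non-empty lines after the header line.
count_data_rows <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- lines[nzchar(trimws(lines))]
  max(length(lines) - 1L, 0L)
}

# Demo on a temporary file that contains only a header
# (column names are assumed for illustration).
header_only <- tempfile(fileext = ".dat")
writeLines("cell_group_name\tcnv_name\tstate\tchr\tstart\tend", header_only)

if (count_data_rows(header_only) == 0L) {
  message("No predicted CNV regions: the file contains only its header.")
}
```

In practice the path would point at the pred_cnv_regions.dat file in the infercnv output directory.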

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

For the C stack usage error, this is an error due to the system configuration. You should be able to get around it by running ulimit -s unlimited from outside of R before running R/infercnv. From the first log you sent, it seems only the plotting of step 17 was affected and rerunning infercnv started again at step 18, but there may be some other issues that happened.

If you open infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.png, do you see any signal on the heatmaps?

Regards, Christophe.
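
To confirm whether the ulimit change reached the R session, the stack limit can be inspected from inside R with the base function `Cstack_info()`. This is a minimal sketch; the reported usage of 7972772 bytes in the log above is close to 8 MiB, which is a common Linux default, but that this system used that default is an assumption.

```r
# Cstack_info() is base R; "size" and "current" are in bytes
# (size is NA when the limit is unknown to R).
info <- Cstack_info()
stack_mb <- unname(info["size"]) / 1024^2

cat(sprintf("C stack size: %.1f MB, currently used: %d bytes\n",
            stack_mb, info[["current"]]))
```

Running this once before and once after `ulimit -s unlimited` (in a fresh R session started from the same shell) shows whether the new limit was inherited.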

shaofeng2020 commented 4 years ago

Hi Christophe: Thank you for your help. When I open infercnv.19_HMM_predHMMi6.hmm_mode-samples.Pnorm_0.5.repr_intensities.png, there was no signal on the heatmaps.

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

If there is no signal on the heatmaps, it is normal that there is nothing in the pred_cnv_regions.dat . If you take a look at the non HMM results (infercnv.png), do you see any consistent signal between multiple cells?

Regards, Christophe.

shaofeng2020 commented 4 years ago

Hi Christophe: The CNV signal differs between references and observations in infercnv.png, and infercnv.21_denoised.png also shows a consistent signal between references and observations.

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

I think the issue is that since you have rerun infercnv without analysis_mode subclusters, the HMM tries to give you a prediction for all cells at once, while you most likely have multiple different subclonal populations of cells. In that case, the most common status (in number of cells having it) for each gene/region takes over for the prediction across all cells. You likely need to rerun things with the subcluster analysis mode, after having set ulimit -s unlimited .

Regards, Christophe.
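
The "most common status takes over" effect can be illustrated with a toy example. This is not infercnv's HMM; it only shows how a majority vote over pooled cells hides a minority subclone. The state values are made up, and the coding (3 = neutral, 4 = gain) is assumed to follow the 6-state model.

```r
# Toy data for one region: 70 neutral cells (state 3) and
# 30 cells from a subclone carrying a gain (state 4).
states   <- c(rep(3L, 70), rep(4L, 30))
subclone <- c(rep("A", 70), rep("B", 30))

# Most frequent state in a vector of states.
most_common <- function(x) as.integer(names(which.max(table(x))))

pooled      <- most_common(states)                    # one call for all cells
per_cluster <- tapply(states, subclone, most_common)  # one call per subcluster

pooled
per_cluster
```

The pooled call reports the neutral state, while the per-subcluster calls recover the gain in subclone B, which is the motivation for the subclusters analysis mode.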

shaofeng2020 commented 4 years ago

Hi GeorgescuC, I tried adding the parameter analysis_mode = "subclusters" and re-ran. Now I'm trying to transfer the CNV metadata to the Seurat object:

seurat_obj=infercnv::add_to_seurat(infercnv_output_path="/home/shaofeng/HM/data/result2",
    seurat_obj=liver, # optional
    top_n=10)

Error in infercnv::add_to_seurat(infercnv_output_path = "home/shaofeng/HM/CNV/result2"

The files in result2 are:

-rw-rw-r--. 1 shaofeng shaofeng 91M Apr 27 22:12 01_incoming_data.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 53M Apr 27 22:12 02_reduced_by_cutoff.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 55M Apr 27 22:12 03_normalized_by_depth.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 56M Apr 27 22:13 04_logtransformed.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 57M Apr 30 17:55 07_tumor_subclusters.rand_trees.random_trees.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 598M Apr 30 17:56 08_remove_ref_avg_from_obs_logFC.rand_trees.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 598M Apr 30 17:57 09_apply_max_centered_expr_threshold.rand_trees.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 599M Apr 30 18:02 10_smoothed_by_chr.rand_trees.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 599M Apr 30 18:03 11_recentered_cells_by_chr.rand_trees.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 600M Apr 30 18:04 12_remove_ref_avg_from_obs_adjust.rand_trees.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 571M Apr 30 18:05 14_invert_log_transform.rand_trees.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 127M Apr 30 19:36 21_denoise.rand_trees.NF_NA.SD_1.5.NL_FALSE.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 1.3G Apr 30 19:37 expr.infercnv.21_denoised.dat
-rw-rw-r--. 1 shaofeng shaofeng 1.3G Apr 30 21:07 expr.infercnv.dat
-rw-rw-r--. 1 shaofeng shaofeng 1.3G Apr 30 18:08 expr.infercnv.preliminary.dat
-rw-rw-r--. 1 shaofeng shaofeng 279 Apr 30 19:38 infercnv.21_denoised.heatmap_thresholds.txt
-rw-rw-r--. 1 shaofeng shaofeng 619K Apr 30 19:38 infercnv.21_denoised.observation_groupings.txt
-rw-rw-r--. 1 shaofeng shaofeng 580K Apr 30 19:38 infercnv.21_denoised.observations_dendrogram.txt
-rw-rw-r--. 1 shaofeng shaofeng 983M Apr 30 20:56 infercnv.21_denoised.observations.txt
-rw-rw-r--. 1 shaofeng shaofeng 776K Apr 30 21:05 infercnv.21_denoised.png
-rw-rw-r--. 1 shaofeng shaofeng 296M Apr 30 21:05 infercnv.21_denoised.references.txt
-rw-rw-r--. 1 shaofeng shaofeng 280 Apr 30 21:07 infercnv.heatmap_thresholds.txt
-rw-rw-r--. 1 shaofeng shaofeng 619K Apr 30 21:07 infercnv.observation_groupings.txt
-rw-rw-r--. 1 shaofeng shaofeng 580K Apr 30 21:07 infercnv.observations_dendrogram.txt
-rw-rw-r--. 1 shaofeng shaofeng 983M Apr 30 22:26 infercnv.observations.txt
-rw-rw-r--. 1 shaofeng shaofeng 771K Apr 30 22:34 infercnv.png
-rw-rw-r--. 1 shaofeng shaofeng 279 Apr 30 18:08 infercnv.preliminary.heatmap_thresholds.txt
-rw-rw-r--. 1 shaofeng shaofeng 619K Apr 30 18:08 infercnv.preliminary.observation_groupings.txt
-rw-rw-r--. 1 shaofeng shaofeng 580K Apr 30 18:08 infercnv.preliminary.observations_dendrogram.txt
-rw-rw-r--. 1 shaofeng shaofeng 1002M Apr 30 19:27 infercnv.preliminary.observations.txt
-rw-rw-r--. 1 shaofeng shaofeng 2.0M Apr 30 19:36 infercnv.preliminary.png
-rw-rw-r--. 1 shaofeng shaofeng 302M Apr 30 19:36 infercnv.preliminary.references.txt
-rw-rw-r--. 1 shaofeng shaofeng 296M Apr 30 22:34 infercnv.references.txt
-rw-rw-r--. 1 shaofeng shaofeng 571M Apr 30 18:06 preliminary.infercnv_obj
-rw-rw-r--. 1 shaofeng shaofeng 127M Apr 30 21:05 run.final.infercnv_obj

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

Is there any additional information in the error log? If not, could you try to run options(error = function() traceback(2)) before rerunning the add_to_seurat command to have more information?

Regards, Christophe.

shaofeng2020 commented 4 years ago

Hi GeorgescuC, I ran options(error = function() traceback(2)). The error is "Could not find "run.final.infercnv_obj" file at: home/shaofeng/HM/CNV/result2/run.final.infercnv_obj".

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

It looks like there is an issue with the output path supplied. From the error logs, it looks like you are missing the first "/" at the beginning of the infercnv_output_path argument, but I am not quite sure, as there are two different paths in your messages:

seurat_obj=infercnv::add_to_seurat(infercnv_output_path="/home/shaofeng/HM/data/result2",seurat_obj=liver, # optional
    top_n=10)

Error in infercnv::add_to_seurat(infercnv_output_path = "home/shaofeng/HM/CNV/result2"

"Could not find "run.final.infercnv_obj" file at: home/shaofeng/HM/CNV/result2/run.final.infercnv_obj"

Regards, Christophe.
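
A path problem like this can be ruled out before calling add_to_seurat. The sketch below uses only base R; `is_absolute_unix` and `check_output_dir` are hypothetical helper names, and the only file name taken from the thread is run.final.infercnv_obj.

```r
# Hypothetical helper: TRUE if a Unix-style path starts with "/" or "~".
is_absolute_unix <- function(path) grepl("^(/|~)", path)

# Hypothetical helper: summarise what can be checked about an output directory.
check_output_dir <- function(path) {
  final_obj <- file.path(path, "run.final.infercnv_obj")
  list(absolute         = is_absolute_unix(path),
       dir_exists       = dir.exists(path),
       final_obj_exists = file.exists(final_obj))
}

# The path from the error message lacks the leading "/", so it is
# resolved relative to the current working directory.
str(check_output_dir("home/shaofeng/HM/CNV/result2"))
```

If `absolute` is FALSE, the path depends on `getwd()`, which may explain why the same string works in one session and fails in another.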

shaofeng2020 commented 4 years ago

Hi GeorgescuC, I ran options(error = function() traceback(2)). The error is "WARN [2020-05-11 23:38:21] ::Could not find any HMM predictions outputs at: /home/shaofeng/HM/CNV/result2,Error in infercnv::add_to_seurat(infercnv_output_path = "/home/shaofeng/HM/CNV/result2" file at: /home/shaofeng/HM/CNV/result2/run.final.infercnv_obj".

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

From the latest list of files you sent (in result2), it appears the HMM option was not set to TRUE in that run, as the output files jump from step 14 to step 21, while the HMM outputs should appear at step 17.

Regards, Christophe.
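
The same check (is there a step-17 output at all?) can be done programmatically from a file listing. This is a sketch with a hypothetical helper, `steps_present`; it relies only on the two-digit step prefix visible in the file names posted in this thread.

```r
# Hypothetical helper: extract the leading two-digit step numbers
# (the digits before the first "_") from output file names.
steps_present <- function(files) {
  m <- regmatches(files, regexpr("^[0-9]{2}(?=_)", files, perl = TRUE))
  sort(unique(as.integer(m)))
}

# A few names copied from the result2 listing in this thread.
files <- c("14_invert_log_transform.rand_trees.infercnv_obj",
           "21_denoise.rand_trees.NF_NA.SD_1.5.NL_FALSE.infercnv_obj",
           "run.final.infercnv_obj")

steps_present(files)
17L %in% steps_present(files)  # FALSE here: no HMM step output
```

With a real run, `files` would come from `list.files()` on the output directory.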

fdkuo commented 4 years ago

Hi @GeorgescuC,

I have encountered a similar problem while running infercnv::add_to_seurat() with seurat_obj=NULL to generate map_metadata_from_infercnv.txt. The error message states "Error in seq_len(nrow(sorted_regions)) : argument must be coercible to non-negative integer". I can find "17_HMM_predHMMi6.hmm_mode-samples.pred_cnv_genes.dat" and "HMM_CNV_predictions.HMMi6.hmm_mode-samples.Pnorm_0.5.pred_cnv_genes.dat" under the output dir. After tracing the steps in seurat_interaction.R, I think the error comes from .get_top_n_regions(), which is called twice (lines #280 and #281) in .get_features(): once for loss regions and once for duplication regions. In my case, the problem happens when there is no predicted loss region and a zero-length sorted_regions_loss is passed to .get_top_n_regions(). When you have a chance, could you please take a look?

Thanks

My installed version of infercnv is 1.4.0, and it was run as below:

infercnv::run(infercnv_obj,
    cutoff=0.1, # use 1 for smart-seq, 0.1 for 10x-genomics
    out_dir="output_dir", # dir is auto-created for storing outputs
    cluster_by_groups=T, # cluster
    denoise=T,
    HMM=T
    )

GeorgescuC commented 4 years ago

Hi @fdkuo ,

I just pushed a change to address this issue on the master branch, would you be willing to try it out by updating your version from github?

To install the new version you should simply have to run:

library("devtools")
devtools::install_github("broadinstitute/infercnv", ref="master")

Regards, Christophe.

fdkuo commented 4 years ago

Hi @GeorgescuC,

Thanks a lot for the change. Unfortunately, the installation failed with the error message below. I also tried reinstalling through BiocManager, which still gave me version 1.4.0 with the same problem. Feel free to let me know what to do if you need more information. My R version is 4.0.0.

Best,

GeorgescuC commented 4 years ago

Hi @fdkuo ,

My bad, there was a missing parenthesis. The code on the master branch should now work.

Regards, Christophe.

fdkuo commented 4 years ago

Hi @GeorgescuC,

No problem. I see the parenthesis and the "if (is.null(sorted_regions))" statement now, and the installation finished without problems. However, even with that extra check the same error still happens. My testing shows that even when no loss region is found, the statement "sorted_regions_loss = sort(table(hmm_genes$gene_region_name[hmm_genes$state < center_state]), decreasing=TRUE)" actually returns a zero-length integer vector, so it passes the is.null() check.

May I suggest using something like "if (is.null(sorted_regions) | length(sorted_regions) == 0)" for the check?

Best,

INFO [2020-06-03 08:25:17] No Seurat object provided, will only write metadata matrix.
Error in seq_len(nrow(sorted_regions)) : argument must be coercible to non-negative integer
In addition: Warning message:
In seq_len(nrow(sorted_regions)) : first element used of 'length.out' argument
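
The kind of guard suggested in this comment could look roughly like the sketch below. `top_n_names` is a hypothetical stand-in, not infercnv's .get_top_n_regions(), and the region names are invented for the demo. It uses `||` rather than `|` so the length test is only evaluated when the object is not NULL.

```r
# Hypothetical stand-in for a "top n regions" helper (NOT infercnv's code):
# return up to n region names, or an empty character vector if there are none.
top_n_names <- function(sorted_regions, n = 10) {
  if (is.null(sorted_regions) || length(sorted_regions) == 0) {
    return(character(0))
  }
  head(names(sorted_regions), n)
}

# No loss regions at all, and a small made-up set of gain regions.
losses <- sort(table(character(0)), decreasing = TRUE)
gains  <- sort(table(c("chr1-region_1", "chr1-region_1", "chr7-region_3")),
               decreasing = TRUE)

top_n_names(losses)
top_n_names(gains, n = 1)
```

Returning an empty vector instead of failing lets the caller decide what to write when one of the two categories (losses or duplications) is absent.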

fdkuo commented 4 years ago

Hi @GeorgescuC,

Many thanks for the new changes. The new if statement works and that step now passes. However, it now fails with the error message below. The problem seems to occur at line 321, writeLines(to_write, con=fileConn), which saves the top_n_loss results to top_losses.txt. Since there are no reported loss regions, there is nothing in the to_write vector to write. Thanks.

Best,

INFO [2020-06-04 09:53:01] No Seurat object provided, will only write metadata matrix.
Error in writeLines(to_write, con = fileConn) : can only write character objects

shaofeng2020 commented 4 years ago

Hi Christophe: I ran ulimit -s unlimited from outside of R. Then I ran:

infercnv_obj = infercnv::run(infercnv_obj, cutoff=0.1, out_dir="./result2", analysis_mode = "subclusters", cluster_by_groups=TRUE, denoise=TRUE, HMM = T, num_threads = 6)

STEP 17: HMM-based CNV prediction stopped. The error is node stack overflow.

GeorgescuC commented 4 years ago

Hi @fdkuo ,

I went a step too far when testing, table() returns < table of extent 0 >, sort(table()) returns integer(0), and nrow(sort(table())) returns NULL. Just checking that sort < 1 or is.null(nrow) should be enough, but I left both.

Regards, Christophe.
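
The distinction described here can be checked directly in base R. The sketch uses `integer(0)` as the zero-length value, which is what the thread reports for the sorted empty table; no infercnv code is involved.

```r
empty <- integer(0)

is.null(empty)        # FALSE: a zero-length vector is not NULL
is.null(nrow(empty))  # TRUE: a plain vector has no dim attribute
length(empty) < 1     # TRUE

# seq_len() cannot work with the NULL returned by nrow(), so it fails.
res <- tryCatch(seq_len(nrow(empty)), error = function(e) "error")
res

# For comparison, a non-empty one-dimensional table does have a row count.
nrow(table(c("a", "b", "a")))
```

This is why `is.null(sorted_regions)` alone does not catch the empty case, while `is.null(nrow(sorted_regions))` or a length test does.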

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

Can you also try increasing the R recursion limit by running options(expressions=50000) and then trying the run again?

Regards, Christophe
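
For reference, the setting can be changed and verified as sketched below. `options()` returns the previous values, so the old limit can be restored afterwards; R documents 5000 as the default and accepts values from 25 to 500000.

```r
# Raise the limit on nested expressions and keep the previous setting.
old <- options(expressions = 50000)
current <- getOption("expressions")
current

# ... run the analysis here ...

# Restore the previous setting when done.
options(old)
```

Note that this limit is separate from the C stack size controlled by ulimit, so both may need adjusting.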

fdkuo commented 4 years ago

Hi @GeorgescuC,

Thanks. Indeed, in my test "is.null(nrow(integer()))" is TRUE, so "is.null(nrow(sorted_regions))" should be good enough; it alone should let the call pass through. However, there is another hurdle further along in .get_features(), at "writeLines(to_write, con=fileConn)". Since nothing has been added to to_write, the statement fails with the error I mentioned earlier. Do you want to add a size check before the writeLines() call?

Best,

GeorgescuC commented 4 years ago

Hi @fdkuo ,

I added a check that there is output to write, otherwise it will generate an empty file without giving an error (outputs "").

Best, Christophe.
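
A check of this kind might look like the following sketch. `write_lines_or_empty` is a hypothetical helper, not the code that was pushed to infercnv; it assumes the accumulator can be NULL (as `c()` is) when nothing was added to it.

```r
# Hypothetical helper (not infercnv's code): write the given lines,
# or a single empty line when there is nothing to write.
write_lines_or_empty <- function(to_write, path) {
  if (length(to_write) == 0) to_write <- ""
  con <- file(path, open = "w")
  on.exit(close(con))
  writeLines(as.character(to_write), con = con)
}

out <- tempfile(fileext = ".txt")
write_lines_or_empty(c(), out)  # c() is NULL, i.e. nothing was collected
readLines(out)
```

Writing `""` keeps the output file present, so downstream code that expects top_losses.txt to exist does not need its own special case.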

fdkuo commented 4 years ago

Hi @GeorgescuC,

It works without any error now! Many thanks for the fixes.

Best,

shaofeng2020 commented 4 years ago

Hi Christophe: I ran options(expressions=50000) and then tried to run:

infercnv_obj = infercnv::run(infercnv_obj, cutoff=0.1, out_dir="./result2", analysis_mode = "subclusters", cluster_by_groups=TRUE, denoise=TRUE, HMM = T, num_threads = 6)

However, STEP 17: HMM-based CNV prediction stopped. The error is node stack overflow.

STEP 17: HMM-based CNV prediction
INFO [2020-06-08 20:37:04] predict_CNV_via_HMM_on_tumor_subclusters
INFO [2020-06-08 20:37:09] -done predicting CNV based on initial tumor subclusters
INFO [2020-06-08 22:04:02] get_predicted_CNV_regions(subcluster)
INFO [2020-06-08 22:04:02] -processing cell_group_name: Tumor.Tumor.1.1.1.1, size: 1899
INFO [2020-06-08 22:04:14] -processing cell_group_name: Tumor.Tumor.1.1.1.2, size: 4459
INFO [2020-06-08 22:04:34] -processing cell_group_name: Tumor.Tumor.1.1.2.1, size: 700
INFO [2020-06-08 22:04:41] -processing cell_group_name: Tumor.Tumor.1.1.2.2, size: 799
INFO [2020-06-08 22:04:48] -processing cell_group_name: Tumor.Tumor.1.2.1.1, size: 1782
INFO [2020-06-08 22:04:59] -processing cell_group_name: Tumor.Tumor.1.2.1.2, size: 2479
INFO [2020-06-08 22:05:12] -processing cell_group_name: Tumor.Tumor.1.2.2.1, size: 178
INFO [2020-06-08 22:05:17] -processing cell_group_name: Tumor.Tumor.1.2.2.2, size: 116
INFO [2020-06-08 22:05:21] -processing cell_group_name: Normal.Normal.1.1.1.1, size: 1128
INFO [2020-06-08 22:05:30] -processing cell_group_name: Normal.Normal.1.1.1.2, size: 80
INFO [2020-06-08 22:05:35] -processing cell_group_name: Normal.Normal.1.1.2.1, size: 1409
INFO [2020-06-08 22:05:44] -processing cell_group_name: Normal.Normal.1.1.2.2, size: 117
INFO [2020-06-08 22:05:48] -processing cell_group_name: Normal.Normal.1.2.1.1, size: 232
INFO [2020-06-08 22:05:54] -processing cell_group_name: Normal.Normal.1.2.1.2, size: 322
INFO [2020-06-08 22:05:59] -processing cell_group_name: Normal.Normal.1.2.2.1, size: 240
INFO [2020-06-08 22:06:04] -processing cell_group_name: Normal.Normal.1.2.2.2, size: 207
INFO [2020-06-08 22:06:09] -writing cell clusters file: ./result2/17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.cell_groupings
INFO [2020-06-08 22:06:09] -writing cnv regions file: ./result2/17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.pred_cnv_regions.dat
INFO [2020-06-08 22:06:09] -writing per-gene cnv report: ./result2/17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.pred_cnv_genes.dat
INFO [2020-06-08 22:06:09] -writing gene ordering info: ./result2/17_HMM_predHMMi6.rand_trees.hmm_mode-subclusters.genes_used.dat
INFO [2020-06-08 22:06:10] ::plot_cnv:Start
INFO [2020-06-08 22:06:10] ::plot_cnv:Current data dimensions (r,c)=4862,16147 Total=236469675 Min=2 Max=6.
INFO [2020-06-08 22:06:11] ::plot_cnv:Depending on the size of the matrix this may take a moment.
INFO [2020-06-08 22:07:27] plot_cnv_observation:Start
INFO [2020-06-08 22:07:27] Observation data size: Cells= 12412 Genes= 4862
INFO [2020-06-08 22:07:27] clustering observations via method: ward.D

The error is node stack overflow.

GeorgescuC commented 4 years ago

Hi @shaofeng2020 ,

Was ulimit -s unlimited still active in the session in which you ran infercnv with options(expressions=50000)? The node stack overflow error at the step you are at (plotting) should be linked to either the hclust method or the as.dendrogram conversion, which are called from other common packages. If the error persists, you can try finishing the run without plots by setting no_plot=TRUE in the options, and then run the plotting on a different machine with the transferred results.

Regards, Christophe.
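
The two calls named above can be tried in isolation, which may help narrow down where the overflow occurs on a given machine. The sketch below runs them on a small random matrix; the data are made up, and a real test would use the observation matrix from the run, which is far larger.

```r
set.seed(1)

# Toy matrix: 20 "cells" (rows) by 5 "genes" (columns); values are random.
m <- matrix(rnorm(20 * 5), nrow = 20)

# Same linkage method as reported in the log ("ward.D").
hc <- hclust(dist(m), method = "ward.D")

# Conversion step that the comment above points to as a possible culprit.
dend <- as.dendrogram(hc)
attr(dend, "members")  # number of leaves in the tree
```

If `hclust()` succeeds on the real matrix but `as.dendrogram()` fails, that points at the conversion rather than the clustering itself.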