statgen / demuxlet

Genetic multiplexing of barcoded single cell RNA-seq
Apache License 2.0

Segmentation fault #98

Closed wingkyBs closed 1 year ago

wingkyBs commented 1 year ago

Hi. A segmentation fault occurs during demuxlet execution. I get the same error after removing multi-allelic SNPs, and also when using --group-list.

Below is the part of the log where the error occurred, along with the few lines preceding it.

NOTICE [2022/11/09 11:56:59] - Reading 158000000 reads at 20:42092008 and skipping 48941789
NOTICE [2022/11/09 11:57:09] - Reading 710000 variants at 21:11053055, Skipping 435690, Missing 0.
NOTICE [2022/11/09 11:57:12] - Reading 160000000 reads at 21:30444023 and skipping 49652639
NOTICE [2022/11/09 11:57:21] - Reading 720000 variants at 22:17818894, Skipping 441512, Missing 0.
NOTICE [2022/11/09 11:57:38] - Reading 730000 variants at 22:46657308, Skipping 448238, Missing 0.
NOTICE [2022/11/09 11:57:43] - Reading 164000000 reads at X:2786880 and skipping 50859914
NOTICE [2022/11/09 11:57:50] - Reading 740000 variants at X:31579978, Skipping 452945, Missing 0.
NOTICE [2022/11/09 11:58:01] - Reading 167000000 reads at X:73485548 and skipping 51766445
NOTICE [2022/11/09 11:58:02] - Reading 750000 variants at X:89396340, Skipping 458413, Missing 0.
NOTICE [2022/11/09 11:58:12] - Reading 760000 variants at X:139636164, Skipping 463573, Missing 0.
NOTICE [2022/11/09 11:58:13] - Reading 169000000 reads at X:153628943 and skipping 52326225
Segmentation fault

Can you help me solve the problem?

Thanks. Bae.

hyunminkang commented 1 year ago

It might be a memory problem. I suggest trying https://github.com/statgen/popscle to see whether you get the same issue.

wingkyBs commented 1 year ago

Thanks for your reply. I tried popscle dsc-pileup as you said.


/path/bin/popscle dsc-pileup --sam ./VF/outs/possorted_genome_bam_hg19.bam --vcf ./HB00007167_VF.vcf --group-list ./barcodes.tsv --out /path/dsc-pileup


This produced the dsc-pileup.var.gz, plp.gz, umi.gz, and cel.gz files. However, when running popscle demuxlet, a segmentation fault occurred again (the run time was much shorter than that of demuxlet using the BAM file).


/path/bin/popscle demuxlet --plp /path/dsc-pileup --vcf HB00007167_VF.vcf --field GT --out ./dsc-plp_demuxlet_1



NOTICE [2022/11/14 19:59:27] - Reading variant info 731313:50002290:T:C at 22:50000058:G:A
NOTICE [2022/11/14 19:59:27] - Reading variant info 736400:17698397:A:G at X:17695190:C:A
NOTICE [2022/11/14 19:59:27] - Reading 740000 variants at X:31579978, Skipping 452945, Missing 0.
NOTICE [2022/11/14 19:59:27] - Reading variant info 748828:83750693:T:C at X:83726820:T:G
NOTICE [2022/11/14 19:59:27] - Reading 750000 variants at X:89396340, Skipping 458413, Missing 0.
NOTICE [2022/11/14 19:59:27] - Reading 760000 variants at X:139636164, Skipping 463573, Missing 0.
NOTICE [2022/11/14 19:59:27] - Reading variant info 760613:141362523:G:A at X:141384973:C:T
NOTICE [2022/11/14 19:59:27] - Reading variant info 764294:153296621:G:G at X:153297392:A:G
Segmentation fault


I really hope this problem can be solved.

Thank you for taking your time to reply.

Thanks. Bae.

hyunminkang commented 1 year ago

Given that it segfaults at the same place, I guess that this is a problem in the input VCF. Can you restrict your input VCF to autosomal variants and see if it improves?
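For anyone who wants to try the same restriction, here is a minimal sketch of filtering an uncompressed VCF down to autosomal records. The function name `filter_autosomal` is hypothetical, and the sketch assumes chromosomes are named `1`..`22` or `chr1`..`chr22`; it is an illustration, not part of demuxlet or popscle.

```python
import io

# Assumption: autosomes are named 1..22 (as in the logs above) or chr1..chr22.
AUTOSOMES = {str(i) for i in range(1, 23)} | {f"chr{i}" for i in range(1, 23)}


def filter_autosomal(src, dst):
    """Copy header lines and autosomal records from src to dst.

    src and dst are text file objects. Returns (kept, dropped) record counts.
    """
    kept = dropped = 0
    for line in src:
        if line.startswith("#"):
            # Header and meta lines are passed through unchanged.
            dst.write(line)
            continue
        chrom = line.split("\t", 1)[0]
        if chrom in AUTOSOMES:
            dst.write(line)
            kept += 1
        else:
            dropped += 1
    return kept, dropped


if __name__ == "__main__":
    # Tiny in-memory demo; real use would pass open() file handles instead.
    demo = io.StringIO("#CHROM\tPOS\n1\t100\nX\t200\n")
    out = io.StringIO()
    print(filter_autosomal(demo, out))
```

The same VCF should then be used for both the pileup and the demuxlet step, as was done later in this thread.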

wingkyBs commented 1 year ago

As you suggested, I ran popscle dsc-pileup with a VCF excluding the X and Y chromosomes, followed by popscle demuxlet, and no errors occurred!

popscle demuxlet also runs without error if only one of the X or Y chromosomes is removed. I'm not sure why this is happening.

I'm glad the error is resolved, but since I lack the background knowledge, I wonder whether it is okay to do genotyping without the X and Y chromosome variants. Please tell me your opinion.

(P.S. A segmentation fault occurred when demuxlet was run with the VCF file excluding the X and Y chromosomes and the original BAM file as input.)

Thanks. Bae.

hyunminkang commented 1 year ago

Glad that the suggestion worked. As demuxlet assumes a diploid model, it would be best to exclude X/Y for the inference. Later versions may be able to incorporate them to enable more ploidy-aware inference.
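One way to see whether a VCF contains genotype calls that are not diploid is to count them per chromosome. The sketch below does that for an uncompressed VCF. It is only a diagnostic idea: this thread does not confirm that haploid calls caused the crash, and the helper name `count_non_diploid` is hypothetical.

```python
import io
import re
from collections import Counter


def count_non_diploid(vcf_lines):
    """Count GT calls per chromosome that do not have exactly two alleles.

    vcf_lines is any iterable of VCF text lines. A bare "." (missing call
    with a single allele slot) is counted as non-diploid as well.
    """
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT or sample columns on this record
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt_idx = keys.index("GT")
        for sample in fields[9:]:
            parts = sample.split(":")
            gt = parts[gt_idx] if gt_idx < len(parts) else "."
            # Alleles are separated by "/" (unphased) or "|" (phased).
            alleles = re.split(r"[/|]", gt)
            if len(alleles) != 2:
                counts[fields[0]] += 1
    return counts


if __name__ == "__main__":
    demo = io.StringIO("X\t200\t.\tA\tG\t.\t.\t.\tGT\t1\t0/1\n")
    print(dict(count_non_diploid(demo)))
```

If the counts are non-zero only on X and Y, that would be consistent with the diploid-model explanation, though it would not prove it.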

wingkyBs commented 1 year ago

Thank you for your answer. Thank you so much for providing a good program and your quick response.

As an additional question, is it okay to do genotyping using SNG.BEST.GUESS from the output of popscle demuxlet? Is there a problem with using SNG.BEST.GUESS for cells estimated to be doublets?


After doublet (DBL) removal, I'll do the genotyping using SNG.BEST.GUESS. Thank you!

Thanks. Bae.
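The plan described above (drop doublets, then use SNG.BEST.GUESS) could be sketched as follows. This assumes the demuxlet output is a tab-delimited table with a header row containing `BARCODE`, `DROPLET.TYPE`, and `SNG.BEST.GUESS` columns, and that singlets are labelled `SNG`. Only `SNG.BEST.GUESS` and `DBL` appear in this thread; the other names are assumptions to check against your own output file, and `singlet_assignments` is a hypothetical helper.

```python
import csv
import io


def singlet_assignments(best_file):
    """Map barcode -> SNG.BEST.GUESS for droplets called as singlets.

    best_file is a text file object over the tab-delimited demuxlet output.
    Column names and the "SNG" label are assumptions (see the note above).
    """
    reader = csv.DictReader(best_file, delimiter="\t")
    return {
        row["BARCODE"]: row["SNG.BEST.GUESS"]
        for row in reader
        if row["DROPLET.TYPE"] == "SNG"  # drops DBL and any other non-singlet type
    }


if __name__ == "__main__":
    demo = io.StringIO(
        "BARCODE\tDROPLET.TYPE\tSNG.BEST.GUESS\n"
        "AAAC-1\tSNG\tsampleA\n"
        "AAAG-1\tDBL\tsampleB\n"
    )
    print(singlet_assignments(demo))
```

Filtering on the droplet type first means the per-cell sample label is only read for droplets that were actually called as singlets.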