sunnyisgalaxy / moabs

A comprehensive, accurate and efficient solution for analysis of large scale base-resolution DNA methylation data

Segmentation fault (core dumped) #22

Open demis001 opened 4 years ago

demis001 commented 4 years ago

I have 256 GB of RAM on the system, so I don't know why mcall is giving the Segmentation fault (core dumped) error. The alignment is from Bismark and is properly sorted.

mcall -m CDAA158_merged_final_sorted_dedup.deduplicated.bam -m CDAA328_merged_final_sorted_dedup.deduplicated.bam -m CDAA591_merged_final_sorted_dedup.deduplicated.bam -m CDAA94_merged_final_sorted_dedup.deduplicated.bam -p 2 --sampleName CDAA -r ${GENODIR}/hg38_r87.fa --skipRandomChrom 1

lijinbio commented 4 years ago

Great that it is fixed. It may take some time to generate the output.

Can you help confirm that the header should not include the descriptions? Thank you.

demis001 commented 4 years ago

Sorry for my stupidity! I should have double-checked the newer version. I didn't think you would use two options for the same thing ... -g and -r

demis001 commented 4 years ago

You saved my day, have 100 samples!

demis001 commented 4 years ago

Let me check with bismark bam files too!

lijinbio commented 4 years ago

You saved my day, have 100 samples!

That is great. Thank you for your interest in MOABS.

lijinbio commented 4 years ago

YEP!!! working

100M -rw-rw-r-- 1 ddjima fuse 100M Feb 11 12:41 ADAA411_sorted_dedup.bam.1.G.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 12:37 ADAA411_sorted_dedup.bam.1.HG.bed
8.0K -rw-rw-r-- 1 ddjima fuse 4.5K Feb 11 12:41 ADAA411_sorted_dedup.bam.1_stat.txt

May I ask which version of the reference you are using? Is it the one whose headers do not have descriptions?

lijinbio commented 4 years ago

Let me check with bismark bam files too!

Please let us know if Bismark BAM will work or not.

demis001 commented 4 years ago

It runs in this order. Does this mean the BAM order isn't correct?

Writing strand specific statistics for file ADAA411_sorted_dedup.bam and chrom 1
Writing strand combined statistics for file ADAA411_sorted_dedup.bam and chrom 1
Finished processing file ADAA411_sorted_dedup.bam on chrom 1

All files done for chrom 1
Start processing file ADAA411_sorted_dedup.bam on chrom 10
Writing strand specific statistics for file ADAA411_sorted_dedup.bam and chrom 10
Writing strand combined statistics for file ADAA411_sorted_dedup.bam and chrom 10
Finished processing file ADAA411_sorted_dedup.bam on chrom 10

All files done for chrom 10
Start processing file ADAA411_sorted_dedup.bam on chrom 11
Writing strand specific statistics for file ADAA411_sorted_dedup.bam and chrom 11
Writing strand combined statistics for file ADAA411_sorted_dedup.bam and chrom 11
Finished processing file ADAA411_sorted_dedup.bam on chrom 11

lijinbio commented 4 years ago

The order is taken from the BAM file, and it does not matter as long as the BAM is correctly sorted.
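One way to check that precondition before running mcall is to read the sort order recorded in the BAM header. The following is a minimal sketch, not part of MOABS: it assumes the header text has already been dumped (for example with `samtools view -H`), and the function name `header_sort_order` is hypothetical. The `SO` tag of the `@HD` line is defined by the SAM specification, but it only records what the header claims, not whether the records are actually in order.

```python
def header_sort_order(header_text: str) -> str:
    """Return the SO value from the @HD line of a SAM/BAM header, or 'unknown'."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Fields after the record type are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            break  # only one @HD line is allowed, so stop looking
    return "unknown"


# Illustrative header text; the @SQ entries are examples, not taken from the thread.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:1\tLN:248956422\n"
    "@SQ\tSN:10\tLN:133797422\n"
)
print(header_sort_order(example_header))
```

A value of `coordinate` is what a position-sorted BAM would normally report; `unsorted`, `queryname`, or a missing tag would be a reason to re-sort before calling methylation.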

demis001 commented 4 years ago

Can you run two instances of mcall and mcomp on the same system?

lijinbio commented 4 years ago

Yes, but make sure there are enough computing resources and that the output directories do not conflict.

demis001 commented 4 years ago

That is what I recalled. You mean it overwrites the output files, right?

demis001 commented 4 years ago

It is not working with the Bismark BAM file.

> mcall -m ADAA1381_merged_final_sorted_dedup.deduplicated.bam -p 4 -r /data1/genome/hg38_r87/hg38_r87_update.fa 
Options are saved in file run.config and printed here:
cytosineMinScore=20
excludedFlag=0
fullMode=0
keepTemp=0
mappedFiles=ADAA1381_merged_final_sorted_dedup.deduplicated.bam 
minFragSize=0
minMMFragSize=0
nextBaseMinScore=3
processPEOverlapSeq=1
qualityScoreBase=0
reference=/data1/genome/hg38_r87/hg38_r87_update.fa
reportCHX=X
reportCpX=G
requiredFlag=0
skipRandomChrom=1
statsOnly=0
threads=4
trimRRBSEndRepairSeq=2
trimWGBSEndRepairPE1Seq=3
trimWGBSEndRepairPE2Seq=3
Program started
From the extension of file ADAA1381_merged_final_sorted_dedup.deduplicated.bam, program is parsing file according to BAM foramt
XR:Z or ZR:Z:, or ZS:Z: field not found, that is totally fine.
For file ADAA1381_merged_final_sorted_dedup.deduplicated.bam, the quality score format is Sanger format based at 33!
Protocol and read length are detected as WGBS and 135 bases for file ADAA1381_merged_final_sorted_dedup.deduplicated.bam
For file ADAA1381_merged_final_sorted_dedup.deduplicated.bam the number of all reads is 93966132 and the number of mapped reads is 93966132
Start processing file ADAA1381_merged_final_sorted_dedup.deduplicated.bam on chrom 1
Segmentation fault (core dumped)

lijinbio commented 4 years ago

Yes. To be safe, different samples can be put in different output directories.
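A minimal sketch of that advice, giving every sample its own working directory. It assumes that mcall writes `run.config` and its outputs relative to the current working directory, which this thread does not confirm; if outputs are written next to the input BAM instead, the BAM files themselves would need to live in separate directories. The helper names `plan_mcall_runs` and `run_plans` and the `runs` directory are hypothetical; only the `-m`, `-p`, and `-r` options shown earlier in the thread are used.

```python
import subprocess
from pathlib import Path


def plan_mcall_runs(bam_files, reference, base_dir, threads=4):
    """Return (working_dir, argv) pairs, one separate directory per BAM file.

    Pass absolute paths for the BAM files and the reference, because each
    command is meant to run from inside its own working directory.
    """
    plans = []
    for bam in bam_files:
        # Directory name derived from the BAM file name without ".bam".
        workdir = Path(base_dir) / Path(bam).stem
        argv = ["mcall", "-m", str(bam), "-p", str(threads), "-r", str(reference)]
        plans.append((workdir, argv))
    return plans


def run_plans(plans):
    """Run each planned command sequentially inside its own directory."""
    for workdir, argv in plans:
        workdir.mkdir(parents=True, exist_ok=True)
        subprocess.run(argv, cwd=workdir, check=True)


# Dry run: print what would be executed without starting mcall.
for workdir, argv in plan_mcall_runs(
    ["/data/ADAA411_sorted_dedup.bam", "/data/ADAA760_sorted_dedup.bam"],
    "/data1/genome/hg38_r87/hg38_r87_update.fa",
    "runs",
):
    print(workdir, " ".join(argv))
```

Calling `run_plans(...)` on the same list would execute the commands one after another; running them concurrently is possible too, provided the thread counts add up to what the machine can handle.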

lijinbio commented 4 years ago

Segmentation fault (core dumped)

okay. Thank you for letting us know.

demis001 commented 4 years ago

I really appreciate it! I tried two other, newer methods and decided to use this one after evaluating both. You saved my day! Would you please keep this issue open? I may need it down the line. I will run mcall and mcomp today, and if I see an error, I will post an update here. I have 100 samples to run, and the alignment is done!

lijinbio commented 4 years ago

Glad to help. Thank you for using MOABS. I will keep this issue open.

demis001 commented 4 years ago

Do you know why this generates the files below?

Merging each group before mcomp:

mcall -m ADAA1381_sorted_dedup.bam -m ADAA1620_sorted_dedup.bam -m ADAA312_sorted_dedup.bam -m ADAA411_sorted_dedup.bam -m ADAA760_sorted_dedup.bam  -p 4 \
     --sampleName ADAA -r /data1/genome/hg38_r87/hg38_r87_update.fa --skipRandomChrom 1 --fullMode 0
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 15:44 CDCA.16.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 15:44 CDCA.16_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 15:59 CDCA.17.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 15:59 CDCA.17_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 15:59 CDCA.18.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 15:59 CDCA.18_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 15:58 CDCA.19.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 15:58 CDCA.19_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 15:24 CDCA.1.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 15:24 CDCA.1_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:12 CDCA.20.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:12 CDCA.20_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:11 CDCA.21.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:11 CDCA.21_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:11 CDCA.22.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:11 CDCA.22_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 15:59 CDCA.2.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 15:59 CDCA.2_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:15 CDCA.3.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:15 CDCA.3_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:24 CDCA.4.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:24 CDCA.4_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:24 CDCA.5.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:24 CDCA.5_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:25 CDCA.6.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:24 CDCA.6_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:33 CDCA.7.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:33 CDCA.7_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:39 CDCA.8.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:39 CDCA.8_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:38 CDCA.9.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:38 CDCA.9_strand.bed
1.5G -rw-rw-r-- 1 ddjima fuse 1.5G Feb 11 16:57 CDCA.G.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:57 CDCA.HG.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:37 CDCA.MT.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:37 CDCA.MT_strand.bed
8.0K -rw-rw-r-- 1 ddjima fuse 5.2K Feb 11 16:57 CDCA_stat.txt
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:52 CDCA.X.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:52 CDCA.X_strand.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:52 CDCA.Y.bed
4.0K -rw-rw-r-- 1 ddjima fuse   48 Feb 11 16:52 CDCA.Y_strand.bed

Expected only this:

1.5G -rw-rw-r-- 1 ddjima fuse 1.5G Feb 11 16:57 CDCA.G.bed
4.0K -rw-rw-r-- 1 ddjima fuse   94 Feb 11 16:57 CDCA.HG.bed
8.0K -rw-rw-r-- 1 ddjima fuse 5.2K Feb 11 16:57 CDCA_stat.txt
sarahet commented 4 years ago

Can you help confirm that the header should not include the descriptions? Thank you.

Hi there, I lost track of how this thread went, but I just wanted to confirm that FASTA headers indeed cannot contain descriptions for mcall. I tested with and without descriptions in the same reference genome with multiple BSMAP BAM files. If this is already fixed anyway, please ignore this. I just thought I would drop it here as a confirmation.

lijinbio commented 4 years ago

Yes, fasta headers cannot contain descriptions for MCALL. This will be fixed in the next release.
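Until a fixed release is in use, one workaround is to rewrite the reference so that each header holds only the sequence name. The following is a minimal sketch rather than anything shipped with MOABS: the function name `strip_fasta_descriptions` is hypothetical, and the Ensembl-style headers in the example are illustrative, not copied from the reference used in this thread.

```python
def strip_fasta_descriptions(lines):
    """Yield FASTA lines with each header cut down to '>' plus the sequence ID."""
    for line in lines:
        if line.startswith(">"):
            # Keep only the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            yield ">" + seq_id + "\n"
        else:
            # Sequence lines pass through unchanged.
            yield line


# Illustrative input with descriptions after the sequence names.
example = [
    ">1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF\n",
    "NNNNACGT\n",
    ">MT dna:chromosome chromosome:GRCh38:MT:1:16569:1 REF\n",
    "GATCACAG\n",
]
print("".join(strip_fasta_descriptions(example)), end="")
```

For a real genome the same generator can be streamed from one file handle to another, for example `dst.writelines(strip_fasta_descriptions(src))`, so the whole reference never has to sit in memory. The resulting sequence names should still match the names in the BAM header.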

paulmenzel commented 3 years ago

@demis001, is this still an issue with the current release 1.3.9.6?

sklages commented 2 years ago

This should have been fixed in version 1.3.8.8