cancerit / ascatNgs

Somatic copy number analysis using paired-end whole-genome sequencing (WGS)
http://cancerit.github.io/ascatNgs/
GNU Affero General Public License v3.0

error while running ascatngs #80

Closed gbnci closed 5 years ago

gbnci commented 6 years ago

I recently tried to run ascatNgs but got an error like this:

"/usr/bin/time /scratch/wangyong/all/0A4I3S_ASCAT/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.sh 1> /scratch/wangyong/all/0A4I3S_ASCAT/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.out 2> /scratch/wangyong/all/0A4I3S_ASCAT/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.err" unexpectedly returned exit value 1 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 270. at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 268

My command line is:

module load ascatngs; ascat.pl -o 0A4I0W_ASCAT -t 0A4I0W_Tumor.realigned.md.bam -n 0A4I0W_Normal.realigned.md.bam -r /fdb/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa -snp_gc /data/CCRBioinfo/wangyh/SnpGcCorrections.tsv -gc chrY -pr wgs -g XY

I'm not quite sure where the error comes from. Could you please give me some suggestions? For your convenience, I list the BAM header below. Checking the output directory, I found four subfolders (allele_count, ascat, logs and progress), and all the files in "progress" have zero size.

bam file header: @HD VN:1.5 GO:none SO:coordinate @SQ SN:chrM LN:16571 @SQ SN:chr1 LN:249250621 @SQ SN:chr2 LN:243199373 @SQ SN:chr3 LN:198022430 @SQ SN:chr4 LN:191154276 @SQ SN:chr5 LN:180915260 @SQ SN:chr6 LN:171115067 @SQ SN:chr7 LN:159138663 @SQ SN:chr8 LN:146364022 @SQ SN:chr9 LN:141213431 @SQ SN:chr10 LN:135534747 @SQ SN:chr11 LN:135006516 @SQ SN:chr12 LN:133851895 @SQ SN:chr13 LN:115169878 @SQ SN:chr14 LN:107349540 @SQ SN:chr15 LN:102531392 @SQ SN:chr16 LN:90354753 @SQ SN:chr17 LN:81195210 @SQ SN:chr18 LN:78077248 @SQ SN:chr19 LN:59128983 @SQ SN:chr20 LN:63025520 @SQ SN:chr21 LN:48129895 @SQ SN:chr22 LN:51304566 @SQ SN:chrX LN:155270560 @SQ SN:chrY LN:59373566 @SQ SN:chr1_gl000191_random LN:106433 @SQ SN:chr1_gl000192_random LN:547496 @SQ SN:chr4_ctg9_hap1 LN:590426 @SQ SN:chr4_gl000193_random LN:189789 @SQ SN:chr4_gl000194_random LN:191469 @SQ SN:chr6_apd_hap1 LN:4622290 @SQ SN:chr6_cox_hap2 LN:4795371 @SQ SN:chr6_dbb_hap3 LN:4610396 @SQ SN:chr6_mann_hap4 LN:4683263 @SQ SN:chr6_mcf_hap5 LN:4833398 @SQ SN:chr6_qbl_hap6 LN:4611984 @SQ SN:chr6_ssto_hap7 LN:4928567 @SQ SN:chr7_gl000195_random LN:182896 @SQ SN:chr8_gl000196_random LN:38914 @SQ SN:chr8_gl000197_random LN:37175 @SQ SN:chr9_gl000198_random LN:90085 @SQ SN:chr9_gl000199_random LN:169874 @SQ SN:chr9_gl000200_random LN:187035 @SQ SN:chr9_gl000201_random LN:36148 @SQ SN:chr11_gl000202_random LN:40103 @SQ SN:chr17_ctg5_hap1 LN:1680828 @SQ SN:chr17_gl000203_random LN:37498 @SQ SN:chr17_gl000204_random LN:81310 @SQ SN:chr17_gl000205_random LN:174588 @SQ SN:chr17_gl000206_random LN:41001 @SQ SN:chr18_gl000207_random LN:4262 @SQ SN:chr19_gl000208_random LN:92689 @SQ SN:chr19_gl000209_random LN:159169 @SQ SN:chr21_gl000210_random LN:27682 @SQ SN:chrUn_gl000211 LN:166566 @SQ SN:chrUn_gl000212 LN:186858 @SQ SN:chrUn_gl000213 LN:164239 @SQ SN:chrUn_gl000214 LN:137718 @SQ SN:chrUn_gl000215 LN:172545 @SQ SN:chrUn_gl000216 LN:172294 @SQ SN:chrUn_gl000217 LN:172149 @SQ SN:chrUn_gl000218 LN:161147 @SQ 
SN:chrUn_gl000219 LN:179198 @SQ SN:chrUn_gl000220 LN:161802 @SQ SN:chrUn_gl000221 LN:155397 @SQ SN:chrUn_gl000222 LN:186861 @SQ SN:chrUn_gl000223 LN:180455 @SQ SN:chrUn_gl000224 LN:179693 @SQ SN:chrUn_gl000225 LN:211173 @SQ SN:chrUn_gl000226 LN:15008 @SQ SN:chrUn_gl000227 LN:128374 @SQ SN:chrUn_gl000228 LN:129120 @SQ SN:chrUn_gl000229 LN:19913 @SQ SN:chrUn_gl000230 LN:43691 @SQ SN:chrUn_gl000231 LN:27386 @SQ SN:chrUn_gl000232 LN:40652 @SQ SN:chrUn_gl000233 LN:45941 @SQ SN:chrUn_gl000234 LN:40531 @SQ SN:chrUn_gl000235 LN:34474 @SQ SN:chrUn_gl000236 LN:41934 @SQ SN:chrUn_gl000237 LN:45867 @SQ SN:chrUn_gl000238 LN:39939 @SQ SN:chrUn_gl000239 LN:33824 @SQ SN:chrUn_gl000240 LN:41933 @SQ SN:chrUn_gl000241 LN:42152 @SQ SN:chrUn_gl000242 LN:43523 @SQ SN:chrUn_gl000243 LN:43341 @SQ SN:chrUn_gl000244 LN:39929 @SQ SN:chrUn_gl000245 LN:36651 @SQ SN:chrUn_gl000246 LN:38154 @SQ SN:chrUn_gl000247 LN:36422 @SQ SN:chrUn_gl000248 LN:39786 @SQ SN:chrUn_gl000249 LN:38502 @RG ID:2675 SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep PL:Illumina @RG ID:2683 SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep_corrected PL:Illumina @RG ID:2696 SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep PL:Illumina @RG ID:825 SM:0A4I0W_Tumor LB:TARGET-40-0A4I0W-01A-01D-LibPrep PL:Illumina @PG ID:GATK IndelRealigner VN:3.6-0-g89b7209 CL:knownAlleles=[(RodBinding name=knownAlleles source=/data/CCRBioinfo/zhujack/Ref/hg19/1000G_phase1.indels.hg19.vcf), (RodBinding name=knownAlleles2 source=/data/CCRBioinfo/zhujack/Ref/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf)] targetIntervals=/lscratch/31973919/realignment.intervals LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15 maxReadsInMemory=500000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPGTags=false indelsFileForDebugging=null 
statisticsFileForDebugging=null SNPsFileForDebugging=null @PG ID:bwa PN:bwa VN:0.7.15-r1140 CL:bwa mem -M -R @RG SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep PL:Illumina -t 32 /data/CCRBioinfo/public/serpentine_resources/mapping/bwaindex_0.7.15/ucsc.hg19 /lscratch/31952976/5_1_64H0FAAXX.293_BUSTARD-2011-11-19.fq.gz /lscratch/31952976/5_3_64H0FAAXX.293_BUSTARD-2011-11-19.fq.gz @PG ID:bwa.1 PN:bwa VN:0.7.15-r1140 CL:bwa mem -M -R @RG SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep_corrected PL:Illumina -t 32 /data/CCRBioinfo/public/serpentine_resources/mapping/bwaindex_0.7.15/ucsc.hg19 /lscratch/31952895/1_1_70CTCAAXX.278_BUSTARD-2011-09-10.fq.gz /lscratch/31952895/1_2_70CTCAAXX.278_BUSTARD-2011-09-10.fq.gz @PG ID:bwa.2 PN:bwa VN:0.7.15-r1140 CL:bwa mem -M -R @RG SM:0A4I0W_Tumor LB:TARGET-40-0A4I0W-01A-01D-LibPrep PL:Illumina -t 32 /data/CCRBioinfo/public/serpentine_resources/mapping/bwaindex_0.7.15/ucsc.hg19 /lscratch/31953249/TACAAG_3_1_AC0A7FACXX.297_BUSTARD-2011-12-15.fq.gz /lscratch/31953249/TACAAG_3_3_AC0A7FACXX.297_BUSTARD-2011-12-15.fq.gz @PG ID:bwa.3 PN:bwa VN:0.7.15-r1140 CL:bwa mem -M -R @RG SM:0A4I0W_Tumor LB:IHRT2536_TumourDNA_LibPrep PL:Illumina -t 32 /data/CCRBioinfo/public/serpentine_resources/mapping/bwaindex_0.7.15/ucsc.hg19 /lscratch/31952936/2_1_70LKTAAXX.273_BUSTARD-2011-09-03.fq.gz /lscratch/31952936/2_2_70LKTAAXX.273_BUSTARD-2011-09-03.fq.gz @PG ID:GATK PrintReads VN:3.6-0-g89b7209 CL:readGroup=null platform=null number=-1 sample_file=[] sample_name=[] simplify=false no_pg_tag=false @PG ID:MarkDuplicates VN:2.1.1(6a5237c0f295ddce209ee3a3a5b83a3779408b1b_1457101272) CL:picard.sam.markduplicates.MarkDuplicates INPUT=[bam/0A4I0W_Tumor.realigned.bam] OUTPUT=/lscratch/31983274/realigned.md.bam METRICS_FILE=bam/0A4I0W_Tumor.realigned.md.bam.dupmetrics REMOVE_DUPLICATES=false ASSUME_SORTED=true TMP_DIR=[/lscratch/31983274] VALIDATION_STRINGENCY=SILENT MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 
SORTING_COLLECTION_SIZE_RATIO=0.25 REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json PN:MarkDuplicates

Thanks for the help

AndyMenzies commented 6 years ago

Hi

If you take a look at the err file

/scratch/wangyong/all/0A4I3S_ASCAT/tmpAscat/logs/Sanger_CGP_Ascat_Implement_ascat.0.err

It should give you the underlying error that caused the process to crash.

Could you also check the chromosome names are consistent between the different input files?

genome.fa
SnpGcCorrections.tsv
0A4I0W_Tumor.realigned.md.bam
0A4I0W_Normal.realigned.md.bam

All 4 of these files should be using the same names for the same chromosomes. It's possible your BAM files have 'chr' prefixes, but the genome.fa and SnpGcCorrections.tsv don't.

Andy
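
For anyone landing here with the same error, the consistency check described above can be sketched in a few lines of Python. This is only a minimal illustration, not part of ascatNgs: the function names are made up for this example, the header text is assumed to be what `samtools view -H` prints for a BAM file, and the reference names are assumed to come from the `>` lines of the FASTA file.

```python
def sq_names(header_text):
    """Collect sequence names from the @SQ lines of a SAM/BAM header (text form)."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def fasta_names(fasta_text):
    """Collect sequence names from FASTA header lines (first word after '>')."""
    return [
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    ]


def strip_chr(name):
    """Drop a leading 'chr' prefix if present."""
    return name[3:] if name.startswith("chr") else name


def compare_names(bam_names, ref_names):
    """Report which names are shared and whether only the 'chr' prefix differs."""
    bam, ref = set(bam_names), set(ref_names)
    stripped_bam = {strip_chr(n) for n in bam}
    stripped_ref = {strip_chr(n) for n in ref}
    return {
        "shared": sorted(bam & ref),
        "only_in_bam": sorted(bam - ref),
        "only_in_ref": sorted(ref - bam),
        # True when nothing matches as-is but names match once 'chr' is ignored
        "prefix_mismatch": not (bam & ref) and bool(stripped_bam & stripped_ref),
    }


# Hypothetical two-chromosome inputs, just to show the shape of the result.
header = "@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:chr1\tLN:249250621\n@SQ\tSN:chr2\tLN:243199373\n"
fasta = ">1 dna:chromosome\nACGT\n>2 dna:chromosome\nACGT\n"
report = compare_names(sq_names(header), fasta_names(fasta))
print(report)
```

If `prefix_mismatch` comes back true, the inputs probably differ only in the 'chr' prefix, which is the situation suspected in this thread. The same comparison could be applied to the chromosome column of SnpGcCorrections.tsv, whose exact layout is not shown here.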

gbnci commented 6 years ago

Hi Andy, thanks for your suggestion. I will check it and let you know later. Thanks again, Yonghong

keiranmraine commented 6 years ago

The BAM file headers show that the chromosomes have a chr prefix.

Also, I note that the @XX tags of the core headers are lowercase; this is not compliant with section 1.3 of the SAM specification: https://samtools.github.io/hts-specs/SAMv1.pdf
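
A rough way to check a header for this kind of problem is sketched below in Python. It is an illustration under stated assumptions rather than a validator: the regular expression is the header-line pattern as I recall it from section 1.3 of the SAM specification (please verify it against the current version of the PDF), the function name is hypothetical, and the input is assumed to be the tab-delimited text printed by `samtools view -H`.

```python
import re

# Header-line pattern as recalled from section 1.3 of the SAM specification
# (an assumption; check the spec before relying on it).
HEADER_LINE = re.compile(
    r"^@(HD|SQ|RG|PG)(\t[A-Za-z][A-Za-z0-9]:[ -~]+)+$|^@CO\t.*$"
)


def check_header(header_text):
    """Return (line_number, problem) tuples for suspicious header lines."""
    problems = []
    for number, line in enumerate(header_text.splitlines(), start=1):
        if not line:
            continue
        if not HEADER_LINE.match(line):
            problems.append((number, "malformed header line"))
            continue
        if line.startswith("@CO"):
            continue  # comment lines carry free text, not TAG:VALUE fields
        for field in line.split("\t")[1:]:
            tag = field[:2]
            if tag != tag.upper():
                # Lowercase tags will not be read as the standard SN/LN/ID/... tags
                problems.append((number, f"tag {tag!r} contains lowercase letters"))
    return problems


# Hypothetical headers: one well-formed, one with lowercase tags and record type.
good = "@HD\tVN:1.5\tSO:coordinate\n@SQ\tSN:chr1\tLN:249250621"
bad = "@SQ\tsn:chr1\tln:249250621\n@sq\tSN:chr1\tLN:249250621"
print(check_header(good))
print(check_header(bad))
```

Note that a header pasted into a web form may have had its tabs replaced by spaces, in which case every line would be reported as malformed; run the check on the header taken directly from the file.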

gbnci commented 6 years ago

Thanks for the update. I did notice that the chromosome names in both the genome and GC correction files were not consistent with those in the BAM files. After making changes, I did get some results. I will see whether the tags also affect the results. Thanks for the suggestions.
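
The post does not say exactly what was changed. One possibility, shown here purely as an assumption, is renaming the reference sequences so they carry the same 'chr' prefix as the BAM files. A minimal Python sketch of that idea follows; the function name is hypothetical, and the renamed FASTA would still need to be re-indexed afterwards.

```python
def add_chr_prefix(fasta_lines, prefix="chr"):
    """Yield FASTA lines with the prefix added to header names that lack it."""
    for line in fasta_lines:
        if line.startswith(">") and not line.startswith(">" + prefix):
            yield ">" + prefix + line[1:]
        else:
            yield line  # sequence lines and already-prefixed headers pass through


# Hypothetical input lines, as they would be read from a FASTA file.
example = [">1 dna:chromosome\n", "ACGT\n", ">chr2\n", "TTGA\n"]
renamed = list(add_chr_prefix(example))
print("".join(renamed), end="")
```

Renaming alone only makes sense when the underlying sequences are the same build. Names that differ by more than the prefix (for example the mitochondrial sequence) need separate handling, and aligning to the same reference that ascatNgs is given avoids the problem altogether.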
