bioinform / somaticseq

An ensemble approach to accurately detect somatic mutations using SomaticSeq
http://bioinform.github.io/somaticseq/
BSD 2-Clause "Simplified" License
189 stars 53 forks source link

Somaticseq makeSomaticScripts.py running and output issues #117

Closed csreej27 closed 1 year ago

csreej27 commented 1 year ago

Hi I am trying to run SomaticSeq using the following command : python3 /home/ec2-user/anaconda3/envs/ss_test/bin/makeSomaticScripts.py paired --normal-bam /home/ec2-user/efs/sridhar/Somatic_Testi ng/RCC_OP45_N.realign.recal.bam --tumor-bam /home/ec2-user/efs/sridhar/Somatic_Testing/RCC_45-T.realign.recal.bam --genome-reference /home/ec2-user/efs/sridhar/sreejita/somatic_test_files/hg38.fa --output-directory results1 --dbsnp-vcf /home/ec2-user/efs/sridhar/g ouri/dbsnp_vcf/retry/sorted_Homo_sapiens_assembly38.dbsnp138.vcf --threads 20 --container-tech docker --run-mutect2 --run-vardict - -run-muse --run-lofreq --run-strelka2 --run-somaticseq --run-somaticsniper --run-varscan2 --run-workflow

But as can be seeing from the output directory structure below - The Consensus tsvs are not getting generated image

And in the end I am getting this error : image

Could you please help me figure out what is going wrong? I have tried running the same script with different samples and different dbsnp vcfs, but end up with same error every time

For reference - these are WES samples , the normal and tumour BAMS are ~4BG each in size, Server configuration : 64GB RAM and 16CPU

litaifang commented 1 year ago

Can you save the log 2> log.stderr > log.stdout, and see which tool/command failed. You can try to find out cat log.stderr | grep 'FINISHED RUNNING' and see if some of them did not finish properly then we can try to figure out what's happening.

csreej27 commented 1 year ago

So , since I was previously also getting dnsnp vcf not sorted properly error , I ran above script with another set of BAM files aligned with a reference fasta file sorted as chr 1, chr2, ch3 ... chr 20 ( as opposed to 1, 10, 11...20 ) , and the corresponding dbsnp vcf also sorted in the same order. ( I obtained the fasta and the vcf from the gatk resource bundle )

python3 /home/ec2-user/anaconda3/envs/ss_test/bin/makeSomaticScripts.py paired --normal-bam /home/ec2-user/efs/sridhar/sreejita/gatk_resources/gatk_ou
tput/NCC2-BL.realign.recal.bam --tumor-bam /home/ec2-user/efs/sridhar/sreejita/gatk_resources/gatk_output/NCC2-T.realign.recal.bam --genome-reference
/home/ec2-user/efs/sridhar/sreejita/gatk_resources/Homo_sapiens_assembly38.fasta --output-directory sorted_results --dbsnp-vcf /home/ec2-user/efs/srid
har/sreejita/gatk_resources/Homo_sapiens_assembly38.dbsnp138.vcf --threads 20 --container-tech  docker --run-mutect2 --run-vardict --run-muse --run-lo
freq --run-strelka2 --run-somaticseq --run-somaticsniper --run-varscan2 --run-workflow

As for the tools which didn't work - image As per the above screenshot looks like mutect2 failed in the last round ( I had given 20 threads - so while creating teh 20th folder it fails, not before that )

In the log file, Thereafter, Somaticseq fails in entirely :

image

I noticed these errors on inspecting the log file further : image

Additionally, after inspecting subdirectories 1-20 in the results folder I noticed that in the subdirectories 1-19, since Mutect2 ran successfully all of these files were generated - MuTect2.vcf, MuTect2.vcf.filteringStats.tsv, MuTect2.vcf.idx, unfiltered.MuTect2.vcf,unfiltered.MuTect2.vcf.idx, unfiltered.MuTect2.vcf.stats

Whereas in the 20th subdirectory - only these were generated : unfiltered.MuTect2.vcf , unfiltered.MuTect2.vcf.idx

litaifang commented 1 year ago

Do all mutect2 jobs fail, or just the 20th one? What if you just execute the failed mutect2 job, e.g., bash sorted_results/20/logs/mutect2.*.cmd, and see if that task completes successfully. Maybe monitor the memory usage when it is being executed, etc. Mutect2, being a GATK tool, has very strict requirement for the bam files, i.e., the SM field for the tumor and normal bam files in the header must be different.

csreej27 commented 1 year ago

Thanks for your response.

Just the 20th mutect2 job failed

I will follow your suggestions ( running the failed mutect2 job , looking into the BAM headers ) and get back to you on this .

csreej27 commented 1 year ago

Update : I tried executing the failed mutect2 job - and this is the error I got - image

Then I proceeded onto fixing the BAM headers such that the SM fields are different for tumour and normal bam files. And then ran the following script in a server with 124GB RAM and 32 CPU :

python3 /home/ubuntu/anaconda3/envs/my_env/bin/makeSomaticScripts.py paired --normal-bam /home/ubuntu/efs/sridhar/sreejita/gatk_resources/gatk_output/new_NCC2-BL.realign.recal.bam --tumor-bam /home/ubuntu/efs/sridhar/sreejita/gatk_resources/gatk_output/new_NCC2-T.realign.recal.bam --genome-reference /home/ubuntu/efs/sridhar/sreejita/gatk_resources/genome_reference/Homo_sapiens_assembly38.fasta --output-directory ubuntu_test --dbsnp-vcf /home/ubuntu/efs/sridhar/sreejita/gatk_resources/genome_reference/Homo_sapiens_assembly38.dbsnp138.vcf --threads 10 --container-tech  docker --run-mutect2 --run-vardict --run-muse --run-lofreq --run-strelka2 --run-somaticseq --run-somaticsniper --run-varscan2 --run-workflow

Which gave the following errors ( muse had exit code for threads 2 and 4 , mutect2 had exit code of 1 for 10th thread ) of 139 image

image

Thereafter, when I tried running the 10th mutect2 failed job I got the Out of Memory error again , although I was monitoring memory usage and memory was not exhausted at that point : image

Where do you think the issue is possibly arising from at this point given that I am now using a 124GB RAM and 32 CPU server . Is there any input file preprocessing step that I might be missing ?

I have already made sure that the reference genome fasta folder also contains the dict and fai files, and the dnsnp vcf is bgzipped and tabix and idx files also generated in same folder

litaifang commented 1 year ago

It seems like the allocated memory wasn't enough for this Mutect2 thread. Trying editing that .cmd file to double the java memory allocation and see if it completes.

One suggestion I have is to use --inclusion-region to give it a bed file with only "major" chromosomes, with the region going from 0 to its chromosome length, i.e.,

chr1   0   248956422
chr2   0   242193529
....
chrX   0   156040895
chrY   0    57227415

Past experience tells me that sometimes those decoy, alt, and non-human contigs can take up huge coverages and blow up the memory requirements.

csreej27 commented 1 year ago

So as per your previous suggestions, after supplying the bed file with major chromosomes and using below mentioned command , somaticseq was running without issues and giving correct results.

But recently, using same commands and samples, when the command was run on another server we are getting new set of issues

python3 /tmp/somaticSeq/somaticseq/utilities/dockered_pipelines/makeSomaticScripts.py paired --normal-bam /path/to/normal.bam --tumor-bam /path/to/tumor.bam --genome-reference hg38.fa --inclusion-region human.bed --output-directory somatic_results --dbsnp-vcf new_sorted_hg38.vcf --threads 12 --container-tech docker --run-mutect2 --run-vardict --run-lofreq --run-strelka2 --run-somaticseq --run-somaticsniper --run-varscan2 --run-workflow --by-caller

image

image

I am not understanding why above error is coming suddenly when the other server has been set up in same manner as previous server and has all dependencies / requirements satisfied already

As a result of the above error - all other mutation callers are working but finally, when its coming to somaticseq- it fails

litaifang commented 1 year ago

What if you install somaticseq and just call makeSomaticScripts.py instead of python3 /tmp/somaticSeq/somaticseq/utilities/dockered_pipelines/makeSomaticScripts.py?