Closed Fer020707 closed 2 years ago
For the reference file ucsc.hg19.fasta, make sure there is also a ucsc.hg19.dict in the same directory. You can create it with picard CreateSequenceDictionary.
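The AssertionError reported in this issue comes from a check that the reference dictionary exists. Below is a minimal pre-flight sketch of the same idea. It assumes the dictionary is named by swapping the FASTA's last extension for .dict (as with ucsc.hg19.fasta and ucsc.hg19.dict). The helper names `expected_dict_path` and `check_reference` are hypothetical and not part of SomaticSeq.

```python
from pathlib import Path


def expected_dict_path(fasta_path):
    """Return the .dict path expected beside the FASTA.

    Assumption: the dictionary shares the FASTA's name with the final
    extension replaced by .dict (ucsc.hg19.fasta -> ucsc.hg19.dict).
    """
    return Path(fasta_path).with_suffix(".dict")


def check_reference(fasta_path):
    """Return (ok, message) saying whether the dictionary is in place."""
    dict_path = expected_dict_path(fasta_path)
    if dict_path.exists():
        return True, f"found {dict_path}"
    return False, f"missing {dict_path}; create it with picard CreateSequenceDictionary"
```

Running `check_reference("/home/fer/Documents/ref/ucsc.hg19.fasta")` before generating the scripts would report a missing dictionary up front instead of failing inside the Scalpel step.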
Thanks for your reply. After placing them in the same directory, I was able to run the mutect2 and vardict scripts; however, I got an error when running lofreq.
Error message:
FATAL(lofreq_call.c|main_call:1293): Cowardly refusing to overwrite file '/9a7554e99a3447159943ca6a6d61ce78/PruebaVCF/LoFreq.vcf'. Exiting...
INFO 2021-12-07 10:47:38,418 run_script FINISHED RUNNING PruebaVCF/logs/lofreq.2021.12.07.10.04.45.214.cmd in 3.653 seconds with an exit code of 1.
INFO 2021-12-07 10:47:38,441 run_script bash PruebaVCF/logs/strelka.2021.12.07.10.04.45.214.cmd
Start at 2021/12/07 10:47:38
[E::hts_idx_push] Chromosome blocks not continuous
tbx_index_build failed: /bb2f0cfafb4a4aca82b3254c59a9eda6/PruebaVCF/QIAGEN_NGHS-013X-Covered-modificado.bed.gz
INFO 2021-12-07 10:47:45,939 run_script FINISHED RUNNING PruebaVCF/logs/strelka.2021.12.07.10.04.45.214.cmd in 7.498 seconds with an exit code of 1.
INFO 2021-12-07 10:47:46,369 run_script bash PruebaVCF/SomaticSeq/logs/somaticSeq.2021.12.07.10.04.45.214.cmd
Start at 2021/12/07 10:47:46
INFO 2021-12-07 16:47:50,531 SomaticSeq SomaticSeq Input Arguments: output_directory=/937888568af14a718deb7ed8118def62/PruebaVCF/SomaticSeq, genome_reference=/07b76ad939fd4e7581aabfad8dbe0c9b/ucsc.hg19.fasta, truth_snv=None, truth_indel=None, classifier_snv=None, classifier_indel=None, pass_threshold=0.5, lowqual_threshold=0.1, algorithm=xgboost, homozygous_threshold=0.85, heterozygous_threshold=0.01, minimum_mapping_quality=1, minimum_base_quality=5, minimum_num_callers=0.5, dbsnp_vcf=None, cosmic_vcf=None, inclusion_region=/598a5e03ee5c441d9da37a55f2b54881/QIAGEN_NGHS-013X-Covered-modificado.bed, exclusion_region=None, threads=1, somaticseq_train=False, seed=0, tree_depth=12, iterations=None, features_excluded=[], keep_intermediates=False, bam_file=/1af2c713e6f4492fb973d8fea1330fdb/IMSS_111_CKDL190141429-1a-DY0088-AK1680_H55VHBBXXL5.fastq.trimmed.bam, sample_name=TUMOR, mutect_vcf=None, mutect2_vcf=/937888568af14a718deb7ed8118def62/PruebaVCF/MuTect2.vcf, varscan_vcf=None, vardict_vcf=/937888568af14a718deb7ed8118def62/PruebaVCF/VarDict.vcf, lofreq_vcf=/937888568af14a718deb7ed8118def62/PruebaVCF/LoFreq.vcf, scalpel_vcf=/937888568af14a718deb7ed8118def62/PruebaVCF/Scalpel.vcf, strelka_vcf=/937888568af14a718deb7ed8118def62/PruebaVCF/Strelka/results/variants/variants.vcf.gz, which=single
/bin/sh: 1: cannot create /937888568af14a718deb7ed8118def62/PruebaVCF/Strelka/results/variants/variants.vcfc6e1aa5895194b7f9ebc14c8fe352472.gz: Directory nonexistent
Error: Unable to open file /937888568af14a718deb7ed8118def62/PruebaVCF/Strelka/results/variants/variants.vcf.gz. Exiting.
Traceback (most recent call last):
File "/opt/somaticseq/somaticseq/run_somaticseq.py", line 457, in
From the first error message, FATAL(lofreq_call.c|main_call:1293): Cowardly refusing to overwrite file '/9a7554e99a3447159943ca6a6d61ce78/PruebaVCF/LoFreq.vcf'. Exiting...
You need to delete the existing LoFreq.vcf file, because LoFreq will not overwrite a VCF file that is already there.
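If the workflow is rerun often, the cleanup can be scripted. The following is only a sketch: `remove_stale_outputs` is a hypothetical helper, and the default file name LoFreq.vcf is taken from the error message above.

```python
from pathlib import Path


def remove_stale_outputs(output_dir, names=("LoFreq.vcf",)):
    """Delete leftover result files so a rerun can recreate them.

    Returns the list of paths that were actually removed.
    """
    removed = []
    for name in names:
        target = Path(output_dir) / name
        if target.is_file():
            target.unlink()
            removed.append(target)
    return removed
```

For the run in this thread, the call would be `remove_stale_outputs("PruebaVCF")`, made before relaunching the lofreq script.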
Later:
tbx_index_build failed: /bb2f0cfafb4a4aca82b3254c59a9eda6/PruebaVCF/QIAGEN_NGHS-013X-Covered-modificado.bed.gz
You may want to make sure the input BED file is sorted. You can sort it with vcfsorter.pl hg19.dict modificado.bed > modificado_ordered.bed, then use the sorted BED file as input and see what happens.
Also make sure bedtools is installed so that the intersectBed command is in the path.
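The "Chromosome blocks not continuous" error means the lines for at least one chromosome are interrupted by lines for another. Here is a minimal Python sketch of checking for that and of sorting a BED file by the contig order in a sequence dictionary. It is an illustration, not the vcfsorter.pl script mentioned above. All function names are hypothetical, and it assumes tab-separated BED lines and @SQ lines that carry an SN: field.

```python
HEADER_PREFIXES = ("#", "track", "browser")


def read_contig_order(dict_lines):
    """Map contig name -> rank, read from the @SQ lines of a .dict file."""
    order = {}
    for line in dict_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    order[field[3:]] = len(order)
    return order


def blocks_are_contiguous(bed_lines):
    """True if every chromosome appears as one uninterrupted run of lines."""
    seen, previous = set(), None
    for line in bed_lines:
        if not line.strip() or line.startswith(HEADER_PREFIXES):
            continue
        chrom = line.split("\t")[0]
        if chrom != previous:
            if chrom in seen:  # chromosome reappears after another one
                return False
            seen.add(chrom)
            previous = chrom
    return True


def sort_bed(bed_lines, contig_order):
    """Sort records by dictionary contig order, then by start and end.

    Header lines are dropped; a contig missing from the dictionary
    raises KeyError rather than being silently misplaced.
    """
    records = [
        line for line in bed_lines
        if line.strip() and not line.startswith(HEADER_PREFIXES)
    ]

    def key(line):
        fields = line.rstrip("\n").split("\t")
        return (contig_order[fields[0]], int(fields[1]), int(fields[2]))

    return sorted(records, key=key)
```

Checking `blocks_are_contiguous` on the original BED lines before running Strelka would flag the problem earlier than the tabix indexing step does.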
Thank you very much for your reply. When removing the bed file I got an error when running Strelka:
Error message:
[2021-12-08T06:45:03.557825Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] Failed to complete sub-workflow task: 'EstimateSeqErrorParams+Sample000' launched from sub-workflow 'EstimateSeqErrorParams', failed sub-workflow classname: 'EstimateSequenceErrorWorkflowForSample'
[2021-12-08T06:45:03.557989Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] [EstimateSeqErrorParams+Sample000] Error Message:
[2021-12-08T06:45:03.558028Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] [EstimateSeqErrorParams+Sample000] Unhandled Exception in TaskRunner-Thread-EstimateSeqErrorParams+Sample000
[2021-12-08T06:45:03.558111Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] [EstimateSeqErrorParams+Sample000] Traceback (most recent call last):
[2021-12-08T06:45:03.558329Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] [EstimateSeqErrorParams+Sample000] File "/opt/strelka/lib/python/pyflow/pyflow.py", line 1069, in run
[2021-12-08T06:45:03.558366Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] [EstimateSeqErrorParams+Sample000] (retval, retmsg) = self._run()
[2021-12-08T06:45:03.558398Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] [EstimateSeqErrorParams+Sample000] File "/opt/strelka/lib/python/pyflow/pyflow.py", line 1121, in _run
[2021-12-08T06:45:03.558429Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] [EstimateSeqErrorParams+Sample000] self.workflow.workflow()
[2021-12-08T06:45:03.558575Z] [d9f51a288a39] [1_1] [TaskManager] [ERROR] [EstimateSeqErrorParams+Sample000] File "/opt/strelka/lib/python/strelkaSequenceErrorEstimation.py", line 426, in workflow
...
[2021-12-08T06:45:11.567268Z] [d9f51a288a39] [1_1] [WorkflowRunner] [ERROR] [EstimateSeqErrorParams+Sample000] raise Exception("Task memory requirement exceeds full available resources")
[2021-12-08T06:45:11.567359Z] [d9f51a288a39] [1_1] [WorkflowRunner] [ERROR] [EstimateSeqErrorParams+Sample000] Exception: Task memory requirement exceeds full available resources
[2021-12-08T06:45:11.567428Z] [d9f51a288a39] [1_1] [WorkflowRunner] [ERROR] Failed to complete sub-workflow task: 'EstimateSeqErrorParams' launched from master workflow, failed sub-workflow classname: 'EstimateSequenceErrorWorkflow'
Thanks
You don't have to remove the .bed file. Just make sure it is sorted. Also, how much memory does your computer have?
[2021-12-08T06:45:11.567268Z] [d9f51a288a39] [1_1] [WorkflowRunner] [ERROR] [EstimateSeqErrorParams+Sample000] raise Exception("Task memory requirement exceeds full available resources")
4 GB of RAM and 250 GB of storage. Will that be enough, or would it be better to install it on another computer?
Yeah, 4 GB may be too little. I'd say at least 16 GB for small analyses, and more if you're doing whole genome and need more parallel threads.
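A quick way to compare a machine against that guideline is sketched below. It assumes a Linux host, where `os.sysconf` exposes the page size and physical page count. The 16 GB default comes from the suggestion above and is not a documented Strelka requirement. Both function names are hypothetical.

```python
import os


def total_memory_gb():
    """Physical RAM in GiB, read through POSIX sysconf (assumes Linux)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 1024 ** 3


def has_enough_memory(available_gb, required_gb=16):
    """True if the available memory meets the suggested minimum."""
    return available_gb >= required_gb
```

With the numbers from this thread, `has_enough_memory(4)` is False and `has_enough_memory(100)` is True. When the callers run inside Docker, the container may see less memory than the host if a limit is configured.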
After correcting the BED file and increasing the memory (to 100 GB of RAM), everything works perfectly. Thank you very much for all your answers.
Hi, I am trying to run makeSomaticScripts.py in a conda environment, but it only generates the .cmd scripts. Thank you in advance for your response.
command: makeSomaticScripts.py single --bam IMSS_111_CKDL190141429-1a-DY0088-AK1680_H55VHBBXXL5.fastq.trimmed.bam --genome-reference /home/fer/Documents/ref/ucsc.hg19.fasta --output-directory PruebaVCF --dbsnp-vcf dbsnp/dbsnp_138.hg19.vcf --inclusion-region QIAGEN_NGHS-013X-Covered-modificado.bed --threads 1 --run-mutect2 --run-vardict --run-lofreq --run-scalpel --run-strelka2 --run-somaticseq --run-workflow
Error message:
PruebaVCF/logs/mutect2.2021.12.07.05.03.18.584.cmd
Traceback (most recent call last):
  File "/home/fer/anaconda3/envs/SomaticSeq/bin/makeSomaticScripts.py", line 4, in
    __import__('pkg_resources').run_script('SomaticSeq==3.6.3', 'makeSomaticScripts.py')
  File "/home/fer/anaconda3/envs/SomaticSeq/lib/python3.10/site-packages/pkg_resources/__init__.py", line 651, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/fer/anaconda3/envs/SomaticSeq/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1448, in run_script
    exec(code, namespace, namespace)
  File "/home/fer/anaconda3/envs/SomaticSeq/lib/python3.10/site-packages/SomaticSeq-3.6.3-py3.10.egg/EGG-INFO/scripts/makeSomaticScripts.py", line 493, in
    make_workflow( args, workflowArguments )
  File "/home/fer/anaconda3/envs/SomaticSeq/lib/python3.10/site-packages/SomaticSeq-3.6.3-py3.10.egg/EGG-INFO/scripts/makeSomaticScripts.py", line 398, in make_workflow
    scalpel_job = Scalpel.tumor_only( input_arguments, args.container_tech )
  File "/home/fer/anaconda3/envs/SomaticSeq/lib/python3.10/site-packages/SomaticSeq-3.6.3-py3.10.egg/somaticseq/utilities/dockered_pipelines/somatic_mutations/Scalpel.py", line 118, in tumor_only
    assert os.path.exists( input_parameters['reference_dict'] )
AssertionError
versions: python 3.10.0, conda 4.10.3, bedtools v2.30.0, Docker 10.20.11, R 4.1.2
REPOSITORY                  TAG        IMAGE ID      CREATED        SIZE
broadinstitute/gatk         latest     88e2886f9e27  4 weeks ago    4.5GB
lethalfang/somaticseq       latest     05bc8973a566  4 months ago   1.92GB
lethalfang/strelka          2.9.10     a1e636617459  18 months ago  300MB
lethalfang/vardictjava      1.7.0      5e281eb80bc4  19 months ago  851MB
lethalfang/jointsnvmix2     0.7.5      503df7957382  3 years ago    484MB
lethalfang/scalpel          0.5.4      cca8678e328b  3 years ago    527MB
lethalfang/lofreq           2.1.3.1-1  21c2cd913130  3 years ago    550MB
lethalfang/somaticsniper    1.0.5.0-2  55c8228fb895  4 years ago    465MB
marghoob/muse               1.0rc_c    5e40e5758410  5 years ago    132MB
djordjeklisic/sbg-varscan2  v1         0a3d079b6bc9  6 years ago    1.17GB