bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

No such file or directory error when not specifying variant_regions #2572

Closed paurods closed 5 years ago

paurods commented 5 years ago

This is my config file:

`details:

As you can see, I haven't specified variant_regions, expecting bcbio to use the "callable" regions of the BAM instead.

Then, after alignment, once variant calling starts, I get this error:

OSError: [Errno 2] No such file or directory: '/media/prodriguez/disk1/data/amontaner/Prova_Amuntaner/prova_analisi/analysis/NIPT1_S4/work/bedprep/NIPT1_S4_cfdna-sort-callable_sample-merged.bed.gz.tbi'

When I check, the file does exist. If I then rerun the analysis, it continues and finishes without problems.

This has happened in two independent analyses in which I hadn't specified variant_regions.

When I look for the error in the log files, I cannot find it.

These are the last log entries before the error:

[2018-11-12T19:44Z] tabix index somatic.indels-fixed.vcf.gz
[2018-11-12T19:44Z] Resource requests: ; memory: 1.00; cores: 1
[2018-11-12T19:44Z] Configuring 1 jobs to run, using 1 cores each with 1.00g of memory reserved for each job
[2018-11-12T19:44Z] Combine variant files
[2018-11-12T19:44Z] 20:44:25.422 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/media/prodriguez/disk1/Software/bcbio/anaconda/share/picard-2.18.11-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[2018-11-12T19:44Z] [Mon Nov 12 20:44:25 CET 2018] MergeVcfs INPUT=[/media/prodriguez/disk1/data/amontaner/Prova_Amuntaner/prova_analisi/analysis/NIPT1_S4/work/strelka2/chr1/NIPT1_S4-chr1_109821356_142543476-work/results/variants/somatic.snvs-fixed.vcf.gz, /media/prodriguez/disk1/data/amontaner/Prova_Amuntaner/prova_analisi/analysis/NIPT1_S4/work/strelka2/chr1/NIPT1_S4-chr1_109821356_142543476-work/results/variants/somatic.indels-fixed.vcf.gz] OUTPUT=/media/prodriguez/disk1/data/amontaner/Prova_Amuntaner/prova_analisi/analysis/NIPT1_S4/work/bcbiotx/tmpjmJYXP/NIPT1_S4-chr1_109821356_142543476-raw.vcf.gz SEQUENCE_DICTIONARY=/media/prodriguez/disk1/Software/bcbio/genomes/Hsapiens/hg19/seq/hg19.dict VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=true CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[2018-11-12T19:44Z] [Mon Nov 12 20:44:25 CET 2018] Executing as amontaner@QGENLAB024 on Linux 4.15.0-36-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_144-b01; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.18.11-SNAPSHOT
[2018-11-12T19:44Z] [Mon Nov 12 20:44:25 CET 2018] picard.vcf.MergeVcfs done. Elapsed time: 0.00 minutes.
[2018-11-12T19:44Z] Runtime.totalMemory()=760217600
[2018-11-12T19:44Z] Filtering Strelka2 calls with allele fraction threshold of 0.1
[2018-11-12T19:44Z] bgzip NIPT1_S4-chr1_109821356_142543476.vcf
[2018-11-12T19:44Z] tabix index NIPT1_S4-chr1_109821356_142543476.vcf.gz

None of this seems to be related to the error.

Although the analysis does eventually finish successfully, having to relaunch it is annoying. Do you have any idea what is happening?

chapmanb commented 5 years ago

Thanks for the question, and apologies for the issue. bcbio had some race conditions in which multiple processes simultaneously created tabix index files, and those produced similar errors. What version of bcbio are you running? Practically, from the config you specified I'm not sure why you'd see that problem here, since it should only be creating this file in a single process, but maybe I'm missing some name conflicts.
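To illustrate the kind of race described above: one common way to keep concurrent processes from seeing a half-written index is to write to a temporary file in the same directory and rename it into place. The following is only a minimal sketch of that general pattern, not bcbio's actual code; `write_atomically` and the `write_fn` callback are hypothetical names introduced for this example.

```python
import os
import tempfile


def write_atomically(final_path, write_fn):
    """Create final_path via a temporary file so readers never see a partial file.

    write_fn is a hypothetical callback that writes the content to the path
    it is given (for example, by running an indexing tool on that path).
    """
    if os.path.exists(final_path):
        # Another process already finished the work; nothing to do.
        return final_path
    out_dir = os.path.dirname(os.path.abspath(final_path))
    # The temporary file lives in the target directory so the rename below
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    os.close(fd)
    try:
        write_fn(tmp_path)
        # os.replace is atomic on POSIX when source and target share a filesystem.
        os.replace(tmp_path, final_path)
    finally:
        # Clean up the temporary file if write_fn failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return final_path
```

With this pattern, if two processes race, both may do the work, but the final file is always complete when it appears; the later rename simply overwrites an identical result.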

The other issue that can happen on shared filesystems is that files can be slow to become available when the system is under load. If you're running a recent version of bcbio, then this is my most likely guess, and it would explain why you see the file present when you look at the filesystem later. This is harder to deal with from within bcbio, but if you could share the full traceback of the error we might be able to identify a way to wait for, or otherwise recover from, the missing file.
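As a rough idea of what "waiting" for a slow file could look like, here is a minimal polling sketch. It is an assumption about one possible approach, not something bcbio is known to do; the function name `wait_for_file` and the default `timeout` and `interval` values are made up for illustration.

```python
import os
import time


def wait_for_file(path, timeout=30.0, interval=0.5):
    """Poll until path exists and is non-empty, or until timeout seconds pass.

    Returns True if the file became available, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # getsize raises OSError if the file is not visible yet.
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would check the return value before opening the index, for example `if not wait_for_file(tbi_path): raise OSError(...)`, so that a file that is genuinely missing still produces an error after the timeout.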

Thanks again and hope this helps.

roryk commented 5 years ago

Closing as there hasn't been any followup.