PapenfussLab / gridss

GRIDSS: the Genomic Rearrangement IDentification Software Suite

Fatal error: missing assembly is not missing #674

Closed: afurches closed this issue 2 months ago

afurches commented 3 months ago

Hi,

When running gridss -s call, my job failed after 12 hours with the error below. However, the BAM file in question (1909.bam) is NOT missing. Here is the tail of the log:

...
INFO    2024-08-24 01:55:26 ProcessExecutor null device
INFO    2024-08-24 01:55:26 ProcessExecutor           1
[Sat Aug 24 01:55:26 EDT 2024] gridss.analysis.CollectGridssMetrics done. Elapsed time: 2.48 minutes.
Runtime.totalMemory()=13027508224
[Sat Aug 24 01:55:26 EDT 2024] gridss.IdentifyVariants done. Elapsed time: 714.59 minutes.
Runtime.totalMemory()=13027508224
Exception in thread "main" java.lang.IllegalArgumentException: Fatal error: Missing assembly for 1909.bam. All input files must have a corresponding assembly.
    at au.edu.wehi.idsv.AssemblyEvidenceSource.validateAllCategoriesAssembled(AssemblyEvidenceSource.java:544)
    at gridss.cmdline.FullEvidenceCommandLineProgram.getAssemblySource(FullEvidenceCommandLineProgram.java:29)
    at gridss.IdentifyVariants.doWork(IdentifyVariants.java:31)
    at gridss.cmdline.MultipleSamFileCommandLineProgram.doWork(MultipleSamFileCommandLineProgram.java:173)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
    at gridss.IdentifyVariants.main(IdentifyVariants.java:26)

The head of the log file shows that 1909.bam was recognized as an input file:

Fri Aug 23 13:26:15 EDT 2024: Full log file is: ./gridss.full.20240823_132615.andes221.3994430.log
Fri Aug 23 13:26:15 EDT 2024: Found /usr/bin/time
Fri Aug 23 13:26:15 EDT 2024: Using GRIDSS jar /miniforge3_andes/envs/gridss_andes/share/gridss-2.13.2-3/gridss.jar
Fri Aug 23 13:26:15 EDT 2024: Using reference genome "/analyses/gridss/ref.fa"
Fri Aug 23 13:26:15 EDT 2024: Using output VCF /analyses/gridss/gridss-varcall.vcf
Fri Aug 23 13:26:15 EDT 2024: Using assembly bam /analyses/gridss/samtools-merge_merged.assembly.bam
Fri Aug 23 13:26:15 EDT 2024: Using 8 worker threads.
Fri Aug 23 13:26:15 EDT 2024: Using no blacklist bed. The encode DAC blacklist is recommended for hg19.
Fri Aug 23 13:26:15 EDT 2024: Using JVM maximum heap size of 30g for assembly and variant calling.
Fri Aug 23 13:26:16 EDT 2024: Using input file /data/1863.bam
Fri Aug 23 13:26:16 EDT 2024: Using input file /data/1909.bam
Fri Aug 23 13:26:16 EDT 2024: Using input file /data/1950.bam
...
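
The cross-check described above (the file named in the error against the input files listed at the head of the log) can be sketched in a few lines of Python. This is only an illustration: the regular expressions assume the log line formats shown in the excerpts above, and the function name cross_check is hypothetical, not part of GRIDSS.

```python
import os
import re

# Assumption: these patterns mirror the log lines excerpted in this issue.
INPUT_RE = re.compile(r"Using input file (\S+)")
MISSING_RE = re.compile(r"Missing assembly for (\S+)\. All input files")


def cross_check(log_text):
    """Pair each file named in a 'Missing assembly' error with matching inputs.

    Returns (inputs, missing, matched), where matched maps each file name
    from the error message to the input paths sharing that base name.
    """
    inputs = INPUT_RE.findall(log_text)
    missing = MISSING_RE.findall(log_text)
    matched = {
        name: [path for path in inputs if os.path.basename(path) == name]
        for name in missing
    }
    return inputs, missing, matched
```

An empty list for a name in matched would mean the file in the error was never listed as an input. A non-empty list, as expected for 1909.bam here, only confirms that the input was recognized; it does not inspect the assembly BAM itself.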

Has anyone encountered this?

Thanks, Anna