nf-cmgg / structural

A bioinformatics best-practice analysis pipeline for calling structural variants (SVs), copy number variants (CNVs) and repeat region expansions (RREs) from short DNA reads
https://nf-cmgg.github.io/structural/
MIT License

smoove makes empty file #96

Open mvheetve opened 1 month ago

mvheetve commented 1 month ago

Description of the bug

Hi,

I ran into the error below, which stems from an empty smoove.vcf.gz input file (D2321763-smoove.vcf.gz in the command below). I'm not entirely sure, but I believe the file is empty because smoove was unable to call any SVs; this is again a sample from a very low-coverage WGS experiment. I suggest adding a check that each sample's smoove.vcf.gz is not empty and, if it is, writing a smoove.vcf.gz that at least contains a proper header.
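A rough sketch of the kind of check meant here, not the pipeline's actual code. The function name `ensure_vcf_header`, the minimal header contents, and the use of plain `gzip` are all assumptions made for illustration. In the pipeline, bgzip would presumably be the better fit, and a real header would likely also need the contig lines for the reference.

```bash
#!/usr/bin/env bash
# Hypothetical helper: make sure a per-sample smoove VCF has at least a header.
# The function name, the header contents and plain gzip are assumptions.

ensure_vcf_header() {
    local vcf="$1" sample="$2"
    local first_line=""

    # Read the first decompressed line; tolerate a 0-byte or non-gzip file.
    if [ -s "$vcf" ]; then
        first_line="$(gzip -cd "$vcf" 2>/dev/null | head -n 1 || true)"
    fi

    # A VCF starts with the ##fileformat line; leave such files untouched.
    case "$first_line" in
        '##fileformat=VCF'*) return 0 ;;
    esac

    # Otherwise overwrite the file with a minimal header-only VCF for this sample.
    printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t%s\n' "$sample" \
        | gzip -c > "$vcf"
}
```

It could be called as `ensure_vcf_header D2321763-smoove.vcf.gz D2321763` before the sort step, so that a file with a valid header passes through unchanged and only an empty or unreadable file gets replaced.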

Regards,
M

#!/bin/bash -euo pipefail
bcftools \
    sort \
    --output D2321763.smoove.vcf.gz \
    --temp-dir . \
    --output-type z \
    D2321763-smoove.vcf.gz

cat <<-END_VERSIONS > versions.yml
"NFCMGG_STRUCTURAL:STRUCTURAL:BAM_SV_CALLING:BAM_VARIANT_CALLING_SMOOVE:BCFTOOLS_SORT":
    bcftools: $(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*$//')
END_VERSIONS
(base) [vsc43079@gligar07 33c20c782deafa01be0e33539c3474]$ cat .command.lo
cat: .command.lo: No such file or directory
(base) [vsc43079@gligar07 33c20c782deafa01be0e33539c3474]$ cat .command.log
INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
Writing to .5qqIE3
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Could not read VCF/BCF headers from D2321763-smoove.vcf.gz
Cleaning

Command used and terminal output

No response

Relevant files

No response

System information

No response

nvnieuwk commented 1 month ago

Do you still have the smoove logs? I have some test data for which smoove also returns a VCF file without any variants, but that file still contains a header. So I first want to make sure this isn't some issue in smoove itself that can be fixed.