harvardinformatics / snpArcher

Snakemake workflow for highly parallel variant calling designed for ease-of-use in non-model organisms.
MIT License

Error in concat_gvcfs #214

Open exkahn opened 1 month ago

exkahn commented 1 month ago

Hi, I've been trying to run snpArcher on genomic resequencing data, but I've been running into an error with the concat_gvcfs step.

The command I'm using is

snakemake --use-conda --cores 16

The end of the .snakemake/log file lists an error that looks like this:

Activating conda environment: .snakemake/conda/b78e203ca0a7037a682091b03d0f69dd_
[Tue Jul 30 16:00:53 2024]
Error in rule concat_gvcfs:
    jobid: 14
    input: results/GCA_026437355.1/interval_gvcfs/EucyKris_LasFloCk_90_En_187/0000.raw.g.vcf.gz,
        results/GCA_026437355.1/interval_gvcfs/EucyKris_LasFloCk_90_En_187/0001.raw.g.vcf.gz,
        results/GCA_026437355.1/interval_gvcfs/EucyKris_LasFloCk_90_En_187/0002.raw.g.vcf.gz,
        results/GCA_026437355.1/interval_gvcfs/EucyKris_LasFloCk_90_En_187/0003.raw.g.vcf.gz,
        [...more .raw.g.vcf.gz files to 0021...]
        results/GCA_026437355.1/interval_gvcfs/EucyKris_LasFloCk_90_En_187/0000.raw.g.vcf.gz.tbi,
        results/GCA_026437355.1/interval_gvcfs/EucyKris_LasFloCk_90_En_187/0001.raw.g.vcf.gz.tbi,
        [...more .raw.g.vcf.gz.tbi files to 0021]
    log: logs/GCA_026437355.1/concat_gvcfs/EucyKris_LasFloCk_90_En_187.txt (check log file(s) for error details)
    conda-env: /data/home/miabrecht/CCGPTidewaterGoby/snpArcher/.snakemake/conda/b78e203ca0a7037a682091b03d0f69dd
    shell:

This error follows several identical steps for other samples, which appear to have succeeded. The log files in logs/concat_gvcfs are all empty .txt files. My config file is attached (converted to .txt): config.txt. I haven't been using a profile YAML file. I'd appreciate any help, thanks!

cademirch commented 1 month ago

Hi @exkahn, thanks for opening an issue. If I understand correctly, the concat_gvcfs step works for other samples?

exkahn commented 1 month ago

Yes, kind of -- it worked for the first three samples. However, when I took out EucyKris_LasFloCk_90_En_187 (the sample where the error is happening), two different samples caused the same error.

tsackton commented 1 month ago

Hmm. This is a little tricky to debug without any error logs, unfortunately. One common problem with concat steps is running out of temp space. You might try setting bigtmp in the config to a local directory that you know has lots of space and see if that helps.
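Before changing bigtmp, it may help to check how much free space the candidate directories actually have. Here is a minimal sketch using only the Python standard library; it is not part of snpArcher, the helper names free_gib and report are made up for illustration, and the directories checked (the system temp directory and the current working directory) are only assumptions about where temp files might land on your system.

```python
import os
import shutil
import tempfile


def free_gib(path):
    """Return the free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def report(paths):
    """Map each existing directory in `paths` to its free space in GiB.

    Paths that are not existing directories are skipped silently.
    """
    return {p: round(free_gib(p), 1) for p in paths if os.path.isdir(p)}


if __name__ == "__main__":
    # Assumed candidates: the default temp dir and the working directory.
    # Replace these with the directory you are considering for bigtmp.
    candidates = [tempfile.gettempdir(), os.getcwd()]
    for path, gib in report(candidates).items():
        print(f"{path}: {gib} GiB free")
```

If the default temp directory turns out to be small compared with the combined size of the interval gVCFs for a sample, that would be consistent with the temp-space explanation, though it would not prove it.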

cademirch commented 3 weeks ago

Hi @exkahn, we've updated snpArcher to better capture the logs from these rules. If you pull the latest changes and rerun, the logs may give us more information to diagnose the problem.
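After rerunning, one quick way to see whether any log files now contain output is to list them by whether they are empty. This is only a sketch with the Python standard library; summarize_logs is a hypothetical helper, not a snpArcher function, and it assumes the logs are .txt files under a logs directory in the working directory, as in the paths shown in the error above.

```python
from pathlib import Path


def summarize_logs(log_dir):
    """Return (non_empty, empty) lists of .txt files found under log_dir.

    Both lists are sorted by path. A missing directory gives two empty lists.
    """
    base = Path(log_dir)
    if not base.is_dir():
        return [], []
    non_empty, empty = [], []
    for path in sorted(base.rglob("*.txt")):
        if path.stat().st_size > 0:
            non_empty.append(path)
        else:
            empty.append(path)
    return non_empty, empty


if __name__ == "__main__":
    # "logs" is an assumption based on the log paths in the error message.
    non_empty, empty = summarize_logs("logs")
    print(f"{len(non_empty)} log file(s) with content, {len(empty)} empty")
    for path in non_empty:
        print(path)
```

Any non-empty file under the concat_gvcfs log directory for a failing sample would be the most useful thing to post here.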