harvardinformatics / snpArcher

Snakemake workflow for highly parallel variant calling designed for ease-of-use in non-model organisms.
MIT License

Error in rule concat_gvcfs #102

Status: Closed (ediamant closed this issue 3 months ago)

ediamant commented 1 year ago

Hello! Thank you for the earlier help. I'm relatively new to bioinformatics. I am assembling 66 genomes of Dark-eyed Juncos, and everything was going well until the 35th genome. I checked that all the files listed in the input for the two birds (2591-24071 and 2591-24073) are there (which they are) and that there is memory available (which there is). Any guidance would be helpful. Thank you very much for your time! Here's the log and the error:

Activating conda environment: .snakemake/conda/45afd75cf38b3a515c78e0d58256e283_
Writing to /work/6777787.1.pod_smp.q
Checking the headers and starting positions of 50 files
Cleaning
[E::hts_open_format] Failed to open file "/work/6777787.1.pod_smp.q/00006.bcf" : No such file or directory
Cannot write /work/6777787.1.pod_smp.q/00006.bcf: No such file or directory
Cleaning
Done
[Sun Apr 2 21:09:26 2023]
Error in rule concat_gvcfs:
jobid: 2670
input: results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0000.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0001.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0002.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0003.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0004.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0005.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0006.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0007.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0008.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0009.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0010.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0011.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0012.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0013.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0014.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0015.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0016.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0017.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0018.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0019.raw.g.vcf.gz, 
results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0020.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0021.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0022.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0023.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0024.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0025.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0026.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0027.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0028.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0029.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0030.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0031.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0032.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0033.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0034.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0035.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0036.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0037.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0038.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0039.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0040.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0041.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0042.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0043.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0044.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0045.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0046.raw.g.vcf.gz, 
results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0047.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0048.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0049.raw.g.vcf.gz, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0000.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0001.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0002.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0003.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0004.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0005.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0006.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0007.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0008.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0009.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0010.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0011.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0012.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0013.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0014.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0015.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0016.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0017.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0018.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0019.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0020.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0021.raw.g.vcf.gz.tbi, 
results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0022.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0023.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0024.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0025.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0026.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0027.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0028.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0029.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0030.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0031.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0032.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0033.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0034.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0035.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0036.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0037.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0038.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0039.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0040.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0041.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0042.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0043.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0044.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0045.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0046.raw.g.vcf.gz.tbi, 
results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0047.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0048.raw.g.vcf.gz.tbi, results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0049.raw.g.vcf.gz.tbi
output: results/GCA_003829775.2/gvcfs/USGS_2591_24071.g.vcf.gz, results/GCA_003829775.2/gvcfs/USGS_2591_24071.g.vcf.gz.tbi
log: logs/GCA_003829775.2/concat_gvcfs/USGS_259124071.txt (check log file(s) for error details)
conda-env: /u/project/pamelaye/eldiaman/Juncos/ElliesProjects/snpArcher/.snakemake/conda/45afd75cf38b3a515c78e0d58256e283
shell:

    bcftools concat -D -a -Ou results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0000.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0001.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0002.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0003.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0004.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0005.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0006.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0007.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0008.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0009.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0010.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0011.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0012.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0013.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0014.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0015.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0016.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0017.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0018.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0019.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0020.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0021.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0022.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0023.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0024.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0025.raw.g.vcf.gz 
results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0026.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0027.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0028.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0029.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0030.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0031.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0032.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0033.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0034.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0035.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0036.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0037.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0038.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0039.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0040.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0041.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0042.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0043.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0044.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0045.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0046.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0047.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0048.raw.g.vcf.gz results/GCA_003829775.2/interval_gvcfs/USGS_2591_24071/0049.raw.g.vcf.gz | bcftools sort -T /work/6777787.1.pod_smp.q -Oz -o results/GCA_003829775.2/gvcfs/USGS_2591_24071.g.vcf.gz -
    tabix -p vcf results/GCA_003829775.2/gvcfs/USGS_2591_24071.g.vcf.gz

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

tsackton commented 1 year ago

This error often results from a lack of temporary space. Do you have the "big temp" option set in the config? If not, I would try setting it to a directory that has a lot of free space.
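Before pointing the temp option at a new directory, it can help to confirm that the directory exists, is writable, and has room to spare. In the log above, `bcftools sort` fails while writing a temporary file under the directory passed to `-T`. The following is a minimal sketch of such a check, not part of snpArcher or bcftools: the function name `check_temp_dir` and the 50 GiB threshold are hypothetical, and the right threshold depends on the size of your data.

```python
import os
import shutil
import tempfile


def check_temp_dir(path, min_free_bytes):
    """Return a list of problems found with a candidate temp directory.

    An empty list means the directory exists, is writable, and has at
    least `min_free_bytes` of free space.
    """
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path} does not exist or is not a directory")
        return problems

    # Create and automatically delete a small file to confirm write access.
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append(f"cannot write to {path}: {exc}")

    # Compare the free space on the filesystem holding `path` to the threshold.
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(
            f"{path} has {free} bytes free, fewer than the {min_free_bytes} requested"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical values: substitute the directory you plan to use as
    # temporary space and a threshold that suits your data.
    candidate = tempfile.gettempdir()
    for problem in check_temp_dir(candidate, min_free_bytes=50 * 1024**3):
        print(problem)
```

If the script prints nothing, the directory passed these basic checks. That does not guarantee the job will succeed, since free space can change while the workflow runs and scheduler-managed scratch directories may be removed between jobs.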

ediamant commented 1 year ago

Thank you! I'm trying that now.


-- Ellie Diamant Ph.D. Candidate Yeh Lab https://faculty.eeb.ucla.edu/Yeh/, Ecology and Evolutionary Biology, UCLA Associate Director, Counterforce Lab https://counterforcelab.org/, Design | Media Arts, UCLA she/her elliediamant.wordpress.com

cademirch commented 10 months ago

Hi @ediamant, just wanted to check if you had resolved this?

ediamant commented 10 months ago

Hi,

Yes, I had! Apologies, I thought I had updated. Thank you for checking.

Best,
Ellie
