moiexpositoalonsolab / grenepipe

A flexible, scalable, and reproducible pipeline to automate variant calling from raw sequence reads, with lots of bells and whistles.
http://grene-net.org
GNU General Public License v3.0

freebayes causes early error about number of threads #17

Closed: bensprung closed this issue 2 years ago

bensprung commented 2 years ago

Hi Lucas, got a weird one for you. If I change the caller from haplotypecaller to freebayes, I get the error below. It's doubly strange because it seems to occur well before freebayes would be used in the pipeline.

[Sat Dec 11 11:13:02 2021]
rule samtools_stats:
    input: dedup/111D03-1.bam
    output: qc/samtools-stats/111D03-1.txt
    log: logs/samtools-stats/111D03-1.log
    jobid: 19
    benchmark: benchmarks/samtools-stats/111D03-1.bench.log
    wildcards: sample=111D03, unit=1
    resources: tmpdir=/tmp

[Sat Dec 11 11:13:03 2021]
Finished job 20.
8 of 54 steps (15%) done
Select jobs to execute...
WorkflowError:
Job needs threads=5 but only threads=3 are available. This is likely because two jobs are connected via a pipe and have to run simultaneously. Consider providing more resources (e.g. via --cores).
Activating conda environment: /home/ben/grenepipe_111D03_S288C/.snakemake/conda/76c2600e72d572e62d62105144f7b21f
/usr/bin/bash: qc/samtools-stats/111D03-1.txt: No such file or directory
Traceback (most recent call last):
  File "/home/ben/grenepipe-master-2021Oct11/.snakemake/scripts/tmpfwffbzki.wrapper.py", line 21, in <module>
    shell("samtools stats {extra} {snakemake.input}"
  File "/home/ben/mambaforge/envs/snakemake/lib/python3.9/site-packages/snakemake/shell.py", line 263, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'samtools stats  dedup/111D03-1.bam  > qc/samtools-stats/111D03-1.txt  2> logs/samtools-stats/111D03-1.log' returned non-zero exit status 1.
[Sat Dec 11 11:13:04 2021]
Error in rule samtools_stats:
    jobid: 19
    output: qc/samtools-stats/111D03-1.txt
    log: logs/samtools-stats/111D03-1.log (check log file(s) for error message)
    conda-env: /home/ben/grenepipe_111D03_S288C/.snakemake/conda/76c2600e72d572e62d62105144f7b21f

RuleException:
CalledProcessError in line 98 of /home/ben/grenepipe-master-2021Oct11/rules/qc.smk:
Command 'source /home/ben/mambaforge/bin/activate '/home/ben/grenepipe_111D03_S288C/.snakemake/conda/76c2600e72d572e62d62105144f7b21f'; /home/ben/mambaforge/envs/snakemake/bin/python3.9 /home/ben/grenepipe-master-2021Oct11/.snakemake/scripts/tmpfwffbzki.wrapper.py' returned non-zero exit status 1.
  File "/home/ben/grenepipe-master-2021Oct11/rules/qc.smk", line 98, in __rule_samtools_stats
  File "/home/ben/mambaforge/envs/snakemake/lib/python3.9/concurrent/futures/thread.py", line 52, in run
Waiting at most 5 seconds for missing files.
MissingOutputException in line 177 of /home/ben/grenepipe-master-2021Oct11/rules/qc.smk:
Job Missing files after 5 seconds:
qc/picard/111D03-1.alignment_summary_metrics
qc/picard/111D03-1.base_distribution_by_cycle_metrics
qc/picard/111D03-1.base_distribution_by_cycle.pdf
qc/picard/111D03-1.gc_bias.detail_metrics
qc/picard/111D03-1.gc_bias.summary_metrics
qc/picard/111D03-1.gc_bias.pdf
qc/picard/111D03-1.insert_size_metrics
qc/picard/111D03-1.insert_size_histogram.pdf
qc/picard/111D03-1.quality_by_cycle_metrics
qc/picard/111D03-1.quality_by_cycle.pdf
qc/picard/111D03-1.quality_distribution_metrics
qc/picard/111D03-1.quality_distribution.pdf
qc/picard/111D03-1.quality_yield_metrics
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 22 completed successfully, but some output files are missing. 22
lczech commented 2 years ago

Hi Ben,

interesting. There are several things that might be off here.

Job needs threads=5 but only threads=3 are available. This is likely because two jobs are connected via a pipe and have to run simultaneously. Consider providing more resources (e.g. via --cores).

You might want to try using more cores to resolve this warning. However, that is probably not the cause here.

The problem here seems to be with samtools stats, which is a quality control tool that is independent of freebayes and GATK HaplotypeCaller, and can hence be executed by the pipeline at any point. It might therefore be coincidence that it fails before freebayes is even started, and you might run into the same problem with HaplotypeCaller as well. Hence: did the same configuration work for you when only the calling tool was changed? Was that with the exact same grenepipe version?

One thing that might help: could you please post the samtools stats log file here? It should be at logs/samtools-stats/111D03-1.log.

Cheers and so long, Lucas

bensprung commented 2 years ago

I'm actually running on a pretty old 4-core machine so I only allocate 3 cores, but yes--it works fine with HaplotypeCaller set instead of freebayes and everything else exactly the same. Unfortunately, the log file logs/samtools-stats/111D03-1.log is not actually created so I can't post it (you can see it complaining that it doesn't exist in the error message above). The logs/samtools-stats directory exists but there are no files in it.

Interestingly, for the runs where HaplotypeCaller is used (with success), that samtools-stats log file exists but it's totally empty.

bensprung commented 2 years ago

I'm back from other projects and trying to troubleshoot this. Is there a way, for a particular config.yaml, to get grenepipe to produce a list of the commands it is going to run, without actually running them? I was hoping to run each step by hand, as it were, to get a more precise sense of what the issue is.

Also, if I allocate another core, I just get Job needs threads=6 but only threads=4 are available.

lczech commented 2 years ago

Hi @bensprung,

thanks for digging into this!

Is there a way, for a particular config.yaml, to get grenepipe to produce a list of the commands that it is going to do, without actually running them?

Kind of. Snakemake offers the option --dry-run to list all rules that are going to be executed; see the Snakemake documentation. This will give you the tools and their input and output files, but you will have to somehow cobble together the actual command lines to execute. I don't think there is another way, as the construction of the command lines and their subsequent execution are part of scripts that are executed as a whole by snakemake.
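
For reference, a minimal sketch of such a dry run, assuming you start grenepipe via snakemake with --use-conda and --directory as in the setup docs (the path is taken from your log above; adjust everything to your actual invocation). The --printshellcmds option additionally prints the shell command of rules that define one directly; wrapper/script-based rules such as samtools_stats build their command internally, so those will not show their full command here:

    # List all jobs that would be executed, without running anything:
    snakemake --use-conda --cores 3 --directory /home/ben/grenepipe_111D03_S288C --dry-run

    # Same, but also print the shell commands where rules define them directly:
    snakemake --use-conda --cores 3 --directory /home/ben/grenepipe_111D03_S288C --dry-run --printshellcmds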

However, I am not entirely sure that this is necessary. According to your above log, you can see which tools fail, can you not? So you'd only need to execute the failing ones by hand, I think.

Also, if I allocate another core, I just get Job needs threads=6 but only threads=4 are available.

As for that, yes, I see. I am developing on my 8-core (16 with hyperthreading) laptop, and the pipeline is mostly geared towards even larger systems such as clusters. Hence, I did not optimize it for smaller laptops. However, changing the pipeline to run without warnings on 4 cores or fewer does not make much sense to me: such a change would make it slower on larger systems and datasets. And for small datasets that can be run on your laptop, it does not matter much anyway - you should be able to just use --cores 6. This will of course oversubscribe your cores, so your laptop will be slow while the pipeline is running, but it should work. For larger datasets where this is inconvenient, I would suggest using a larger machine or a cluster anyway.
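
For concreteness, that would look roughly like this, assuming the same invocation style as in the sketch above (keep whatever other options you normally use; only --cores changes):

    # Over-subscribe the 4 physical cores so that the piped jobs fit:
    snakemake --use-conda --cores 6 --directory /home/ben/grenepipe_111D03_S288C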

Let me know if that helped or if you need any further input for now! Worst case, send me your data, and I can help debugging.

Cheers and a happy holiday season! Lucas

bensprung commented 2 years ago

Well I can't really tell tbh. It seems like samtools-stats is failing. I will look at --dry-run.

The strange thing is that I can run with 1 core with the default tools and get no errors. It's only changing the caller to freebayes that creates this error, and it happens very early in the pipeline. Seems weird?

lczech commented 2 years ago

That is indeed weird. The core issue should at most lead to snakemake complaining or failing, but these errors seem to come from the tools being run, not from snakemake itself... I did have issues in the past where one tool complained, but another was at fault, by producing erroneous or empty output files. You could check that the files that samtools stats wants to use (e.g., dedup/111D03-1.bam) are correct.
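
For example, with plain samtools, run from within your analysis directory so that the relative path matches the log above:

    # Quick integrity check of the deduplicated BAM; quickcheck is silent on success:
    samtools quickcheck -v dedup/111D03-1.bam && echo "BAM looks ok"

    # Re-run the failing command by hand and look at the beginning of its output:
    samtools stats dedup/111D03-1.bam | head -n 25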

If that does not help - would you mind sharing your data or part of it with me?

bensprung commented 2 years ago

Hrm, I don't think it's a problem with the bam file, because it turns out (surprisingly) that I get the same error with --dry-run, but only if calling-tool is set to freebayes. If it is set to haplotypecaller or bcftools it completes without issue. I'll attach the output of --dry-run for all three.

Happy to send the data if you think it makes sense.

freebayes.txt haplotypecaller.txt bcftools.txt

lczech commented 2 years ago

Oh interesting, that error is indeed simply caused by too few cores. I thought snakemake would handle that differently, sorry about that. As said above, just run it with --cores 6 - that should work, but it will make your computer slow while the pipeline is running. As I would not recommend running the pipeline for any large dataset on a laptop anyway, that should not be a limitation, and it should hence suffice for testing ;-)

bensprung commented 2 years ago

Ok will try that. Any idea why it only happens with freebayes as the caller?

lczech commented 2 years ago

Yes, because the freebayes rules are implemented to use more cores by default, see the config file. You can change this setting as well (instead of changing --cores) to help with the issue.
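
For illustration, the relevant part of config.yaml looks roughly like this; the exact nesting and the default value may differ between grenepipe versions, so please adapt it to the keys you actually find in your config file:

    # config.yaml (excerpt; keys shown here are illustrative)
    params:
      freebayes:
        # Threads used by the freebayes calling jobs; lower this to fit a small machine,
        # or leave it as-is and increase snakemake's --cores instead.
        threads: 2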

bensprung commented 2 years ago

Got it. So, changing threads: 8 for freebayes in the config file didn't yield a completed run (I tried ramping it down all the way to 1 but still continued to get various odd errors), but running snakemake with --cores 8 (which I didn't think I could do with only 4 physical cores) worked. Thank you!

lczech commented 2 years ago

Ah nice, glad to hear it worked out! Closing the issue now, but feel free to re-open if needed.

Some things remain though:

I tried ramping it down all the way to 1 but still continued to get various odd errors

Hm, what exactly happened there?

which I didn't think I could do with only 4 physical cores

Ah yes, it's possible to over-subscribe your cores. As said, that will make your computer slow for a while, but it is absolutely okay to do, technically speaking.