KatharinaHoff / braker-snake

Simple snakemake workflows for handling BRAKER on large data sets
MIT License

fastq-dump timeouts #1

Closed KatharinaHoff closed 6 months ago

KatharinaHoff commented 6 months ago

The rule file https://github.com/KatharinaHoff/braker-snake/blob/main/rules/rnaseq_download.smk contains a rule download_fastq that is prone to repeated failures caused by connection timeouts.

For now, I implemented it so that it picks up where it failed, so the expensive steps are not repeated. The problem can possibly be avoided by first downloading the run with prefetch (see https://twitter.com/lh3lh3/status/1779876367200387172). Also, fasterq-dump could be used to speed things up.
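
A rough sketch of the prefetch-then-convert idea, in Python since Snakemake rules can call Python directly. This is not the code from the repository: the function names, the directory names `sra` and `fastq`, the thread count, and the retry settings are all hypothetical, and it assumes that `prefetch` and `fasterq-dump` from the SRA Toolkit are on the PATH. The point is that only the prefetch step needs the network, so a timeout there can be retried cheaply before the conversion starts.

```python
import subprocess
import time


def build_commands(accession, sra_dir="sra", fastq_dir="fastq", threads=4):
    """Return two commands: prefetch first, then fasterq-dump on the local copy.

    Directory names and thread count are illustrative defaults only.
    """
    prefetch = ["prefetch", accession, "--output-directory", sra_dir]
    # prefetch is expected to write <sra_dir>/<accession>/<accession>.sra;
    # pointing fasterq-dump at that directory lets it read the local file
    # instead of streaming from the network.
    local_copy = f"{sra_dir}/{accession}"
    dump = ["fasterq-dump", local_copy, "--split-files",
            "--threads", str(threads), "--outdir", fastq_dir]
    return [prefetch, dump]


def run_with_retries(cmd, attempts=3, wait_seconds=30, runner=subprocess.run):
    """Run cmd, retrying on a non-zero exit; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        try:
            runner(cmd, check=True)
            return attempt
        except subprocess.CalledProcessError:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(wait_seconds)


def download_fastq(accession):
    """Fetch one run and convert it to FASTQ (hypothetical helper)."""
    prefetch, dump = build_commands(accession)
    run_with_retries(prefetch)          # network step, worth retrying
    subprocess.run(dump, check=True)    # local conversion, run once
```

Calling `download_fastq("SRR000001")` would run the two steps in order; the accession is only a placeholder.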

@claraptzsl you may want to look into this. Otherwise, you will have to restart the pipeline many times on a larger dataset.

KatharinaHoff commented 6 months ago

The timeouts were just too annoying, so prefetch is now added: https://github.com/KatharinaHoff/braker-snake/commit/716256ae878d01c8ff7c74ac940e2498270b68e0