snakemake / snakemake-storage-plugin-http

Snakemake storage plugin for downloading input files from HTTP(s).
MIT License

DAG creation extremely slow with storage function targeting zip files #25

Open FabianHofmann opened 3 months ago

FabianHofmann commented 3 months ago

The storage function can lead to very long DAG creation times when it is pointing to online zip files.

The following example shows it quite clearly.

Snakefile:

rule retrieve_eurostat_data:
    input:
        storage(
            "https://ec.europa.eu/eurostat/documents/38154/4956218/Balances-April2023.zip",
        ),

When running snakemake -n, DAG creation takes more than two minutes, whereas downloading the file directly in a browser takes about 20 seconds.

I don't know whether this is related to the fact that snakemake runs the download multiple times, even though it is in dry-run mode.
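
One way to narrow this down outside of Snakemake might be to time a metadata-only request (HEAD) against a full download (GET) of the same URL. The following is only a minimal sketch using the Python standard library. It assumes the server answers HEAD requests, `time_request` is a hypothetical helper that is not part of Snakemake or the plugin, and it makes no claim about which requests the plugin actually issues.

```python
import time
import urllib.request


def time_request(url: str, method: str = "HEAD", timeout: float = 300.0):
    """Send one HTTP request and return (elapsed_seconds, bytes_read).

    Hypothetical diagnostic helper, not part of Snakemake or the HTTP plugin.
    """
    request = urllib.request.Request(url, method=method)
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # For HEAD the body is empty; for GET this reads the whole payload.
        body = response.read()
    return time.perf_counter() - start, len(body)
```

Calling `time_request(url, "HEAD")` and `time_request(url, "GET")` with the zip URL from the Snakefile above gives two timings to compare with the DAG creation time. If the dry run takes several times longer than a single GET, that would be consistent with the file being fetched more than once.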

Let me know if there is a way I can help, or if you need more information or context.