cbg-ethz / V-pipe

V-pipe is a pipeline designed for analysing NGS data of short viral genomes
https://cbg-ethz.github.io/V-pipe/
Apache License 2.0

Snakemake dry run 404 #147

Closed: helmotw closed this issue 9 months ago

helmotw commented 1 year ago

Hello, when I run snakemake --use-conda --jobs 4 --printshellcmds --dry-run, it tries to access a dead link:

https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh

DrYak commented 1 year ago

Hello,

something is indeed wrong (the scripts are directly in the workflow/ directory, not the workflow/rules/ sub-directory). In order to understand how this problem happened:

Off the top of my head: this might be an error caused by using an outdated version of V-pipe. Either upgrade to the latest master branch, or use one of the releases. Of course, we'll see better once you provide the above information.

helmotw commented 1 year ago

Hi again, and thanks for the reply. I wanted to use it in an HPC Slurm environment, so I used the mamba installation to have an isolated conda environment. Here's the output:

VPIPE_BASEDIR = https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow
Using base configuration virus HIV
Caching https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/../resources/hiv/HXB2.fasta into /home/helmo/.cache/snakemake/snakemake/source-cache/runtime-cache/tmpl9gb7bcm/https/raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/../resources/hiv/HXB2.fasta

Failed to open source file https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh
HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh, attempt 1/3 failed - retrying in 3 seconds...
Failed to open source file https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh
HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh, attempt 2/3 failed - retrying in 6 seconds...
Failed to open source file https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh
HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh, attempt 3/3 failed - giving up!
WorkflowError in file https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/common.smk, line 82:
Failed to open source file https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh
HTTPError: 404 Client Error: Not Found for url: https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/scripts/functions.sh

File "/vpipe/workflow/Snakefile", line 19, in
File "https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/Snakefile", line 15, in
File "https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/common.smk", line 82, in cachepath
File "/home/helmo/.conda/envs/snakemake/lib/python3.11/site-packages/reretry/api.py", line 218, in retry_call
File "/home/helmo/.conda/envs/snakemake/lib/python3.11/site-packages/reretry/api.py", line 31, in __retry_internal
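
With retries, the same URL appears many times in output like this. As a small sketch for reading such a log, the following lists each distinct URL that returned a 404. The helper name `failed_urls` and the log file name `dry-run.log` are hypothetical, and the pattern assumes the "Not Found for url:" wording seen in the output above.

```shell
# Hypothetical helper (not part of V-pipe or snakemake):
# print each distinct URL that produced a 404 in a saved log file.
failed_urls() {
    # Match the "Not Found for url: <URL>" fragments, stopping at a space or comma,
    # then strip the fixed prefix and de-duplicate.
    grep -oE 'Not Found for url: [^ ,]+' "$1" \
        | sed 's/^Not Found for url: //' \
        | sort -u
}

# Usage, assuming the dry-run output was captured beforehand, e.g.:
#   snakemake --use-conda --jobs 4 --printshellcmds --dry-run 2>&1 | tee dry-run.log
#   failed_urls dry-run.log
```

For the log in this comment, that would reduce the repeated errors to the single functions.sh URL.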

DrYak commented 1 year ago

Looking at the output, it seems that you're trying to run V-pipe directly off the GitHub repo (sadly, this configuration isn't tested yet as part of our CI/CD).

What were the commands you used up to this point? (e.g.: did you use snakedeploy to run V-pipe? did you run "snakemake -s https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/Snakefile"? other commands?) Also, could you show me the steps you used to install everything?

I would like to reproduce your setup to understand why, in your case (running it from GitHub), this line searches for …/workflow/rules/scripts/functions.sh instead of …/workflow/scripts/functions.sh as it should.
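
The two URLs differ only in the directory the relative path gets resolved against. The sketch below is purely illustrative and is not a diagnosis of V-pipe's code: `resolve_against` is a hypothetical helper, and the relative path `scripts/functions.sh` is an assumption. It only shows that resolving against the location of rules/common.smk yields the failing URL, while resolving against the location of the Snakefile yields the expected one.

```shell
# Hypothetical helper: resolve a relative path against the directory of a URL
# by dropping the last path component of the URL.
resolve_against() {
    base_url="$1"
    rel="$2"
    printf '%s/%s\n' "${base_url%/*}" "$rel"
}

common_smk="https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/rules/common.smk"
snakefile="https://raw.githubusercontent.com/cbg-ethz/V-pipe/master/workflow/Snakefile"

# Relative to the included rules file: ends in workflow/rules/scripts/functions.sh (the 404)
resolve_against "$common_smk" "scripts/functions.sh"

# Relative to the top-level Snakefile: ends in workflow/scripts/functions.sh (the expected location)
resolve_against "$snakefile" "scripts/functions.sh"
```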

cecivale commented 11 months ago

Hi! I get the same error. I used snakemake and snakedeploy to deploy the workflow (versions snakemake 7.30.1 and snakedeploy 0.8.6). I ran:

snakedeploy deploy-workflow https://github.com/cbg-ethz/V-pipe --tag master .
# edit config/config.yaml and provide samples/ directory
snakemake --use-conda --jobs 4 --printshellcmds --dry-run

Thanks!

vera-rykalina commented 11 months ago

Hi there! Same issue on my end. I believe people do not use your recommended quick install script because it installs Miniconda, which is a crazy thing to do if you already have it on your system. For instance, we have Miniconda on our HPC and cannot just reinstall it there. It would be great if you could provide users with an installation option that does not reinstall Miniconda, along with .yml files listing the dependencies.

DrYak commented 11 months ago

Hi! I get the same error. I used snakemake and snakedeploy to deploy the workflow

Thanks, that will help me reproduce and fix this. Currently, snakedeploy isn't automatically tested as part of our CI/CD, so such mistakes slip through. I'll fix that next.

I believe people do not use your recommended quick install script because it installs Miniconda, which is a crazy thing to do if you already have it on your system.

Yup, this was mostly targeted at users who don't have experience with conda/mamba or snakemake. Of course, as a larger proportion of users now have those skills, we need to adapt.

It would be great if you could provide users with an installation option that does not reinstall Miniconda, along with .yml files listing the dependencies.

A temporary workaround until we fix snakedeploy is to clone the V-pipe git repository locally and run it from there, e.g.:

# make sure you have an environment with snakemake installed
mamba create -c conda-forge -c bioconda --name snakemake snakemake
conda activate snakemake

# fetch V-pipe from git
git clone --depth 1 --branch master https://github.com/cbg-ethz/V-pipe.git

# make a working directory
mkdir working/
cd working/
# …and populate config/config.yaml, config/samples.tsv, the samples/ input directory…

# run V-pipe 
snakemake -s "../V-pipe/workflow/Snakefile" --use-conda --jobs 4 --printshellcmds --dry-run
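
Before launching the dry run from a hand-made working directory, it can help to confirm that the inputs named in the comments above are in place. This is a minimal sketch under stated assumptions: `check_workdir` is a hypothetical helper, not part of V-pipe, and the list it checks simply mirrors the comment above (config/config.yaml, config/samples.tsv, samples/). Whether every one of these is strictly required is not confirmed in this thread.

```shell
# Hypothetical pre-flight helper (not part of V-pipe).
# Prints every expected input that is missing; returns non-zero if any is missing.
check_workdir() {
    local dir="${1:-.}" missing=0 f
    for f in config/config.yaml config/samples.tsv; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing file: $f"
            missing=1
        fi
    done
    if [ ! -d "$dir/samples" ]; then
        echo "missing directory: samples/"
        missing=1
    fi
    return "$missing"
}

# Usage: only attempt the dry run when the layout looks complete, e.g.:
#   check_workdir . && snakemake -s "../V-pipe/workflow/Snakefile" --use-conda --jobs 4 --printshellcmds --dry-run
```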

Alternatively, create a wrapper script that simplifies calling V-pipe, as the installer and the tutorials do:

# if you take care of activating conda and the correct environment yourself:
../V-pipe/init_project.sh -n 

# …and populate ./config.yaml, ./samples.tsv, the samples/ input directory…
# NOTE: init_project.sh puts the example configuration files inside the working directory instead of a config/ subdirectory. It's up to you to move them into a config/ directory in the style of snakedeploy.

# run V-pipe 
./vpipe  --use-conda --jobs 4 --printshellcmds --dry-run
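
The move into a snakedeploy-style config/ directory mentioned in the note above could be sketched as follows. `move_into_config` is a hypothetical helper; the file names config.yaml and samples.tsv are taken from the comments above. Whether the ./vpipe wrapper picks up the files from config/ afterwards is not confirmed in this thread, so treat this as an assumption to verify with a dry run.

```shell
# Hypothetical helper (not part of V-pipe):
# move the files created by init_project.sh into a config/ subdirectory.
move_into_config() {
    local dir="${1:-.}" f
    mkdir -p "$dir/config"
    for f in config.yaml samples.tsv; do
        # Skip files that are absent so the helper can be re-run safely.
        if [ -f "$dir/$f" ]; then
            mv "$dir/$f" "$dir/config/$f"
        fi
    done
}

# Usage, from inside the working directory:
#   move_into_config .
```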

The "rubicon" branch also has a feature where you manually specify your paths and the scriptlet then takes care of activating the environment:

../V-pipe/init_project.sh -c /path/of/conda/ -e snakemake

# …and populate ./config.yaml, ./samples.tsv, the samples/ input directory…

# run V-pipe 
./vpipe --jobs 4 --printshellcmds --dry-run

DrYak commented 11 months ago

Update: I have found a fix for this 404, and added proper CI tests for snakedeploy.

However, a bug in the bioconda package of snakemake-minimal has been discovered, and it needs to be fixed before I can fully test the fix and merge it into other branches.

DrYak commented 11 months ago

Sorry for the long delay: there were multiple upstream troubles (the snakemake bioconda package needed dependency adjustments; Anaconda's CDN troubles prevented the new package builds from appearing in bioconda). Now everything is back to normal with snakemake, and I managed to run through all the tests.

The branch rubicon (our staging branch) now contains the necessary fix to remove these 404 errors and is fully tested.

To try it, use:

snakedeploy deploy-workflow https://github.com/cbg-ethz/V-pipe --branch rubicon .
# edit config/config.yaml and provide samples/ directory
snakemake --use-conda --jobs 4 --printshellcmds --dry-run

(i.e.: use branch rubicon in lieu of master).

vera-rykalina commented 11 months ago

Thanks a lot for your hard work.