Open robertsap opened 2 months ago
Hi (and sorry for the slow answer, I was on holiday).
I notice that you're giving absolute paths in your configuration file (beginning with a slash /):
input:
    datadir: /samples/
# …
output:
    datadir: /results/
As a result, V-pipe is trying to read and write files in the root directory of your workstation:
WorkflowError:
MissingInputException: Missing input files for rule sam2bam:
    output: /results/22-L147/22-L147/alignments/REF_aln.bam, /results/22-L147/22-L147/alignments/REF_aln.bam.bai
    wildcards: file=/results/22-L147/22-L147/alignments/REF_aln
    affected files:
        /results/22-L147/22-L147/alignments/REF_aln.sam
(see the /results/22-L147/22-L147/… directories above).
I presume you should be using paths relative to your current working directory, like the tutorials do, so without a leading /, e.g.:
input:
    datadir: samples/
    #        ^- no '/' here
# …
output:
    datadir: results/
    #        ^- no '/' here
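To illustrate the difference, here is a minimal Python sketch of how a leading slash changes where a configured directory ends up. The working directory `/home/user/project` and the helper `effective_path` are hypothetical, chosen only for the example; this is not V-pipe's own code, just the standard path-joining behaviour.

```python
from pathlib import PurePosixPath


def effective_path(configured: str, working_dir: str) -> PurePosixPath:
    """Return the location a configured path refers to when the
    workflow is started from working_dir."""
    # Joining with an absolute right-hand side discards the left-hand side,
    # which mirrors how a leading '/' is interpreted by the operating system.
    return PurePosixPath(working_dir) / configured


work = "/home/user/project"  # hypothetical working directory

print(effective_path("/samples/", work))  # /samples
print(effective_path("samples/", work))   # /home/user/project/samples
```

With the leading slash the working directory is ignored entirely, which is why the error messages mention /results/… at the root of the filesystem.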
Another problem is that V-pipe currently doesn't provide any resources for Little Cherry Virus (see here for a list of available resources for viruses).
So this part is not going to work:
input:
    # …
    reference: "{VPIPE_BASEDIR}/resources/LChV-2/reference.fasta"
    genes_gff: "{VPIPE_BASEDIR}/../resources/LChV-2/genomic.gff"
You will need to provide your own and change the configuration file accordingly, for example:
# create a resource directory in the current working directory:
mkdir -p resources/LChV-2/
# copy the files in there
cp …somewhere_where_you_have_the_files…/LChV-2/reference.fasta resources/LChV-2/
cp …somewhere_where_you_have_the_files…/LChV-2/genomic.gff resources/LChV-2/
and then edit the configuration file to point to this new resource directory you created:
input:
# …
reference: "resources/LChV-2/reference.fasta"
genes_gff: "resources/LChV-2/genomic.gff"
# ^- no leading '/': search in the current working directory.
(Of course, you could also install the files into your local copy of V-pipe, in which case you would have to fix a missing .. because {VPIPE_BASEDIR} refers to the V-pipe/workflow/ directory, due to a limitation of how Snakemake works):
input:
# …
# '..' missing here --------vv
reference: "{VPIPE_BASEDIR}/../resources/LChV-2/reference.fasta"
genes_gff: "{VPIPE_BASEDIR}/../resources/LChV-2/genomic.gff"
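As a rough illustration of why the `..` matters, the sketch below expands the placeholder by plain string substitution and then normalizes the result. The install location `/home/user/V-pipe/workflow` and the helper `expand_basedir` are hypothetical, and V-pipe's actual substitution mechanism may differ; only the path arithmetic is the point here.

```python
import posixpath


def expand_basedir(template: str, vpipe_basedir: str) -> str:
    """Substitute the {VPIPE_BASEDIR} placeholder and collapse any '..' parts."""
    expanded = template.replace("{VPIPE_BASEDIR}", vpipe_basedir)
    return posixpath.normpath(expanded)


# Hypothetical install location; {VPIPE_BASEDIR} points at the workflow/ subdirectory.
basedir = "/home/user/V-pipe/workflow"

# Without '..' the path stays inside workflow/:
print(expand_basedir("{VPIPE_BASEDIR}/resources/LChV-2/reference.fasta", basedir))
# /home/user/V-pipe/workflow/resources/LChV-2/reference.fasta

# With '..' it reaches the resources/ directory next to workflow/:
print(expand_basedir("{VPIPE_BASEDIR}/../resources/LChV-2/reference.fasta", basedir))
# /home/user/V-pipe/resources/LChV-2/reference.fasta
```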
(NOTE: if you decide to modify V-pipe to add support for LChV-2, we would be interested in your pull request)
Thanks so much for your response! I hope you had a pleasant holiday :)
I made the necessary modifications to the directory paths in my config file; however, I am still getting the same error message.
config file:
general:
    virus_base_config: ""

input:
    datadir: samples/
    samples_file: samples.tsv
    reference: "{VPIPE_BASEDIR}/../resources/LChV-2/reference.fasta"
    genes_gff: "{VPIPE_BASEDIR}/../resources/LChV-2/genomic.gff"
    read_length: 150

output:
    datadir: results/
    trim_primers: false
    snv: true
    local: true
    global: true
    visualization: true
    diversity: true
    QA: true
    upload: false
    dehumanized_raw_reads: false
error message:
WorkflowError:
MissingInputException: Missing input files for rule sam2bam:
    output: results/22-L147/22-L147/alignments/REF_aln.bam, results/22-L147/22-L147/alignments/REF_aln.bam.bai
    wildcards: file=results/22-L147/22-L147/alignments/REF_aln
    affected files:
        results/22-L147/22-L147/alignments/REF_aln.sam
WorkflowError:
WorkflowError:
MissingInputException: Missing input files for rule gunzip:
    output: results/22-L147/22-L147/extracted_data/R1.fastq
    wildcards: file=results/22-L147/22-L147/extracted_data/R1, ext=fastq
    affected files:
        results/22-L147/22-L147/extracted_data/R1.fastq.gz
MissingInputException: Missing input files for rule gunzip:
    output: results/22-L147/22-L147/extracted_data/R1.fastq
    wildcards: file=results/22-L147/22-L147/extracted_data/R1, ext=fastq
    affected files:
        results/22-L147/22-L147/extracted_data/R1.fastq.gz
CyclicGraphException: Cyclic dependency on rule convert_to_ref.
As for the reference, gff file locations etc. I did have them in the 'V-pipe/workflow' directory, so the pathing should have worked. However I did as you recommended and moved them into the 'resources' directory, and changed my config file to reflect the pathing (see above).
Thanks in advance for your patience. I'm a novice on the command line, so there may be something basic that I'm missing.
Describe the bug
I am analyzing data from plant (cherry) samples, hoping to determine viral quasispecies of Little Cherry Virus. I set up my V-pipe workflow based on the SARS-CoV-2 tutorial. However, when I attempt to run V-pipe, either through a dry run or fully, I get a "workflow error".
My questions: I'm curious why there is a "missing input file" (for sam2bam and gunzip). I was not instructed to give any other files except the fastq. Is the workflow error a bug, or something I am missing in my input/config files?
To Reproduce
input:
    datadir: /samples/
    samples_file: samples.tsv
    reference: "{VPIPE_BASEDIR}/resources/LChV-2/reference.fasta"
    genes_gff: "{VPIPE_BASEDIR}/../resources/LChV-2/genomic.gff"
    read_length: 150

output:
    datadir: /results/
    trim_primers: false
    snv: true
    local: true
    global: true
    visualization: true
    diversity: true
    QA: true
    upload: false
    dehumanized_raw_reads: false
samples
├── 22-L147
│   └── 230309
│       └── raw_data
│           ├── 22-L147_S3_R1.fastq
│           └── 22-L147_S3_R2.fastq
└── 22-L801
    └── 230309
        └── raw_data
            ├── 22-L801_S14_R1.fastq
            └── 22-L801_S14_R2.fastq

6 directories, 4 files
vi samples.tsv
22-L147	22-L147
22-L801	22-L801
./vpipe --dryrun
Building DAG of jobs...
WorkflowError:
MissingInputException: Missing input files for rule sam2bam:
    output: /results/22-L147/22-L147/alignments/REF_aln.bam, /results/22-L147/22-L147/alignments/REF_aln.bam.bai
    wildcards: file=/results/22-L147/22-L147/alignments/REF_aln
    affected files:
        /results/22-L147/22-L147/alignments/REF_aln.sam
WorkflowError:
WorkflowError:
MissingInputException: Missing input files for rule gunzip:
    output: /results/22-L147/22-L147/extracted_data/R1.fastq
    wildcards: file=/results/22-L147/22-L147/extracted_data/R1, ext=fastq
    affected files:
        /results/22-L147/22-L147/extracted_data/R1.fastq.gz
MissingInputException: Missing input files for rule gunzip:
    output: /results/22-L147/22-L147/extracted_data/R1.fastq
    wildcards: file=/results/22-L147/22-L147/extracted_data/R1, ext=fastq
    affected files:
        /results/22-L147/22-L147/extracted_data/R1.fastq.gz
CyclicGraphException: Cyclic dependency on rule convert_to_ref.
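When chasing "missing input" errors like these, it can help to compare the rows of samples.tsv with the directory tree before running V-pipe. The sketch below is only a rough aid, not part of V-pipe. It assumes the layout used in the tutorials, where the two columns of samples.tsv name the two directory levels, i.e. `<datadir>/<column 1>/<column 2>/raw_data`. The function name `missing_sample_dirs` is hypothetical.

```python
import csv
from pathlib import Path


def missing_sample_dirs(samples_tsv, datadir):
    """Return the (sample, date) rows of samples_tsv for which
    <datadir>/<sample>/<date>/raw_data is not an existing directory.

    Assumption: the first two tab-separated columns name the two
    directory levels below datadir, as in the V-pipe tutorials.
    """
    missing = []
    with open(samples_tsv, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2 or row[0].startswith("#"):
                continue  # skip blank, incomplete, or comment lines
            sample, date = row[0].strip(), row[1].strip()
            if not (Path(datadir) / sample / date / "raw_data").is_dir():
                missing.append((sample, date))
    return missing
```

Calling `missing_sample_dirs("samples.tsv", "samples")` from the working directory returns the rows that have no matching raw_data directory. With the tree and samples.tsv exactly as posted above, both rows would be reported, because the second column repeats the sample name while the second directory level is 230309. Whether that mismatch is what triggers the errors is not confirmed in this thread.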