sunbeam-labs / sunbeam

A robust, extensible metagenomics pipeline
http://sunbeam.readthedocs.io

Missing input files for rule preprocess_report #149

Closed kevinmcc21 closed 6 years ago

kevinmcc21 commented 6 years ago

I tried running sunbeam after setting up contaminant filtering and taxonomic classification. I did not set up contig annotation or reference mapping. I got the following error message:

(sunbeam) kevin@microb191:~/projects/micu[16:49] sunbeam run --configfile micu/sunbeam_config.yml
Running: snakemake --snakefile /home/kevin/dev/sunbeam/Snakefile --configfile micu/sunbeam_config.yml
Collecting host/contaminant genomes...

WARNING: No files detected in host genomes folder (/home/kevin/projects/micu/micu). If this is not intentional, make sure all files end in .fasta and the folder is specified correctly.

done.
Collecting target genomes... done.
MissingInputException in line 15 of /home/kevin/dev/sunbeam/rules/reports/reports.rules:
Missing input files for rule preprocess_report:
/home/kevin/projects/micu/micu/sunbeam_output/qc/log/trimmomatic/CS_007_S202_L002.out
/home/kevin/projects/micu/micu/sunbeam_output/qc/log/trimmomatic/SS_023_S35_L003.out
[error for each expected file]

My sunbeam_config.yml file (minus header):

all:
  root: "/home/kevin/projects/micu/micu"
  output_fp: "sunbeam_output"
  samplelist_fp: "samples.csv"
  paired_end: true
  version: "1.2.1+dev16.g7a55388"

# Quality control
qc:
  suffix: qc
  # Trimmomatic
  threads: 4
  java_heapsize: 512M
  leading: 3
  trailing: 3
  slidingwindow: [4,15]
  minlen: 36
  adapter_fp: "/home/kevin/anaconda3/envs/sunbeam/share/trimmomatic/adapters/NexteraPE-PE.fa"
  # Cutadapt
  fwd_adapters: ['GTTTCCCAGTCACGATC', 'GTTTCCCAGTCACGATCNNNNNNNNNGTTTCCCAGTCACGATC']
  rev_adapters: ['GTTTCCCAGTCACGATC', 'GTTTCCCAGTCACGATCNNNNNNNNNGTTTCCCAGTCACGATC']
  # Komplexity
  kz_threshold: 0.55
  # Decontam.py
  pct_id: 0.5
  frac: 0.6
  host_fp: "/data/internal/common/genomes/hg38/hg38.fa"

# Taxonomic classifications
classify:
  suffix: classify
  threads: 4
  kraken_db_fp: "/home/kevin/projects/micu/data/minikraken_20171101_8GB_dustmasked/"

# Contig assembly
assembly:
  suffix: assembly
  min_length: 300
  threads: 4

# Contig annotation
annotation:
  suffix: annotation
  min_contig_len: 500
  circular_kmin: 10
  circular_kmax: 1000
  circular_min_len: 3500

blast:
  threads: 4

blastdbs:
  root_fp: ""

mapping:
  suffix: mapping
  genomes_fp: ""
  samtools_opts: ""
  threads: 4

kevinmcc21 commented 6 years ago

Update: Louis pointed out that the host_fp parameter has two requirements: 1) It must point to a directory, not to the genome file itself. 2) The genome files in that directory must end in .fasta (not .fa or any other extension). After making both changes, I no longer get the error. Huzzah!
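
For anyone hitting the same error, here is a minimal sketch of a pre-flight check for the two requirements above. It is not part of Sunbeam: `check_host_fp` is a hypothetical helper, and it assumes that only files ending in .fasta directly inside the host_fp folder are picked up, which is what the warning message suggests.

```python
from pathlib import Path


def check_host_fp(host_fp):
    """Return a list of problems with a host genome path (empty if none found).

    Hypothetical helper, not a Sunbeam function. Assumes host genomes must be
    files ending in .fasta located directly inside the host_fp directory.
    """
    path = Path(host_fp)
    problems = []

    # Requirement 1: host_fp must be a directory, not the genome file itself.
    if not path.is_dir():
        problems.append(f"{path} is not a directory")
        return problems

    # Requirement 2: the directory must contain at least one *.fasta file.
    if not sorted(path.glob("*.fasta")):
        problems.append(f"no files ending in .fasta in {path}")

    return problems


if __name__ == "__main__":
    # The value from the config in this issue; it points at a file, not a folder.
    for problem in check_host_fp("/data/internal/common/genomes/hg38/hg38.fa"):
        print("host_fp problem:", problem)
```

Running this against the host_fp value before `sunbeam run` would flag either condition early, rather than surfacing later as missing input files for an unrelated rule.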