Closed LaraFuhrmann closed 1 month ago
Tested VILOCA on the hiv-tutorial data CAP188 4
using the following config files to test uniform-tiling strategy and amplicon tiling strategy.
general:
virus_base_config: ""
aligner: bwa
snv_caller: viloca
input:
reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
# NOTE: this input datadir isn't standard
datadir: resources/samples/
read_length: 301
samples_file: samples.tsv
paired: true
viloca:
consensus: false
output:
snv: true
local: false
global: false
visualization: false
QA: false
general:
virus_base_config: ""
aligner: bwa
snv_caller: viloca
input:
reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
# NOTE: this input datadir isn't standard
datadir: resources/samples/
read_length: 301
samples_file: samples.tsv
paired: true
viloca:
consensus: false
insert_bedfile: resources/samples/insert.bed
output:
snv: true
local: false
global: false
visualization: false
QA: false
Now we are adding the smallgenomeutilities scritp to merge the paried-end reads into one read. This is since viloca can then be applied to the merged reads with longer local regions / windows.
An example config file could look like this:
general:
virus_base_config: ""
aligner: bwa
snv_caller: viloca
input:
reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
# NOTE: this input datadir isn't standard
datadir: resources/samples/
read_length: 301
samples_file: samples.tsv
paired: true
viloca:
merge_paired_end_reads: true
consensus: false
output:
snv: true
local: false
global: false
visualization: false
QA: false
We are also adding the paired_end_read_merger
from smallgenomeutilities to the workflow.
The workflow as tested with a ww SARS-CoV-2 sample using the following config file: (note we cheated by providing already an aligned bam file to V-pipe here, but the paired-end-read-merger and viloca were executed.)
general:
virus_base_config: ""
aligner: bwa
snv_caller: viloca
input:
reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
# NOTE: this input datadir isn't standard
datadir: resources/samples/
read_length: 301
samples_file: samples.tsv
paired: true
applications:
paired_end_read_merger: /Users/lfuhrmann/Downloads/smallgenomeutilities-dev/scripts/paired_end_read_merger
viloca:
merge_paired_end_reads: true
consensus: false
output:
snv: true
local: false
global: false
visualization: false
QA: false
There is a bug in the paired_end_read_merger
as VILOCA is reporting ValueError: Soft clipping only possible on the edges of a read.
For completeness, VILOCA also runs when we don't choose to the read merging.
general:
virus_base_config: ""
aligner: bwa
snv_caller: viloca
input:
reference: "{VPIPE_BASEDIR}/../resources/hiv/HXB2.fasta"
metainfo_file: "{VPIPE_BASEDIR}/../resources/hiv/metainfo.yaml"
gff_directory: "{VPIPE_BASEDIR}/../resources/hiv/gffs/"
# NOTE: this input datadir isn't standard
datadir: resources/samples/
read_length: 301
samples_file: samples.tsv
paired: true
applications:
paired_end_read_merger: /Users/lfuhrmann/Downloads/smallgenomeutilities-dev/scripts/paired_end_read_merger
viloca:
merge_paired_end_reads: false
consensus: false
output:
snv: true
local: false
global: false
visualization: false
QA: false
🦙 MegaLinter status: ✅ SUCCESS
See detailed report in MegaLinter reports
_MegaLinter is graciously provided by
_