I am trying to call variants using the pggb singularity pipeline starting from multiple haplotype-resolved genomes. I used PanSN-spec naming and it all worked well until I had to choose the reference for graph decomposition.
I would like to deconstruct the graph using ind1#1 as a reference, since it's the most complete genome and I used it for linear short-read mapping.
However, when choosing a reference in pggb singularity (pggb 87510bc):
If I specify -V 'ind1#', -V 'ind1#1' or -V 'ind1#1#', the prefix is not recognized.
If I specify -V 'ind1', vg deconstruct works but calls the variants against both haplotypes of ind1 and also against ind10, because both sample names start with "ind1".
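To illustrate the ambiguity, here is a minimal sketch of plain literal prefix matching over a few path names. The path names and the `match_prefix` helper are hypothetical, made up for this example, and the sketch only mimics simple prefix matching; it is not how vg is implemented internally.

```shell
# Hypothetical path names following PanSN (sample#haplotype#contig).
paths='ind1#1#chr1
ind1#2#chr1
ind10#1#chr1'

# Print every path whose name starts with the given literal prefix.
# (Illustration only; this is not vg's own matching code.)
match_prefix() {
  printf '%s\n' "$paths" | awk -v p="$1" 'index($0, p) == 1'
}

match_prefix 'ind1'      # matches all three paths, including ind10#1#chr1
match_prefix 'ind1#1#'   # matches only ind1#1#chr1
```

With a bare `ind1` prefix there is no way to exclude ind10, which is why a prefix that includes the `#` delimiter would be the natural choice if it were accepted.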
The command used in the pggb runs is:
vg deconstruct -P ind1#2# -H # -e -a -t 32 chr1.smooth.final.gfa
and this is an example error:
[vg::deconstruct] making VCF with reference=ind1#1# and delim=# xxxxxxxxxxxxx ind1#1# ------------ 0
Error [vg deconstruct]: No specified reference path or prefix found in graph
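Since the error says the prefix was not found, one thing that may help is checking which path names are actually stored in the GFA. The sketch below assumes paths are stored either as P-lines (name in column 2) or as GFA 1.1 W-lines (sample, haplotype and contig in columns 2 to 4); `list_gfa_paths` is a hypothetical helper name, not part of pggb or vg.

```shell
# List path names from a GFA file.
# P-lines carry the full path name in column 2;
# W-lines (GFA 1.1) carry sample, haplotype and contig in columns 2-4,
# which are joined here with '#' to give a PanSN-style name.
list_gfa_paths() {
  awk -F'\t' '
    $1 == "P" { print $2 }
    $1 == "W" { print $2 "#" $3 "#" $4 }
  ' "$1"
}

# Example usage (file name taken from the command above):
# list_gfa_paths chr1.smooth.final.gfa | sort -u
```

Comparing this list with the value passed to -P should show whether the name with the trailing delimiter exists in the graph as written.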
I can successfully run vg deconstruct using ind1#1 as a ref if I do it after pggb graph construction, removing -H #:
vg deconstruct -P ind1#1 -e -a -t 32 ch1.smooth.final.gfa
Do you have any recommendation on how to specify, directly in pggb, the haplotype/sample I want to call the variants against?
Many thanks!
Simona