Thank you and your colleagues for the very nice svimmer and graphtyper software.
I would like to use svimmer and graphtyper for forced genotyping of the UNION of the SVs discovered by Manta (many WGS samples) and SVIM-ASM (a few assemblies) across many WGS samples.
The versions that I am using are svimmer/20211209 and graphtyper/2.7.3.
When I try to get the (merged) UNION of SVs via svimmer, I get this error:
Traceback (most recent call last):
  File "/tools/eb/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/tools/eb/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/tools/eb/software/svimmer/20211209-GCC-10.2.0/svimmer", line 82, in append_svs_from_vcf
    svs.append(SV(record, check_type=not args.ignore_types, join_mode=args.join_mode, output_ids=args.ids))
  File "/tools/eb/software/svimmer/20211209-GCC-10.2.0/sv.py", line 75, in __init__
    assert False
AssertionError
I can use the svimmer argument `--ignore-types` to get svimmer to work.
But then graphtyper complains about an unknown SV type, and I guess it also drops the SVs of unknown type?
<warning> constructor.cpp:106 Unknown SV type DUP:TANDEM
<warning> constructor.cpp:106 Unknown SV type DUP:TANDEM
@hannespetur, for reference, SVIM-ASM is available at https://github.com/eldariont/svim-asm
The assertion in the traceback above is raised here: https://github.com/DecodeGenetics/svimmer/blob/f2d78b2f0e45100f507343a05cf7a65008b2ed9b/sv.py#L75
It is caused by svimmer not recognizing the `DUP:TANDEM` and `DUP:INT` types that SVIM-ASM outputs: https://github.com/DecodeGenetics/svimmer/blob/f2d78b2f0e45100f507343a05cf7a65008b2ed9b/sv.py#L41
Would it be possible to add a mapping for `DUP:TANDEM` and `DUP:INT` in the main branch of the svimmer code here? https://github.com/DecodeGenetics/svimmer/blob/f2d78b2f0e45100f507343a05cf7a65008b2ed9b/sv.py#L41
Then the combination of SVIM-ASM and svimmer/graphtyper would work for me and for others with the same combination of tools.
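In the meantime, one workaround I could imagine is to normalize the SVIM-ASM VCF before passing it to svimmer. The sketch below is only an illustration and is not part of svimmer or graphtyper: the names `SUBTYPE_TO_TYPE`, `normalize_record` and `normalize_stream` are hypothetical. It assumes that the subtype appears in the `SVTYPE=` INFO key and/or as a symbolic ALT allele, and that collapsing `DUP:TANDEM` and `DUP:INT` to plain `DUP` is acceptable, which loses the tandem/interspersed distinction.

```python
import re

# Assumption: collapsing these subtypes to plain DUP is acceptable downstream.
SUBTYPE_TO_TYPE = {"DUP:TANDEM": "DUP", "DUP:INT": "DUP"}

# Matches the SVTYPE key wherever it sits in the semicolon-separated INFO column.
_SVTYPE_RE = re.compile(r"(^|;)SVTYPE=([^;]+)")


def _map_alt(allele):
    """Map a symbolic ALT allele such as <DUP:TANDEM> to <DUP>; leave others as-is."""
    if allele.startswith("<") and allele.endswith(">"):
        inner = allele[1:-1]
        if inner in SUBTYPE_TO_TYPE:
            return "<" + SUBTYPE_TO_TYPE[inner] + ">"
    return allele


def normalize_record(line):
    """Rewrite the ALT and INFO/SVTYPE values of a single VCF line."""
    if line.startswith("#"):
        return line  # header lines are passed through unchanged
    body = line.rstrip("\n")
    ending = line[len(body):]  # preserve the original line ending, if any
    fields = body.split("\t")
    if len(fields) < 8:
        return line  # not a complete VCF record; leave untouched
    fields[4] = ",".join(_map_alt(a) for a in fields[4].split(","))
    fields[7] = _SVTYPE_RE.sub(
        lambda m: m.group(1) + "SVTYPE=" + SUBTYPE_TO_TYPE.get(m.group(2), m.group(2)),
        fields[7],
    )
    return "\t".join(fields) + ending


def normalize_stream(lines):
    """Yield normalized lines from any iterable of VCF lines."""
    for line in lines:
        yield normalize_record(line)
```

This could be used as a filter, for example with `sys.stdout.writelines(normalize_stream(sys.stdin))` on an uncompressed VCF. Note that the header is left untouched, so any `##ALT` definitions still describe the original subtypes.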
I also don't understand why SVs of type `DUP`, `CNV` and `INV` are mapped to type `INS` here: https://github.com/DecodeGenetics/svimmer/blob/f2d78b2f0e45100f507343a05cf7a65008b2ed9b/sv.py#L45
That does not make sense to me. `INS` is a novel sequence, whereas `DUP`, `CNV` and `INV` are sequences already found on the reference genome, so don't they need to be genotyped differently in graphtyper?
What I also find strange is that both svimmer and graphtyper do output SVs of type `DUP`. I can't square that with the mapping of `DUP`, `CNV` and `INV` to `INS`. Or is the SV type maybe re-calculated somewhere else in svimmer/graphtyper?
Thank you for your thoughts and help on this.