Closed sarahdada closed 3 years ago
@sarahdada could you include the VCF header and the variant line in question? Which file did the error occur on? (There should be a log message above the error naming the file that was being converted.)
To get the relevant lines you can use grep
grep 'chr19:27302623' diploidSV.de_novo.vcf
and the header
grep '^##' diploidSV.de_novo.vcf
Make sure to remove any identifiers before posting (library names, patient names, etc)
The grep for chr19 doesn't come up with anything, so maybe the SV caller format is off.
##fileformat=VCFv4.1
##fileDate=20210415
##source=GenerateSVCandidates 1.4.0
##reference=file:///projects/file/hg38_no_alt.fa
##contig=<ID=chr1,length=248956422>
##contig=<ID=chr2,length=242193529>
##contig=<ID=chr3,length=198295559>
##contig=<ID=chr4,length=190214555>
##contig=<ID=chr5,length=181538259>
##contig=<ID=chr6,length=170805979>
##contig=<ID=chr7,length=159345973>
##contig=<ID=chr8,length=145138636>
##contig=<ID=chr9,length=138394717>
##contig=<ID=chr10,length=133797422>
##contig=<ID=chr11,length=135086622>
##contig=<ID=chr12,length=133275309>
##contig=<ID=chr13,length=114364328>
##contig=<ID=chr14,length=107043718>
##contig=<ID=chr15,length=101991189>
##contig=<ID=chr16,length=90338345>
##contig=<ID=chr17,length=83257441>
##contig=<ID=chr18,length=80373285>
##contig=<ID=chr19,length=58617616>
##contig=<ID=chr20,length=64444167>
##contig=<ID=chr21,length=46709983>
##contig=<ID=chr22,length=50818468>
##contig=<ID=chrX,length=156040895>
##contig=<ID=chrY,length=57227415>
##contig=<ID=chrM,length=16569>
##contig=<ID=chr1_KI270706v1_random,length=175055>
##contig=<ID=chr1_KI270707v1_random,length=32032>
##contig=<ID=chr1_KI270708v1_random,length=127682>
##contig=<ID=chr1_KI270709v1_random,length=66860>
##contig=<ID=chr1_KI270710v1_random,length=40176>
##contig=<ID=chr1_KI270711v1_random,length=42210>
##contig=<ID=chr1_KI270712v1_random,length=176043>
##contig=<ID=chr1_KI270713v1_random,length=40745>
##contig=<ID=chr1_KI270714v1_random,length=41717>
##contig=<ID=chr2_KI270715v1_random,length=161471>
##contig=<ID=chr2_KI270716v1_random,length=153799>
##contig=<ID=chr3_GL000221v1_random,length=155397>
##contig=<ID=chr4_GL000008v2_random,length=209709>
##contig=<ID=chr5_GL000208v1_random,length=92689>
##contig=<ID=chr9_KI270717v1_random,length=40062>
##contig=<ID=chr9_KI270718v1_random,length=38054>
##contig=<ID=chr9_KI270719v1_random,length=176845>
##contig=<ID=chr9_KI270720v1_random,length=39050>
##contig=<ID=chr11_KI270721v1_random,length=100316>
##contig=<ID=chr14_GL000009v2_random,length=201709>
##contig=<ID=chr14_GL000225v1_random,length=211173>
##contig=<ID=chr14_KI270722v1_random,length=194050>
##contig=<ID=chr14_GL000194v1_random,length=191469>
##contig=<ID=chr14_KI270723v1_random,length=38115>
##contig=<ID=chr14_KI270724v1_random,length=39555>
##contig=<ID=chr14_KI270725v1_random,length=172810>
##contig=<ID=chr14_KI270726v1_random,length=43739>
##contig=<ID=chr15_KI270727v1_random,length=448248>
##contig=<ID=chr16_KI270728v1_random,length=1872759>
##contig=<ID=chr17_GL000205v2_random,length=185591>
##contig=<ID=chr17_KI270729v1_random,length=280839>
##contig=<ID=chr17_KI270730v1_random,length=112551>
##contig=<ID=chr22_KI270731v1_random,length=150754>
##contig=<ID=chr22_KI270732v1_random,length=41543>
##contig=<ID=chr22_KI270733v1_random,length=179772>
##contig=<ID=chr22_KI270734v1_random,length=165050>
##contig=<ID=chr22_KI270735v1_random,length=42811>
##contig=<ID=chr22_KI270736v1_random,length=181920>
##contig=<ID=chr22_KI270737v1_random,length=103838>
##contig=<ID=chr22_KI270738v1_random,length=99375>
##contig=<ID=chr22_KI270739v1_random,length=73985>
##contig=<ID=chrY_KI270740v1_random,length=37240>
##contig=<ID=chrUn_KI270302v1,length=2274>
##contig=<ID=chrUn_KI270304v1,length=2165>
##contig=<ID=chrUn_KI270303v1,length=1942>
##contig=<ID=chrUn_KI270305v1,length=1472>
##contig=<ID=chrUn_KI270322v1,length=21476>
##contig=<ID=chrUn_KI270320v1,length=4416>
##contig=<ID=chrUn_KI270310v1,length=1201>
##contig=<ID=chrUn_KI270316v1,length=1444>
##contig=<ID=chrUn_KI270315v1,length=2276>
##contig=<ID=chrUn_KI270312v1,length=998>
##contig=<ID=chrUn_KI270311v1,length=12399>
##contig=<ID=chrUn_KI270317v1,length=37690>
##contig=<ID=chrUn_KI270412v1,length=1179>
##contig=<ID=chrUn_KI270411v1,length=2646>
##contig=<ID=chrUn_KI270414v1,length=2489>
##contig=<ID=chrUn_KI270419v1,length=1029>
##contig=<ID=chrUn_KI270418v1,length=2145>
##contig=<ID=chrUn_KI270420v1,length=2321>
##contig=<ID=chrUn_KI270424v1,length=2140>
##contig=<ID=chrUn_KI270417v1,length=2043>
##contig=<ID=chrUn_KI270422v1,length=1445>
##contig=<ID=chrUn_KI270423v1,length=981>
##contig=<ID=chrUn_KI270425v1,length=1884>
##contig=<ID=chrUn_KI270429v1,length=1361>
##contig=<ID=chrUn_KI270442v1,length=392061>
##contig=<ID=chrUn_KI270466v1,length=1233>
##contig=<ID=chrUn_KI270465v1,length=1774>
##contig=<ID=chrUn_KI270467v1,length=3920>
##contig=<ID=chrUn_KI270435v1,length=92983>
##contig=<ID=chrUn_KI270438v1,length=112505>
##contig=<ID=chrUn_KI270468v1,length=4055>
##contig=<ID=chrUn_KI270510v1,length=2415>
##contig=<ID=chrUn_KI270509v1,length=2318>
##contig=<ID=chrUn_KI270518v1,length=2186>
##contig=<ID=chrUn_KI270508v1,length=1951>
##contig=<ID=chrUn_KI270516v1,length=1300>
##contig=<ID=chrUn_KI270512v1,length=22689>
##contig=<ID=chrUn_KI270519v1,length=138126>
##contig=<ID=chrUn_KI270522v1,length=5674>
##contig=<ID=chrUn_KI270511v1,length=8127>
##contig=<ID=chrUn_KI270515v1,length=6361>
##contig=<ID=chrUn_KI270507v1,length=5353>
##contig=<ID=chrUn_KI270517v1,length=3253>
##contig=<ID=chrUn_KI270529v1,length=1899>
##contig=<ID=chrUn_KI270528v1,length=2983>
##contig=<ID=chrUn_KI270530v1,length=2168>
##contig=<ID=chrUn_KI270539v1,length=993>
##contig=<ID=chrUn_KI270538v1,length=91309>
##contig=<ID=chrUn_KI270544v1,length=1202>
##contig=<ID=chrUn_KI270548v1,length=1599>
##contig=<ID=chrUn_KI270583v1,length=1400>
##contig=<ID=chrUn_KI270587v1,length=2969>
##contig=<ID=chrUn_KI270580v1,length=1553>
##contig=<ID=chrUn_KI270581v1,length=7046>
##contig=<ID=chrUn_KI270579v1,length=31033>
##contig=<ID=chrUn_KI270589v1,length=44474>
##contig=<ID=chrUn_KI270590v1,length=4685>
##contig=<ID=chrUn_KI270584v1,length=4513>
##contig=<ID=chrUn_KI270582v1,length=6504>
##contig=<ID=chrUn_KI270588v1,length=6158>
##contig=<ID=chrUn_KI270593v1,length=3041>
##contig=<ID=chrUn_KI270591v1,length=5796>
##contig=<ID=chrUn_KI270330v1,length=1652>
##contig=<ID=chrUn_KI270329v1,length=1040>
##contig=<ID=chrUn_KI270334v1,length=1368>
##contig=<ID=chrUn_KI270333v1,length=2699>
##contig=<ID=chrUn_KI270335v1,length=1048>
##contig=<ID=chrUn_KI270338v1,length=1428>
##contig=<ID=chrUn_KI270340v1,length=1428>
##contig=<ID=chrUn_KI270336v1,length=1026>
##contig=<ID=chrUn_KI270337v1,length=1121>
##contig=<ID=chrUn_KI270363v1,length=1803>
##contig=<ID=chrUn_KI270364v1,length=2855>
##contig=<ID=chrUn_KI270362v1,length=3530>
##contig=<ID=chrUn_KI270366v1,length=8320>
##contig=<ID=chrUn_KI270378v1,length=1048>
##contig=<ID=chrUn_KI270379v1,length=1045>
##contig=<ID=chrUn_KI270389v1,length=1298>
##contig=<ID=chrUn_KI270390v1,length=2387>
##contig=<ID=chrUn_KI270387v1,length=1537>
##contig=<ID=chrUn_KI270395v1,length=1143>
##contig=<ID=chrUn_KI270396v1,length=1880>
##contig=<ID=chrUn_KI270388v1,length=1216>
##contig=<ID=chrUn_KI270394v1,length=970>
##contig=<ID=chrUn_KI270386v1,length=1788>
##contig=<ID=chrUn_KI270391v1,length=1484>
##contig=<ID=chrUn_KI270383v1,length=1750>
##contig=<ID=chrUn_KI270393v1,length=1308>
##contig=<ID=chrUn_KI270384v1,length=1658>
##contig=<ID=chrUn_KI270392v1,length=971>
##contig=<ID=chrUn_KI270381v1,length=1930>
##contig=<ID=chrUn_KI270385v1,length=990>
##contig=<ID=chrUn_KI270382v1,length=4215>
##contig=<ID=chrUn_KI270376v1,length=1136>
##contig=<ID=chrUn_KI270374v1,length=2656>
##contig=<ID=chrUn_KI270372v1,length=1650>
##contig=<ID=chrUn_KI270373v1,length=1451>
##contig=<ID=chrUn_KI270375v1,length=2378>
##contig=<ID=chrUn_KI270371v1,length=2805>
##contig=<ID=chrUn_KI270448v1,length=7992>
##contig=<ID=chrUn_KI270521v1,length=7642>
##contig=<ID=chrUn_GL000195v1,length=182896>
##contig=<ID=chrUn_GL000219v1,length=179198>
##contig=<ID=chrUn_GL000220v1,length=161802>
##contig=<ID=chrUn_GL000224v1,length=179693>
##contig=<ID=chrUn_KI270741v1,length=157432>
##contig=<ID=chrUn_GL000226v1,length=15008>
##contig=<ID=chrUn_GL000213v1,length=164239>
##contig=<ID=chrUn_KI270743v1,length=210658>
##contig=<ID=chrUn_KI270744v1,length=168472>
##contig=<ID=chrUn_KI270745v1,length=41891>
##contig=<ID=chrUn_KI270746v1,length=66486>
##contig=<ID=chrUn_KI270747v1,length=198735>
##contig=<ID=chrUn_KI270748v1,length=93321>
##contig=<ID=chrUn_KI270749v1,length=158759>
##contig=<ID=chrUn_KI270750v1,length=148850>
##contig=<ID=chrUn_KI270751v1,length=150742>
##contig=<ID=chrUn_KI270752v1,length=27745>
##contig=<ID=chrUn_KI270753v1,length=62944>
##contig=<ID=chrUn_KI270754v1,length=40191>
##contig=<ID=chrUn_KI270755v1,length=36723>
##contig=<ID=chrUn_KI270756v1,length=79590>
##contig=<ID=chrUn_KI270757v1,length=71251>
##contig=<ID=chrUn_GL000214v1,length=137718>
##contig=<ID=chrUn_KI270742v1,length=186739>
##contig=<ID=chrUn_GL000216v2,length=176608>
##contig=<ID=chrUn_GL000218v1,length=161147>
##contig=<ID=chrEBV,length=171823>
##INFO=<ID=IMPRECISE,Number=0,Type=Flag,Description="Imprecise structural variation">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=SVLEN,Number=.,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=CIPOS,Number=2,Type=Integer,Description="Confidence interval around POS">
##INFO=<ID=CIEND,Number=2,Type=Integer,Description="Confidence interval around END">
##INFO=<ID=CIGAR,Number=A,Type=String,Description="CIGAR alignment for each alternate indel allele">
##INFO=<ID=MATEID,Number=.,Type=String,Description="ID of mate breakend">
##INFO=<ID=EVENT,Number=1,Type=String,Description="ID of event associated to breakend">
##INFO=<ID=HOMLEN,Number=.,Type=Integer,Description="Length of base pair identical homology at event breakpoints">
##INFO=<ID=HOMSEQ,Number=.,Type=String,Description="Sequence of base pair identical homology at event breakpoints">
##INFO=<ID=SVINSLEN,Number=.,Type=Integer,Description="Length of insertion">
##INFO=<ID=SVINSSEQ,Number=.,Type=String,Description="Sequence of insertion">
##INFO=<ID=LEFT_SVINSSEQ,Number=.,Type=String,Description="Known left side of insertion for an insertion of unknown length">
##INFO=<ID=RIGHT_SVINSSEQ,Number=.,Type=String,Description="Known right side of insertion for an insertion of unknown length">
##INFO=<ID=INV3,Number=0,Type=Flag,Description="Inversion breakends open 3' of reported location">
##INFO=<ID=INV5,Number=0,Type=Flag,Description="Inversion breakends open 5' of reported location">
##INFO=<ID=BND_DEPTH,Number=1,Type=Integer,Description="Read depth at local translocation breakend">
##INFO=<ID=MATE_BND_DEPTH,Number=1,Type=Integer,Description="Read depth at remote translocation mate breakend">
##INFO=<ID=JUNCTION_QUAL,Number=1,Type=Integer,Description="If the SV junction is part of an EVENT (ie. a multi-adjacency variant), this field provides the QUAL value for the adjacency in question only">
##FORMAT=<ID=DQ,Number=1,Type=Integer,Description="De novo quality score">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=FT,Number=1,Type=String,Description="Sample filter, 'PASS' indicates that all filters have passed for this sample">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
##FORMAT=<ID=PR,Number=.,Type=Integer,Description="Spanning paired-read support for the ref and alt alleles in the order listed">
##FORMAT=<ID=SR,Number=.,Type=Integer,Description="Split reads for the ref and alt alleles in the order listed, for reads where P(allele|read)>0.999">
##FILTER=<ID=Ploidy,Description="For DEL & DUP variants, the genotypes of overlapping variants (with similar size) are inconsistent with diploid expectation">
##FILTER=<ID=MaxDepth,Description="Depth is greater than 3x the median chromosome depth near one or both variant breakends">
##FILTER=<ID=MaxMQ0Frac,Description="For a small variant (<1000 bases), the fraction of reads in all samples with MAPQ0 around either breakend exceeds 0.4">
##FILTER=<ID=NoPairSupport,Description="For variants significantly larger than the paired read fragment size, no paired reads support the alternate allele in any sample.">
##FILTER=<ID=MinQUAL,Description="QUAL score is less than 20">
##FILTER=<ID=MinGQ,Description="GQ score is less than 15 (filter applied at sample level and record level if all samples are filtered)">
##ALT=<ID=INV,Description="Inversion">
##ALT=<ID=DEL,Description="Deletion">
##ALT=<ID=INS,Description="Insertion">
##ALT=<ID=DUP:TANDEM,Description="Tandem Duplication">
##cmdline=/gsc/software/linux-x86_64-centos6/manta-1.4.0/bin/configManta.py --bam /projects/bam /projects/path//file1.bam --bam /projects/path/file2.bam --bam /projects/bam /projects/path/file3.bam --referenceFasta /projects/path/hg38_no_alt.fa --runDir /projects/path/MantaResults_withHeader/
The error comes when:
(base) [sdada@gphost14 MAVIS]$ mavis setup file.cfg -o /path/MAVIS
MAVIS: 2.2.9
hostname: gphost14.bcgsc.ca
[2021-04-28 14:39:13] arguments
command = 'setup'
config = '/projects/path/file.cfg'
log = None
log_level = 'INFO'
output = '/projects/path/MAVIS'
skip_stage = []
creating output directory: '/projects/path/MAVIS/converted_inputs'
setting up the directory structure for PG0003838 as /projects/path/MAVIS/PG0003_normal_genome
converting input command: ['convert_tool_output', '/projects/filepath/file1.vcf', 'delly', False]
reading: /projects/file1.vcf
Header for DELLY
(base) [sdada@gphost14 hg38_PG0003514_3838_4859_fastq_SV]$ grep '^#' file.vcf
Grepping the DELLY VCF for chr19 does return a result. I can't scroll up to the error, but I can quickly rerun it.
chrUn_KI270516v1 2 BND00073768 N ]chr19:27302623] 67 LowQual PRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.7;END=3;CHR2=chr19;POS2=27302623;PE=0;MAPQ=0;CT=5to3;CIPOS=-1,1;CIEND=-1,1;SRMAPQ=14;INSLEN=0;HOMLEN=0;SR=4;SRQ=0.951613;CONSENSUS=CAATTTGGAGAGTTTTGAGGCCTATTGTGGAAAGATATATCCTAAAATAAAAAATACACGGAAGCATTCTGAGAAACTTCATTGTTTTGTGTGCATTCAACTCACAGAGTTGAACCTATCT;CE=1.9285 GT:GL:GQ:FT:RCL:RC:RCR:RDCN:DR:DV:RR:RV 0/1:-129.223,0,-44.7728:10000:PASS:0:110747:110747:2:0:0:19:59 0/1:-107.543,0,-42.0154:10000:PASS:0:95614:95614:2:0:0:16:41
(base) [sdada@gphost14 DellyResults_withHeader]$ grep 'chr19:27302623' dellyfile2.vcf
chrUn_KI270516v1 2 BND00069701 N ]chr19:27302623] 67 LowQual PRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.7;END=3;CHR2=chr19;POS2=27302623;PE=0;MAPQ=0;CT=5to3;CIPOS=-1,1;CIEND=-1,1;SRMAPQ=14;INSLEN=0;HOMLEN=0;SR=4;SRQ=0.951613;CONSENSUS=CAATTTGGAGAGTTTTGAGGCCTATTGTGGAAAGATATATCCTAAAATAAAAAATACACGGAAGCATTCTGAGAAACTTCATTGTTTTGTGTGCATTCAACTCACAGAGTTGAACCTATCT;CE=1.9285 GT:GL:GQ:FT:RCL:RC:RCR:RDCN:DR:DV:RR:RV 0/1:-129.223,0,-44.7728:10000:PASS:0:110747:110747:2:0:0:19:59 0/1:-94.3442,0,-45.4931:10000:PASS:0:88163:88163:2:0:0:17:39
OK, so I might actually know what's going on here. It looks like DELLY is using an ALT format that is close to, but doesn't match, the VCF 4.2 specification (despite the header).
r = reference base/seq
u = untemplated sequence/alternate sequence
p = chromosome:position
They have ]p]
but we expect that to include sequences: ]p]ur
The alternate sequence can be empty, but we don't consider the reference sequence optional. @calchoo can you double-check the VCF 4.2 format for this and make sure I'm interpreting it correctly?
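A minimal sketch of the check described above, for anyone who wants to test a single ALT string by hand. The VCF specification lists four breakend forms (t[p[, t]p], ]p]t and [p[t), each with at least one base on one side of the bracketed mate position. The function name `describe_bnd_alt` and the returned keys are made up for this example; this is not the MAVIS `parse_bnd_alt` implementation, only an illustration of the same idea.

```python
import re

# Optional bases, a bracketed mate position, optional bases.
# The two brackets must be the same character ("[" or "]").
BND_ALT = re.compile(
    r"^(?P<prefix>[A-Za-z.]*)"
    r"(?P<open>[\[\]])(?P<chrom>[^\[\]]+):(?P<pos>\d+)(?P<close>[\[\]])"
    r"(?P<suffix>[A-Za-z.]*)$"
)


def describe_bnd_alt(alt):
    """Split a breakend ALT string into its parts (hypothetical helper)."""
    match = BND_ALT.match(alt)
    if match is None or match["open"] != match["close"]:
        raise ValueError(f"not a breakend ALT: {alt!r}")
    prefix, suffix = match["prefix"], match["suffix"]
    if prefix and suffix:
        raise ValueError(f"bases on both sides of the mate position: {alt!r}")
    return {
        "mate_chrom": match["chrom"],
        "mate_pos": int(match["pos"]),
        "bracket": match["open"],
        "bases": prefix or suffix,
        # Which side of the bracketed position the bases sit on, if any.
        "bases_side": "before" if prefix else ("after" if suffix else None),
        # False for the bare ]p] form seen in the failing records.
        "has_bases": bool(prefix or suffix),
    }


if __name__ == "__main__":
    for alt in ("T]chr3:93470801]", "[chr3:93470364[G", "]chr8:43240816]"):
        print(alt, describe_bnd_alt(alt))
```

With the records posted in this thread, `T]chr3:93470801]` and `[chr3:93470364[G` come back with `has_bases` set to True, while `]chr8:43240816]` and `]chr19:27302623]` come back with it set to False.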
What version of delly are you using?
I am using /gsc/software/linux-x86_64-centos7/delly-0.8.7/bin/delly. Thanks, Cara!!
I used delly-0.8.1 and am clearly making an error here:
MAVIS: 2.2.9
hostname: gphost14.bcgsc.ca
[2021-04-29 16:30:22] arguments
command = 'setup'
config = '/projects/pathMAVIS/file_both.cfg'
log = None
log_level = 'INFO'
output = '/projects/path/MAVIS'
skip_stage = []
Traceback (most recent call last):
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/util.py", line 65, in filepath
file_list = bash_expands(path)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/util.py", line 158, in bash_expands
raise FileNotFoundError('The expression does not match any files', expression)
FileNotFoundError: [Errno The expression does not match any files] None
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sdada/miniconda3/bin/mavis", line 8, in <module>
sys.exit(main())
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/main.py", line 482, in main
config = _config.MavisConfig.read(args.config)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/config.py", line 379, in read
return MavisConfig(**config_dict)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/config.py", line 264, in __init__
raise err
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/config.py", line 257, in __init__
self[sec] = validate_section(kwargs.pop(sec, {}), defaults, True)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/config.py", line 466, in validate_section
value = cast_type(value)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/util.py", line 67, in filepath
raise TypeError('File does not exist', path)
TypeError: Error in validating the reference section in the config. File does not exist None
This sounds like a reference file you need is missing from the config, or is set to None. What does the reference section of your MAVIS config look like?
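One way to answer that question quickly is to check every entry of the reference section against the filesystem before running setup. The sketch below assumes the config is an INI-style file readable by Python's `configparser`, that the section is named `reference` (as the error message suggests), and that each value is a single path or glob pattern. The function name `missing_reference_files` is made up for this example and is not part of MAVIS.

```python
import configparser
import glob
import os


def missing_reference_files(config_path, section="reference"):
    """Return {key: problem} for entries in `section` that match no file.

    Assumption: each value is a single path or glob pattern.
    """
    parser = configparser.ConfigParser(interpolation=None)
    with open(config_path) as handle:
        parser.read_file(handle)
    if not parser.has_section(section):
        return {section: "section not found in config"}

    problems = {}
    for key, raw_value in parser.items(section):
        pattern = os.path.expanduser(os.path.expandvars(raw_value.strip()))
        if not pattern or pattern.lower() == "none":
            problems[key] = "empty or None"
        elif not glob.glob(pattern):
            problems[key] = f"no file matches {pattern!r}"
    return problems


if __name__ == "__main__":
    import sys

    for key, problem in missing_reference_files(sys.argv[1]).items():
        print(f"{key}: {problem}")
```

Run it with the path to the .cfg file as the only argument; no output means every entry in the section matched at least one file.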
Hi Cara,
(Un)shockingly, you are right; I didn't have the right masking etc. for my reference. I changed the reference. THANKS
The error is still there, this time on a chr8 breakend. Really sorry if I'm messing something up with DELLY.
(base) [sdada@gphost14 MAVIS]$ mavis setup file.cfg -o /projects/path/MAVIS
MAVIS: 2.2.9
hostname: gphost14.bcgsc.ca
[2021-04-30 11:09:34] arguments
command = 'setup'
config = '/projects/path/MAVIS/file.cfg'
log = None
log_level = 'INFO'
output = '/projects/path/MAVIS'
skip_stage = []
creating output directory: '/projects/path/MAVIS/converted_inputs'
setting up the directory structure for PG0003838 as /projects/path/MAVIS/PG0003838_normal_genome
converting input command: ['convert_tool_output', '/projects/path/DELLY_0.8.1/patientcompared_dellysomatic_v81_hg38header.vcf', 'delly', False]
reading: /projects/path/DELLY_0.8.1/patientcompared_dellysomatic_v81_hg38header.vcf
Traceback (most recent call last):
File "/home/sdada/miniconda3/bin/mavis", line 8, in <module>
sys.exit(main())
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/main.py", line 600, in main
raise err
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/main.py", line 582, in main
pipeline = _pipeline.Pipeline.build(config)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/schedule/pipeline.py", line 350, in build
libconf.inputs = run_conversion(config, libconf, conversion_dir)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/schedule/pipeline.py", line 81, in run_conversion
convert_tool_output(
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/tools/__init__.py", line 34, in convert_tool_output
_convert_tool_output(
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/tools/__init__.py", line 276, in _convert_tool_output
rows = read_vcf(input_file, file_type, log)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/tools/vcf.py", line 197, in convert_file
raise err
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/tools/vcf.py", line 194, in convert_file
rows.extend(convert_record(vcf_record, log=log))
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/tools/vcf.py", line 110, in convert_record
chr2, end, orient1, orient2, ref, alt = parse_bnd_alt(alt)
File "/home/sdada/miniconda3/lib/python3.8/site-packages/mavis/tools/vcf.py", line 74, in parse_bnd_alt
raise NotImplementedError('alt specification in unexpected format', alt)
NotImplementedError: ('alt specification in unexpected format', ']chr8:43240816]')
hello friends and fam
It seems like it's an issue with my VCF (note that the bottom record, the one with 'N' as its reference base, has an odd ALT pattern and is missing a letter). I'm going to redo the BCF-to-VCF conversion, and if that doesn't work I'll regenerate the DELLY BCF. The other DELLY VCF works, which is a nice control. If none of this works, I'll try editing it manually.
(base) [sdada@gphost14 DELLY_0.8.1]$ grep -B 10 'chr8:43240816' file_dellysomatic_v81_hg38header.vcf
chrUn_KI270336v1 2 BND00064084 A [chrUn_KI270467v1:711[A . PASS IMPRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chrUn_KI270467v1;END=711;PE=11;MAPQ=23;CT=5to5;CIPOS=-51,51;CIEND=-51,51 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/1:-114.835,0,-264.594:10000:PASS:0:522:234:4:52:108:0:0 0/1:-120.162,0,-238.566:10000:PASS:0:1665:388:9:46:88:0:0
chrUn_KI270336v1 284 BND00064085 T T]chrUn_KI270467v1:2612] . LowQual IMPRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chrUn_KI270467v1;END=2612;PE=2;MAPQ=24;CT=3to3;CIPOS=-395,395;CIEND=-395,395 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/0:0,-1000,-1000:10000:PASS:472:759:25470:0:6944:385:0:0 0/1:-1000,0,-1000:10000:PASS:1314:1830:29355:0:3639:1227:0:0
chrUn_KI270336v1 315 DEL00064086 A <DEL> . LowQual IMPRECISE;SVTYPE=DEL;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chrUn_KI270336v1;END=831;PE=2;MAPQ=6;CT=3to5;CIPOS=-215,215;CIEND=-215,215 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/0:0,-47.4989,-383.219:10000:PASS:497:297:1161:0:175:0:0:0 0/0:0,-56.4709,-532.986:10000:PASS:1529:451:1669:0:203:1:0:0
chrUn_KI270336v1 440 BND00064087 C C]chrUn_KI270466v1:1097] . LowQual IMPRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chrUn_KI270466v1;END=1097;PE=3;MAPQ=21;CT=3to3;CIPOS=-50,50;CIEND=-50,50 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/1:-30.5308,0,-1000:10000:PASS:514:1810:4731:1:664:121:0:0 0/1:-796.708,0,-1000:10000:PASS:1586:3228:7851:1:1473:401:0:0
chrUn_KI270336v1 444 BND00064088 T T]chrUn_KI270467v1:1485] . LowQual IMPRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chrUn_KI270467v1;END=1485;PE=2;MAPQ=21;CT=3to3;CIPOS=-50,50;CIEND=-50,50 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/1:-103.653,0,-399.956:10000:PASS:514:1821:1128:2:143:116:0:0 0/1:-536.932,0,-484.266:10000:PASS:1609:3349:2367:2:205:424:0:0
chrUn_KI270336v1 781 BND00064089 G [chr3:93470364[G . LowQual PRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr3;END=93470364;PE=0;MAPQ=0;CT=5to5;CIPOS=-4,4;CIEND=-4,4;SRMAPQ=12;INSLEN=0;HOMLEN=4;SR=5;SRQ=0.952;CONSENSUS=TGATATTTTTTGTACAGTATAGAATATATACTTTGGGTATTTTGATATTTTATGTACAGTATACAATGTATGGTTTCTGAACTTTGATATTTCATGTAGAGTATAAAATATATATTTGGGGTACA;CE=1.73818 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 1/1:-1000,-199.238,0:10000:PASS:299:1496:158359:0:0:0:32:923 1/1:-1000,-65.3614,0:10000:PASS:568:2387:152387:0:0:0:19:346
chrUn_KI270336v1 861 BND00064090 T T]chr3:93470801] . PASS IMPRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr3;END=93470801;PE=37;MAPQ=27;CT=3to3;CIPOS=-50,50;CIEND=-50,50 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/1:-1000,0,-195.757:10000:PASS:447:1447:49:6:135:400:0:0 1/1:-1000,-524.548,0:10000:PASS:672:2094:54:6:175:3134:0:0
chrUn_KI270336v1 893 BND00064091 C C[chr4:51107269[ . LowQual IMPRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr4;END=51107269;PE=4;MAPQ=23;CT=3to5;CIPOS=-469,469;CIEND=-469,469 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/1:-81.404,0,-261.164:10000:PASS:646:1446:8:4:137:137:0:0 0/1:-42.8954,0,-334.227:10000:PASS:847:2086:21:5:187:80:0:0
chrUn_KI270336v1 909 BND00064092 A [chr3:93470362[A . LowQual PRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr3;END=93470362;PE=0;MAPQ=0;CT=5to5;CIPOS=-9,9;CIEND=-9,9;SRMAPQ=9;INSLEN=0;HOMLEN=8;SR=95;SRQ=0.970414;CONSENSUS=TACAGTATAGAATATATACCTTGGGTACTTTGATATTTTATGTACAGTATATAATATATGGTTTGTGAACTTTGATATTTCATGTAGAGTATAAAATATATATTTGGGGTACATTGATATTATATGTACAGTATATAATCTATATTTGATGTACTTTCATATTTTATGT;CE=1.73195 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 1/1:-1000,-1000,0:10000:PASS:849:1443:158359:0:0:0:521:12926 1/1:-1000,-1000,0:10000:PASS:991:2082:152386:0:0:0:423:7575
chrUn_KI270336v1 926 BND00064093 T T]chr3:93470799] . PASS PRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr3;END=93470799;PE=0;MAPQ=0;CT=3to3;CIPOS=-13,13;CIEND=-13,13;SRMAPQ=32;INSLEN=0;HOMLEN=12;SR=5;SRQ=0.968504;CONSENSUS=AAATATAGATTATATACTGTACATAAAATATCAAAGTACCCCAATATATATTATATACTGTACATGAAATATCAAAGTTCACAAACTATATATTATGTACTGTACATAAAATATCAAAGTACCCA;CE=1.72472 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 1/1:-1000,-1000,0:10000:PASS:1211:1442:49:2:0:0:175:13374 1/1:-1000,-1000,0:10000:PASS:1225:2072:54:3:0:0:183:7818
chrUn_KI270337v1 2 BND00064094 N ]chr8:43240816] . LowQual PRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.8.1;CHR2=chr8;END=43240816;PE=0;MAPQ=0;CT=5to3;CIPOS=-2,2;CIEND=-2,2;SRMAPQ=15;INSLEN=0;HOMLEN=1;SR=3;SRQ=1;CONSENSUS=ATTGTATACTGTACATAAAATATCAAAGTATCCAAAGTATGTATTATAAGCTGTAGATAAAATATCAAAGTACCCAAACTATATATTATATACTGTACATAAAATATGAAAGTACCCAAAGTAT;CE=1.76063 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/1:-755.564,0,-50.8002:10000:PASS:0:1700:167:20:0:0:64:298 0/1:-81.8446,0,-127.519:10000:PASS:0:5742:270:43:0:0:68:51
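Rather than grepping for one coordinate at a time, the whole file can be scanned for records like the last one above, where the ALT column is only the bracketed mate position. In the records posted in this thread, each such record sits at POS 2 of a chrUn contig with REF N, though whether that is the cause is not established here. The sketch below is a plain-text scan with no VCF library; the function name `breakends_without_bases` is made up for this example, and it splits on any whitespace because pasted records often lose their tabs.

```python
def breakends_without_bases(vcf_lines):
    """Yield (chrom, pos, id, ref, alt) for ALT values that are only "]p]" or "[p[".

    `vcf_lines` is any iterable of text lines, e.g. an open uncompressed VCF file.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.split()
        if len(fields) < 5:
            continue  # not a complete record
        chrom, pos, record_id, ref, alts = fields[:5]
        for alt in alts.split(","):
            # A bracket at both ends leaves no room for a base on either side.
            if len(alt) > 2 and alt[0] in "[]" and alt[-1] in "[]":
                yield chrom, int(pos), record_id, ref, alt


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for hit in breakends_without_bases(handle):
            print(*hit, sep="\t")
```

Pass the path to an uncompressed VCF as the only argument; each printed row is a record that would need attention before conversion.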
ok, looks like this is an input issue rather than a bug so I am going to remove the bug label for now. If you are able to determine whether it is a problem with DELLY or BCFtools then you can reference this issue when you create one in the corresponding repo for the bug. In the meantime if we are
Describe the bug
This bug shows the following error during setup; it might be an issue with parsing. I am running manta-1.4.0, and the data might be in the wrong format.
To Reproduce
Steps to reproduce the behavior: the .cfg file was made using the command
mavis config \
--library PG00038 genome normal False hg38_PG1.bam \
--library PG00037 genome normal False hg38_PG2.bam \
--library PG00048 genome diseased False hg38_PG3.bam \
--convert delly PG_dellysomatic_hg38.vcf delly \
--convert manta diploidSV.de_novo.vcf manta \
--assign PG1 delly manta \
--assign PG2 delly manta \
--assign PG3 delly manta \
-w PG0001_2_3.cfg
The problem
Expected behavior
Consolidation of SV data