twolinin / longphase

GNU General Public License v3.0

Support for other SNPcaller #5

Closed tuannguyen8390 closed 2 years ago

tuannguyen8390 commented 2 years ago

Hi dev team,

This is not really an issue, since I am able to run LongPhase on my cluster. But will it support VCFs from other SNP callers, for example Clair3? Is there any special flag or format I need to be aware of?

Many thanks,

Tuan

ythuang0522 commented 2 years ago

@tuannguyen8390 We have only tested a few other callers (e.g., longshot), not Clair3, but there should be no problem, as Clair follows the VCF v4.2 spec.

tuannguyen8390 commented 2 years ago

Many thanks for the reply, I will try that and will get back if I run into any problem.

ythuang0522 commented 2 years ago

I am closing this issue as others told me it works with Clair3.

tuannguyen8390 commented 2 years ago

Yes, I can confirm that it does work with Clair3. Many thanks

btrainee commented 2 years ago

@ythuang0522 May I ask whether the variant calls from longshot need any special processing? I can currently run phasing on the longshot results, but when I run haplotag, for some reason nothing is output.

phased SNP file: phasing_result.vcf
phased SV file:
input bam file: ../../1.alignment/pacbioccs.minimap.bam
output bam file: haplotag_snponly.bam
number of threads: 12
write log file: true
log file: haplotag_snponly.out

filter mapping quality below: 20
percentage threshold: 0.6
tag supplementary: false

parsing SNP VCF ... 11s
tag read start ...
tag read 0s

total process time: 11s
total alignment: 0
total supplementary: 0
total secondary: 0
total unmapped: 0
total tag alignment: 0
total untagged: 0

twolinin commented 2 years ago

Hi @btrainee

Could you please provide a small part of the longshot VCF and of phasing_result.vcf?

Thanks

btrainee commented 2 years ago

Hi @twolinin, here are the first 5 lines of the longshot VCF (zgrep -v '^##' longshot.vcf.gz | head -5):

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
ptg000001l 3967 . T C 70.34 dn DP=7;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.008;AQ=35.80;GM=1;DA=7;MQ10=1.00;MQ20=1.00;MQ30=1.00;MQ40=1.00;MQ50=1.00;PH=70.34,0.00,250.80,142.68;SC=ACCCTAACCCTAACCCTAACC; GT:GQ:PS:UG:UQ 0|1:70.34:3967:0/1:52.29
ptg000001l 3979 . T C 70.35 dn DP=7;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.008;AQ=35.80;GM=1;DA=7;MQ10=1.00;MQ20=1.00;MQ30=1.00;MQ40=1.00;MQ50=1.00;PH=70.35,0.00,250.80,142.68;SC=ACCCTAACCCTAACCCTAACC; GT:GQ:PS:UG:UQ 0|1:70.35:3967:0/1:52.29
ptg000001l 4068 . G A 70.38 dn DP=7;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.008;AQ=36.07;GM=1;DA=7;MQ10=1.00;MQ20=1.00;MQ30=1.00;MQ40=1.00;MQ50=1.00;PH=70.38,0.00,252.47,144.31;SC=CCTAAACCCTGACCCTGACCC; GT:GQ:PS:UG:UQ 0|1:70.38:3967:0/1:52.32
ptg000001l 4074 . G A 68.33 dn DP=7;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.008;AQ=35.76;GM=1;DA=7;MQ10=1.00;MQ20=1.00;MQ30=1.00;MQ40=1.00;MQ50=1.00;PH=68.33,0.00,250.41,144.31;SC=CCCTGACCCTGACCCTGACCC; GT:GQ:PS:UG:UQ 0|1:68.33:3967:0/1:50.27

Here are the first 5 lines of phasing_result.vcf:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
ptg000001l 3967 . T C 70.34 dn DP=7;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.008;AQ=35.80;GM=1;DA=7;MQ10=1.00;MQ20=1.00;MQ30=1.00;MQ40=1.00;MQ50=1.00;PH=70.34,0.00,250.80,142.68;SC=ACCCTAACCCTAACCCTAACC; GT:GQ:UG:UQ:PS 0|1:70.34:0/1:52.29:3967
ptg000001l 3979 . T C 70.35 dn DP=7;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.008;AQ=35.80;GM=1;DA=7;MQ10=1.00;MQ20=1.00;MQ30=1.00;MQ40=1.00;MQ50=1.00;PH=70.35,0.00,250.80,142.68;SC=ACCCTAACCCTAACCCTAACC; GT:GQ:UG:UQ:PS 0|1:70.35:0/1:52.29:3967
ptg000001l 4068 . G A 70.38 dn DP=7;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.008;AQ=36.07;GM=1;DA=7;MQ10=1.00;MQ20=1.00;MQ30=1.00;MQ40=1.00;MQ50=1.00;PH=70.38,0.00,252.47,144.31;SC=CCTAAACCCTGACCCTGACCC; GT:GQ:UG:UQ:PS 0|1:70.38:0/1:52.32:3967
ptg000001l 4074 . G A 68.33 dn DP=7;AC=4,3;AM=0;MC=0;MF=0.000;MB=0.008;AQ=35.76;GM=1;DA=7;MQ10=1.00;MQ20=1.00;MQ30=1.00;MQ40=1.00;MQ50=1.00;PH=68.33,0.00,250.41,144.31;SC=CCCTGACCCTGACCCTGACCC; GT:GQ:UG:UQ:PS 0|1:68.33:0/1:50.27:3967

Many thanks!
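
The two excerpts above differ only in where PS sits in the FORMAT column (GT:GQ:PS:UG:UQ from longshot, GT:GQ:UG:UQ:PS after phasing). As a minimal sketch, not LongPhase's own parsing code, the following Python shows that looking up a sample value by its FORMAT key gives the same phase set either way; the function name `sample_field` is made up for illustration.

```python
from typing import Optional


def sample_field(format_col: str, sample_col: str, key: str) -> Optional[str]:
    """Return the sample value for `key`, matching by FORMAT key rather than position."""
    keys = format_col.split(":")
    values = sample_col.split(":")
    # zip pairs each FORMAT key with the value at the same position
    return dict(zip(keys, values)).get(key)


# First record of the longshot VCF and of phasing_result.vcf, as posted above
before = sample_field("GT:GQ:PS:UG:UQ", "0|1:70.34:3967:0/1:52.29", "PS")
after = sample_field("GT:GQ:UG:UQ:PS", "0|1:70.34:0/1:52.29:3967", "PS")
print(before, after)
```

Because the lookup is by key, the reordered FORMAT column by itself should not matter to a parser written this way.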

twolinin commented 2 years ago

Hi @btrainee

Could you check whether the VCF header contains contig or chromosome information? For example:

##contig=<ID=1,length=249250621>
##contig=<ID=2,length=243199373>
##contig=<ID=3,length=198022430>

Thanks
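
One way to run this check is sketched below in Python, using only the standard library. It collects the IDs declared in `##contig=<ID=...>` header lines; an empty result means the header has no contig information. The helper names `contig_ids` and `read_lines` are illustrative and not part of LongPhase.

```python
import gzip
import re
from typing import Iterable, Iterator, List

# Matches a VCF contig header line and captures its ID value
_CONTIG_ID = re.compile(r"^##contig=<.*?\bID=([^,>]+)")


def contig_ids(lines: Iterable[str]) -> List[str]:
    """Return the IDs declared in ##contig header lines, in file order."""
    ids = []
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends at the first variant record
        match = _CONTIG_ID.match(line)
        if match:
            ids.append(match.group(1))
    return ids


def read_lines(path: str) -> Iterator[str]:
    """Yield text lines from a plain or gzip-compressed VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        yield from handle


# Usage (file name taken from the thread):
# print(contig_ids(read_lines("longshot.vcf.gz")))
```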

btrainee commented 2 years ago

Hi @twolinin, there is no contig or chromosome information in the VCF header. So is this the reason for the problem?
I'll try to add this information. Many thanks!

btrainee commented 2 years ago

Hi @twolinin, I have tried it and it worked well. Thank you very much!
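
The thread does not say how the contig information was added. As one possible approach, the sketch below builds `##contig` lines from a samtools-style `.fai` index (name in column 1, length in column 2) and inserts them just before the `#CHROM` line. It assumes a `.fai` file for the same reference is available; the function names and the contig length in the example are hypothetical.

```python
from typing import Iterable, List


def contig_lines_from_fai(fai_lines: Iterable[str]) -> List[str]:
    """Build ##contig header lines from .fai rows (tab-separated: name, length, ...)."""
    lines = []
    for row in fai_lines:
        if not row.strip():
            continue  # skip blank lines
        name, length = row.rstrip("\n").split("\t")[:2]
        lines.append(f"##contig=<ID={name},length={int(length)}>")
    return lines


def add_contig_lines(vcf_lines: Iterable[str], contig_lines: List[str]) -> List[str]:
    """Return the VCF lines with contig_lines inserted right before #CHROM."""
    out = []
    inserted = False
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#CHROM") and not inserted:
            out.extend(contig_lines)
            inserted = True
        out.append(line)
    return out


# Example with a made-up length for the contig seen in the thread
fai = ["ptg000001l\t5000\t12\t60\t61\n"]
vcf = ["##fileformat=VCFv4.2\n", "#CHROM\tPOS\n", "ptg000001l\t3967\n"]
for line in add_contig_lines(vcf, contig_lines_from_fai(fai)):
    print(line)
```

The sketch does not check whether the header already has contig lines, so it is meant for headers that lack them entirely, as in the case reported here.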