twolinin / longphase


No FORMAT/PS description in header in SV output VCF #27

Closed tgong1 closed 1 year ago

tgong1 commented 1 year ago

Hi,

I found that the header of the SV output VCF has no line defining the PS tag, while the SNP output VCF has the line below:

##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set identifier">

This causes many problems in my downstream steps. Do you have any idea why it happens? I have tried both versions 1.2 and 1.4. Any suggestions or ideas would be appreciated!

Thank you for your time and help, Tingting

twolinin commented 1 year ago

Hi @tgong1

Could you check whether the following command returns any header lines from your SV output VCF? I suspect the issue is related to the code that avoids duplicate header definitions (https://github.com/twolinin/longphase/blob/v1.4/ParsingBam.cpp#L734).

grep "#" your_SV.vcf | grep "ID=PS"
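Note that grep "ID=PS" matches any header line containing that substring, so a FORMAT tag whose ID merely begins with PS would also show up. As a rough sketch, the check can be made exact in Python. The function names and the look-alike tag PSX in the usage note are hypothetical and chosen only for illustration; nothing here is taken from longphase itself, and the file reader assumes an uncompressed VCF.

```python
import re

# Captures the ID of a header line such as ##FORMAT=<ID=PS,Number=1,...>
FORMAT_ID = re.compile(r"^##FORMAT=<ID=([^,>]+)")


def read_header(path):
    """Return the header lines (those starting with '#') of a plain-text VCF."""
    header = []
    with open(path) as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # header ends at the first record
            header.append(line.rstrip("\n"))
    return header


def format_ids(header_lines):
    """Return the FORMAT IDs declared in the given header lines."""
    ids = []
    for line in header_lines:
        match = FORMAT_ID.match(line)
        if match:
            ids.append(match.group(1))
    return ids


def ps_report(header_lines):
    """Return (exact PS defined?, list of other FORMAT IDs starting with 'PS')."""
    ids = format_ids(header_lines)
    lookalikes = [i for i in ids if i.startswith("PS") and i != "PS"]
    return "PS" in ids, lookalikes
```

Calling ps_report(read_header("your_SV.vcf")) returns a pair: whether an exact PS definition exists, and any other FORMAT IDs starting with PS (for example a hypothetical PSX) that the grep above would also have matched.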

Thanks

tgong1 commented 1 year ago

Hi,

Thank you for the quick reply. Here is the result of running the command. I believe this tag comes from SURVIVOR, the SV merging tool we used.

FORMAT=

I can remove this tag from the SV input VCF if that would help.

Thank you, Tingting

twolinin commented 1 year ago

Hi @tgong1

This problem will be addressed in the next version update. At the moment, you can manually add the following PS definition to the header.

##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set identifier">
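For many files, the manual edit can be scripted. The following is a minimal sketch, not part of longphase: it inserts the PS definition just before the #CHROM column line when no exact PS definition is present. The function names are hypothetical, and the file wrapper assumes an uncompressed VCF.

```python
PS_LINE = '##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set identifier">'


def add_ps_header(lines):
    """Return a copy of lines with PS_LINE placed before #CHROM if PS is undefined.

    lines are VCF lines without trailing newlines.
    """
    # Match the exact ID (followed by a comma) so look-alike IDs do not count.
    has_ps = any(line.startswith("##FORMAT=<ID=PS,") for line in lines)
    out = []
    for line in lines:
        if not has_ps and line.startswith("#CHROM"):
            out.append(PS_LINE)
            has_ps = True
        out.append(line)
    return out


def patch_vcf(src, dst):
    """Read the VCF at src, add the PS definition if needed, and write to dst."""
    with open(src) as handle:
        lines = handle.read().splitlines()
    with open(dst, "w") as handle:
        handle.write("\n".join(add_ps_header(lines)) + "\n")
```

Running patch_vcf on a file that already defines PS leaves its content unchanged, so it is safe to apply to mixed batches of outputs.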

Thanks