Closed wwang-chcn closed 3 years ago
Hi @wwang-chcn,
It appears the VCF file is somewhat reasonably formatted, so the SNPsplit approach might just work.
You will need to identify the line that splits the FORMAT field according to the information from the INFO field.
Since INFO is GT:GQ:DP:AD:VAF:PL:FI, I assume it could be something like this:
my ($gt,$gq,$dp,$ad,$vaf,$pl,$fi);
# INFO: GT:GQ:DP:AD:VAF:PL:FI
($gt,$gq,$dp,$ad,$vaf,$pl,$fi) = split/:/,$strain;
You might then also have to think about which genotypes you want to tolerate, e.g. only homozygous variants such as 1/1, and whether or not you require the filter value ($fi) to be 1. You should be able to read through the script and make the required adjustments.
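As a rough illustration of the genotype and filter checks described above, here is a minimal sketch. It assumes the field order GT:GQ:DP:AD:VAF:PL:FI; the helper name keep_genotype and the sample strings are made up for this example and are not part of SNPsplit.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SNPsplit): decide whether one sample column
# should be kept, assuming the field order GT:GQ:DP:AD:VAF:PL:FI.
sub keep_genotype {
    my ($strain) = @_;
    my ($gt, $gq, $dp, $ad, $vaf, $pl, $fi) = split /:/, $strain;

    # Require a homozygous alternative call (1/1, or phased 1|1).
    return 0 unless defined $gt and $gt =~ m{^1[/|]1$};

    # Require the filter flag to be set to 1.
    return 0 unless defined $fi and $fi eq '1';

    return 1;
}

# Made-up sample columns, for illustration only.
for my $strain ('1/1:99:30:0,30:1.0:255,90,0:1', '0/1:99:30:15,15:0.5:120,0,120:1') {
    printf "%s\t%s\n", $strain, keep_genotype($strain) ? 'keep' : 'skip';
}
```

Whether heterozygous calls or positions with a filter value other than 1 should be tolerated depends on your data, so treat both checks as starting points.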
Let me know if you encounter additional issues.
Cheers, Felix
Thanks Felix,
I just found the key to this problem and created pull request #48, which can handle a variable order of the INFO fields.
Yours, Wen
I have now tried to write a stable auto-detect version, which works for both the v5 and v7 Mouse Genomes Project files. Could you clone the current dev version and see if it also works with your fishes? Edited here: 5e605d86c28ba32b1b9b3388117a64e7ac397097
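For readers wondering how an order-independent lookup can work in principle, here is a minimal sketch. It is not the code from the commit above; the helper name parse_sample and the sample string are assumptions made for illustration. The idea is to pair each key from the key string with the value at the same position, so later code can ask for a field by name rather than by index.

```perl
use strict;
use warnings;

# Hypothetical helper (not the code from the commit above): map the field
# names in the key string to the values of one sample column.
sub parse_sample {
    my ($keys, $strain) = @_;
    my @names  = split /:/, $keys;
    my @values = split /:/, $strain;
    my %field;
    @field{@names} = @values;    # hash slice pairs names with values by position
    return \%field;
}

# Made-up sample column, for illustration only.
my $f = parse_sample('GT:GQ:DP:AD:VAF:PL:FI', '1/1:99:30:0,30:1.0:255,90,0:1');
print "GT=$f->{GT} FI=$f->{FI}\n";
```

With this approach, a file that lists the same keys in a different order would still yield the right value for GT or FI, and a key that is absent simply has no entry in the hash.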
So I assume we can close this issue now?
Yes :)
I am experiencing a processing error when using SNPsplit_genome_preparation.
SNPsplit Genome Preparation version: 0.4.0
command:
processing log:
vcf file
Could you help me?