Closed antonylebechec closed 2 months ago
Parameters file 'param.samples_filter.json':
{
  "samples": {
    "list": ["sample1", "sample2"]
  }
}
Command to export/convert into VCF file:
howard convert --input='tests/data/example.vcf.gz' --output='/tmp/example.filtered.vcf' --param='config/param.samples_filter.json'
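The two steps above (write the parameters file, then call the converter) could be scripted roughly as follows. This is only a sketch: it uses the three options shown in the command above and nothing else, writes the parameters file to a temporary directory (an assumption made for illustration), and prints the command line rather than running it.

```python
import json
import shlex
import tempfile
from pathlib import Path

# Parameters mirroring 'param.samples_filter.json' from this issue.
params = {"samples": {"list": ["sample1", "sample2"]}}

# Assumption: a temporary directory is used here; put the file wherever suits you.
param_path = Path(tempfile.mkdtemp()) / "param.samples_filter.json"
param_path.write_text(json.dumps(params, indent=2))

# Only the options that appear in the command above are used.
command = [
    "howard", "convert",
    "--input=tests/data/example.vcf.gz",
    "--output=/tmp/example.filtered.vcf",
    f"--param={param_path}",
]

# Build a correctly quoted, copy-pasteable command line without executing it.
command_line = shlex.join(command)
print(command_line)
```

Building the command as a list and quoting it with shlex.join avoids the kind of misplaced quote that is easy to introduce when typing the command by hand.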
As an example, the following VCF file contains the allowed genotype formats (see example.with_allowed_genotypes.vcf):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample2 sample3 sample4
chr1 28736 . A C 100 PASS CLNSIG=pathogenic GT:AD:DP:GQ 0/1:525,204:729:99 0/1:12659,4994:17664:99 1:2:3:4 0/1|2:401,175:576:99
chr1 35144 . A C 100 PASS CLNSIG=non-pathogenic GT:AD:DP:GQ ./. 0/1:12659,4994:17664:99 0:1:2:3 0/1:401,175:576:99
chr1 69101 . A G 100 PASS DP=50 GT:AD:DP:GQ 0/1:525,204:729:99 ./.:.:.:. .|. 0/1:401,175:576:99
chr1 768251 . A G 100 PASS . GT:AD:DP:GQ 0/1:525,204:729:99 ./.:.:.:. .:1:2:3 0/1:401,175:576:99
chr1 768252 . A G 100 PASS . GT:AD:DP:GQ 0/1:525,204:729:99 ./.:.:.:. ././. 0/1:401,175:576:99
chr1 768253 . A G 100 PASS . GT:AD:DP:GQ 0/1:525,204:729:99 ./. . 0/1:401,175:576:99
chr7 55249063 rs1050171 G A 5777 PASS DP=125 GT:AD:DP:GQ 0/1:525,204:729:99 0/1:12659,4994:17664:99 .|.:.:.:. 0/1:401,175:576:99
To manage sample columns in the input file, i.e. to check whether the columns are well-formed according to the 'FORMAT' column, or to force the export of a list of columns/samples (even if they are not well-formed), a parameter could be added to param.json. These parameters will be applied only to VCF output files. Other output formats can include extra columns that are not in VCF format.
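To make the "list of samples" idea concrete, here is a minimal sketch of keeping only the requested sample columns of a tab-separated VCF. It is not HOWARD's implementation: the helper names select_samples and filter_line are hypothetical, and the sketch assumes the requested samples are simply kept in the order given.

```python
# Fixed VCF columns that precede the per-sample columns.
FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]


def select_samples(header_line, wanted):
    """Return the indexes of the fixed columns plus the requested sample columns."""
    columns = header_line.rstrip("\n").split("\t")
    sample_columns = columns[len(FIXED_COLUMNS):]
    missing = [name for name in wanted if name not in sample_columns]
    if missing:
        raise ValueError(f"samples not found in header: {missing}")
    keep = list(range(len(FIXED_COLUMNS)))
    keep += [columns.index(name) for name in wanted]
    return keep


def filter_line(line, keep):
    """Keep only the selected columns of one header or record line."""
    fields = line.rstrip("\n").split("\t")
    return "\t".join(fields[i] for i in keep)


header = "\t".join(FIXED_COLUMNS + ["sample1", "sample2", "sample3", "sample4"])
record = "\t".join([
    "chr1", "28736", ".", "A", "C", "100", "PASS", "CLNSIG=pathogenic", "GT:AD:DP:GQ",
    "0/1:525,204:729:99", "0/1:12659,4994:17664:99", "1:2:3:4", "0/1|2:401,175:576:99",
])

keep = select_samples(header, ["sample1", "sample2"])
print(filter_line(header, keep))
print(filter_line(record, keep))
```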
A well-formed genotype corresponds to one of the following patterns:
'^[0-9.]([/|][0-9.])*' (GT starts with any number of alleles; for example, GT:AD:DP:GQ with values 0/1:525,204:729:99, ./.:525,204:729:99, .:525,204:729:99, 0|1:525,204:729:99 or 0/1/2:525,204:729:99)
'^[.]([/|][.])*$' (no genotype)
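The two patterns can be tried out with Python's re module. This is a sketch under assumptions: classify_sample is a hypothetical helper, GT is taken to be the first ':'-separated field, and the "no genotype" pattern is tested first because a value such as ./. also matches the more general pattern.

```python
import re

# Patterns copied from the text above. The first one has no trailing '$',
# so it only constrains the start of the GT value.
GT_WELL_FORMED = re.compile(r"^[0-9.]([/|][0-9.])*")
GT_NO_GENOTYPE = re.compile(r"^[.]([/|][.])*$")


def classify_sample(value):
    """Classify one sample column by its GT field (assumed to come first)."""
    gt = value.split(":", 1)[0]
    if GT_NO_GENOTYPE.match(gt):
        return "no genotype"
    if GT_WELL_FORMED.match(gt):
        return "genotype"
    return "malformed"


for value in [
    "0/1:525,204:729:99",
    "./.:525,204:729:99",
    ".:525,204:729:99",
    "0|1:525,204:729:99",
    "0/1/2:525,204:729:99",
    "A/B:1:2:3",
]:
    print(value, "->", classify_sample(value))
```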