How to define high-confidence SNPs?

Hi, I now have 6 ovaries from cattle breed A and 1 sperm sample from breed B for whole genome resequencing and call snp using gatk (BWA comparison uses the NCBI reference genome of breed A), the process is

1) gatk HardFiltration

gatk VariantFiltration --variant WGS.raw.SNP.vcf.gz --filter-expression "QD < 2.0 || FS > 60.0 || MQ < 40.0 || SOR > 3.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0" --filter-name "snp_filter" --genotype-filter-expression "DP < 2 || DP > 
50" --genotype-filter-name "dp_fail" --output WGS.gatk_filter_zhu.snp.vcf.gz

2）Filter the filtered passes and keep the alleles

gatk SelectVariants --exclude-filtered true --restrict-alleles-to BIALLELIC --reference genomic.fna --variant WGS.gatk_filter_zhu.snp.vcf --output con.hardfiltered.snp.vcf

3) snp.vcf

According to the PASS in the FILTER in column 7, add FI information in column 9 (PASS is defined as 1, FAIL is defined as 0, and the header is modified to ##FORMAT=)

My aim is to perform allele expression analysis on RNA seq of hybrid embryos produced from breed A ovaries and breed B sperm, but I encountered two problems while constructing the N mask genome:

Problem 1: When running SNPsplit_genome_preparation --reference_genome reference --vcf snp.vcf --strain ovary1 --strain2 sperm --dual_hybrid the-- strain is set to any of the 6 ovaries of cattle breed A (which is what I'm using now), or is it set to all 6 ovary samples for a more complete reflection of breed A's variation (and how exactly should this be accomplished)

Question 2: What exactly are the two indicators (GT and FI) for SNPSplit to judge high-confidence SNPs (because all the snp loci FI in my filtered snp.vcf file are already 1, and as a result, there are still more than 400,000 loci that are low confidence being filtered out)

SNP position summary for strain sperm (based on genome build GRCm39)
============================================================
Positions read in total:        21702510

4615627 SNP were homozygous. Of these:
23227   SNP were homozygous and passed high confidence filters and were thus included into the D21013078 genome

Not included into D21013078 genome:
6098066 had the same sequence as the reference
0               had no clearly defined alternative base
10988817          Calls were neither 0/0 (same as reference) or 1/1, 2/2, 3/3 (homozygous SNP)
4592400         were homozygous but the filtering call was low confidence

Printed a single list of all SNPs to >all_SNPs_D21013078_GRCm39.txt.gz<...

my vcf file:

##fileformat=VCFv4.2
##ALT=<ID=NON_REF,Description="Represents any possible alternative allele not already represented at thi
##FILTER=<ID=LowQual,Description="Low quality">
##FILTER=<ID=dp_fail,Description="DP < 2 || DP > 50">
##FILTER=<ID=snp_filter,Description="QD < 2.0 || FS > 60.0 || MQ < 40.0 || SOR > 3.0 || MQRankSum < -12.
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the ord
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad
##FORMAT=<ID=FT,Number=.,Type=String,Description="Genotype-level filter">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=FI,Number=1,Type=Integer,Description="High confidence (1) or low confidence (0) based on so
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing ho
##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phasing set (typically the position of the first vari
##FORMAT=<ID=RGQ,Number=1,Type=Integer,Description="Unconditional reference genotype confidence, encoded
##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fi
##GATKCommandLine=<ID=CombineGVCFs,CommandLine="CombineGVCFs --output /storage2/anlei/zhaomengmeng/Bovin
##GATKCommandLine=<ID=SelectVariants,CommandLine="SelectVariants --output con.hardfiltered.snp.vcf --exc
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the 
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order 
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. 
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filter
##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess hete
##INFO=<ID=FS,Number=1,Type=Float,Description="Phred-scaled p-value using Fisher's exact test to detect 
##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the
##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele 
##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele fr
##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref
##INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth">
##INFO=<ID=RAW_MQandDP,Number=2,Type=Integer,Description="Raw data (sum of squared MQ and total depth) f
##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs
##INFO=<ID=SOR,Number=1,Type=Float,Description="Symmetric Odds Ratio of 2x2 contingency table to detect 
##contig=<ID=NC_037328.1,length=158534110,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##contig=<ID=NC_037329.1,length=136231102,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##contig=<ID=NC_037330.1,length=121005158,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##contig=<ID=NC_037331.1,length=120000601,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##contig=<ID=NC_037332.1,length=120089316,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##contig=<ID=NW_020192292.1,length=9309904,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##contig=<ID=NW_020192293.1,length=27572,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##contig=<ID=NW_020192294.1,length=986051,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##contig=<ID=NC_006853.1,length=16338,assembly=GCF_002263795.2_ARS-UCD1.3_genomic.fna>
##reference=file:///storage2/anlei/zhaomengmeng/Bovine/reference/GCF_002263795.2_2018/GCF_002263795.2_ARS-UCD1.3_genomic.fna
##source=CombineGVCFs
##source=SelectVariants
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sperm       ovary1       ovary2      ovary3      ovary4       ovary5       ovary6
NC_037328.1     25776   .       A       C       52.69   PASS    AC=1;AF=0.100;AN=10;BaseQRankSum=0.00;DP=21;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=0.200;MQ=42.62;MQRankSum=-1.111e+00;QD=8.78;ReadPosRankSum=-4.310e-01;SOR=0.307     GT:AD:DP:FT:GQ:PL:FI    ./.:0,0:0:PASS:.:0,0,0:1        0/0NC_037328.1     27151   .       C       A       55.46   PASS    AC=1;AF=0.071;AN=14;BaseQRankSum=0.00;DP=41;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.071;MQ=60.00;MQRankSum=0.00;QD=9.24;ReadPosRankSum=0.00;SOR=0.693 GT:AD:DP:GQ:PL:FI       0/0:7,0:7:21:0,21,172:1 0/1:4,2:6:65:65,0,149:1 0/0NC_037328.1     27273   .       C       T       411.50  PASS    AC=6;AF=0.600;AN=10;BaseQRankSum=0.00;DP=25;ExcessHet=3.6798;FS=23.314;MLEAC=6;MLEAF=0.600;MQ=42.55;MQRankSum=0.00;QD=24.21;ReadPosRankSum=1.96;SOR=1.184       GT:AD:DP:GQ:PL:FI       ./.:0,0:0:.:0,0,0:1     ./.:1,0:1:.:0,0,0:1NC_037328.1     27605   .       T       A       209.09  PASS    AC=3;AF=0.250;AN=12;BaseQRankSum=0.00;DP=41;ExcessHet=4.3933;FS=2.199;MLEAC=4;MLEAF=0.333;MQ=45.19;MQRankSum=0.00;QD=11.00;ReadPosRankSum=-5.660e-01;SOR=1.565  GT:AD:DP:GQ:PL:FI       ./.:0,0:0:.:0,0,0:1     0/1:5,3:8:95:95,0,1NC_037328.1     27764   .       T       A       282.72  PASS    AC=2;AF=0.167;AN=12;BaseQRankSum=1.11;DP=66;ExcessHet=3.4242;FS=0.000;MLEAC=2;MLEAF=0.167;MQ=51.21;MQRankSum=-9.960e-01;QD=10.47;ReadPosRankSum=0.393;SOR=0.892 GT:AD:DP:GQ:PL:FI       ./.:0,0:0:.:0,0,0:1     0/0:9,0:9:27:0,27,3NC_037328.1     27916   .       A       C       168.46  PASS    AC=1;AF=0.083;AN=12;BaseQRankSum=-1.610e-01;DP=73;ExcessHet=3.0103;FS=4.468;MLEAC=2;MLEAF=0.167;MQ=43.56;MQRankSum=1.85;QD=6.74;ReadPosRankSum=-1.697e+00;SOR=2.066     GT:AD:DP:GQ:PGT:PID:PL:PS:FI    ./.:0,0:0:.:.:.:0,0,0:1 0/0`

FelixKrueger / SNPsplit