biona001 closed this issue 3 years ago
Thanks for the tip about using the ref_allele_dosage! function. Now this works:
using BGEN, VCFTools
function convert_gt(t::Type{T}, b::Bgen) where T <: Real
    n = n_samples(b)
    p = n_variants(b)
    G = Matrix{t}(undef, p, n)
    # loop over each variant
    i = 1
    for v in iterator(b; from_bgen_start=true)
        dose = ref_allele_dosage!(b, v; T=t) # this reads REF allele as 1
        BGEN.alt_dosage!(dose, v.genotypes.preamble) # switch 2 and 0 (ie treat ALT as 1)
        copyto!(@view(G[i, :]), dose)
        i += 1
    end
    return G
end
Gtest = convert_gt(Float64, Bgen("target.typedOnly.masked.bgen"))
Gtrue = VCFTools.convert_gt(Float64, "target.typedOnly.masked.vcf.gz", trans=true)
julia> all(skipmissing(Gtrue .== Gtest))
true
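The check above depends on how Julia propagates missing values through comparisons. Here is a toy illustration with made-up values (no genotype files involved, and the variable names are only for this sketch): comparing against missing yields missing, which skipmissing drops, so only entries that are present get compared.

```julia
# Toy matrices, not real genotype data.
A = Union{Missing, Float64}[0.0 1.0; missing 2.0]
B = [0.0 1.0; 1.0 2.0]

eq = A .== B                          # the entry compared against `missing` is `missing`
agree = all(skipmissing(eq))          # ignores the missing entry
n_compared = count(!ismissing, eq)    # how many entries were actually compared
```

Counting the compared entries is a cheap guard, since all returns true on an empty collection.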
Here is the test data (an unphased VCF file converted to BGEN using qctool v2): Archive 2.zip
I want to import all BGEN genotypes into a numeric matrix, so I wrote the following function:
But the imported genotype matrix does not agree with the VCF file:
This seems to be because BGEN is checking which allele is the minor allele (so a 2 is swapped with a 0, and a 0 with a 2, compared to the VCF file).
Is there a way to instead read every ALT allele as 1 and every REF allele as 0?
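To make the 0/2 swap concrete, here is a minimal sketch, not BGEN.jl's implementation: for a biallelic, diploid variant, a dosage that counts REF alleles becomes one that counts ALT alleles when subtracted from 2. The helper name flip_dosage! is hypothetical, and the toy vector stands in for one variant's dosages.

```julia
# Hypothetical helper: convert REF-allele dosages to ALT-allele dosages in place.
# Assumes a biallelic, diploid variant, so REF dosage + ALT dosage = 2 per sample.
function flip_dosage!(dose::AbstractVector{T}) where T <: Real
    for i in eachindex(dose)
        dose[i] = 2 - dose[i]   # NaN stays NaN
    end
    return dose
end

ref_dose = [2.0, 1.0, 0.0, NaN]           # toy data: NaN marks a missing genotype
alt_dose = flip_dosage!(copy(ref_dose))   # copy so the input is left untouched
```

Heterozygous samples (dosage 1) are unchanged by the flip, which is why only the 0 and 2 entries disagree when two matrices count different alleles.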