Open Al-Murphy opened 2 years ago
Have you tried running your example, but preceding this with
debug(VariantAnnotation:::.makeVcfMatrix)
# note the :::
to get clues as to why the INFO data are lost? Please consider making a PR if you find the answer.
Hey Vince!
Tried to look into this but can't seem to install the 1.99.3+ version of Rhtslib on mac, have you come across this issue?
But from what I can tell without getting into the details of Variantannotation, when writeVcf is called on a VRanges object I believe it's the below function the one that's called from here:
setMethod(writeVcf, c("VCF", "connection"),
function(obj, filename, index = FALSE, ...)
{
if (!isTRUEorFALSE(index))
stop("'index' must be TRUE or FALSE")
if (!isOpen(filename)) {
open(filename)
on.exit(close(filename))
}
scon <- summary(filename)
headerNeeded <- !(file.exists(scon$description) &&
file.info(scon$description)$size !=0)
if (headerNeeded) {
hdr <- .makeVcfHeader(obj)
writeLines(hdr, filename)
}
if (index)
obj <- sort(obj)
if (all(is.na(idx <- .chunkIndex(dim(obj)[1L], ...))))
.makeVcfMatrix(filename, obj)
else
for (i in idx)
.makeVcfMatrix(filename, obj[i])
flush(filename)
if (index) {
filenameGZ <- bgzip(scon$description, overwrite = TRUE)
indexTabix(filenameGZ, format = "vcf")
unlink(scon$description)
invisible(filenameGZ)
} else {
invisible(scon$description)
}
})
Then the call VariantAnnotation:::.makeVcfMatrix uses the line INFO <- .makeVcfInfo(info(obj), length(rd))
to get the info field where the function info()
is defined as:
setMethod("info", "VCFHeader",
function(x)
{
info <- slot(x, "header")$INFO
if (is.null(info))
info <- DataFrame(Number=integer(), Type=character(),
Description=character())
info
})
The issue is that VRanges don't have a slot(x, "header") value so I don't think it would ever pick it up? Unless I misinterpreted the function calls here?
Wait I figured a way around it:
VariantAnnotation::writeVcf(
obj = VariantAnnotation::asVCF(vr,info='RSID'),
### Must supply filename without compression suffix
filename ="~/Downloads/test_vcf.vcf",
index = FALSE
)
Just converting to VCF class yourself and specifying the column to go into info in the writeVcf call works
Thank you for persevering. If you see a documentation enhancement that will prevent other users from hitting your problem please make a PR. I think we should leave this open for now.
Hi ,
I'm the creator of the MungeSumstats Bioconductor package in which we use VariantAnnotation::writeVcf as an output options for the formatted sumstats. I want to align the formatting of our sumstats to IEU OpenGWAS VCF, part of which means adding the RSIDs to the INFO column in the VCF. However, I can't seem to populate INFO with VariantAnnotation::writeVcf. Is there any way to do this?
An example: