Simon-Coetzee / motifBreakR

A Package For Predicting The Disruptiveness Of Single Nucleotide Polymorphisms On Transcription Factor Binding Sites.

VCF Import Fails With Unclear Error #26

Open DarioS opened 4 years ago

DarioS commented 4 years ago
> MBRobject <- variants.from.file(file = tempVCF, search.genome = genome, format = "vcf")
Error in open.TabixFile(VcfFile(file)) : 'indexname' must be character(1)

If the VCF file is only 300 Kb on disk, should indexing really be necessary?

I thought I could avoid the error by indexing, but that fails, too.

VariantAnnotation::writeVcf(variants, filename = tempVCF, index = TRUE) # Creates BGzip files automatically if indexing.
MBRobject <- variants.from.file(file = paste(tempVCF, "bgz", sep = '.'), search.genome = genome, format = "vcf")
Error in getListElement(x, i, ...) : 
  GRanges objects don't support [[, as.list(), lapply(), or unlist() at the moment

The line of code in snps.from.file causing the error is:

complex.variants <- vcf_ranges[nchar(vcf_ranges$REF) > 1 | nchar(vcf_ranges$ALT) > 1]

There can be multiple ALT alleles per record, so ALT can be a DNAStringSetList (see the class check below) rather than a character vector, and the logical comparison made in the function is wrong.

Browse[4]> class(vcf_ranges$REF)
[1] "character"
Browse[4]> class(vcf_ranges$ALT)
[1] "DNAStringSetList"
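
For illustration, here is a minimal sketch of a per-record test that tolerates multi-allelic ALT. It is not the package's actual fix. The toy `ref` and `alt` values are made up, and it assumes Biostrings is installed. The idea is to compute the width of every ALT allele, regroup the widths per record, and collapse each record with `any()` so the result is a plain logical vector that can be used for subsetting.

```r
library(Biostrings)

# Toy stand-ins for vcf_ranges$REF and vcf_ranges$ALT (made-up values).
ref <- c("A", "CT", "G", "T")
alt <- DNAStringSetList("T", "C", c("A", "T"), c("G", "GAC"))

# Width of every ALT allele, regrouped per record as an IntegerList.
alt_widths <- relist(width(unlist(alt, use.names = FALSE)), alt)

# any() on a LogicalList collapses each record to a single TRUE/FALSE,
# so a record counts as complex if at least one ALT allele is longer than 1.
alt_is_complex <- any(alt_widths > 1)

# Plain logical vector, one entry per record: records 2 and 4 are flagged.
is_complex <- nchar(ref) > 1 | alt_is_complex
is_complex
```

With a logical vector like this, `vcf_ranges[is_complex]` would be an ordinary GRanges subset. Another option might be to call `VariantAnnotation::expand()` on the VCF object before converting it, which gives one ALT allele per row.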