Closed octopusCat88 closed 5 years ago
Hi Michaela (@octopusCat88),
I'm not able to reproduce this problem with VariantAnnotation 1.28.11, which is the most current release version.
There was a bug in parsing long records which was fixed on Jan 18th:
commit 90b9deae85acddcf8eb8a0c0c2041b51ae7cf1f1
Author: vobencha <vobencha@gmail.com>
Date: Fri Jan 18 12:56:54 2019 -0800
Fix bug in buffer reallocation when a record fills the buffer exactly
See https://github.com/Bioconductor/VariantAnnotation/issues/19 for
details
I see you're using version 1.28.7, which predates the fix, so it's possible you're hitting this bug. Please update your version of VariantAnnotation and try again.
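For reference, a minimal sketch of how the installed version could be checked before updating. `needs_update` is a hypothetical helper, and 1.28.11 is only the version tested in this thread, not necessarily the first one that contains the fix:

```r
# Hypothetical helper: TRUE if `installed` is older than `known_good`.
# Assumption: 1.28.11 is used because it is the version tested in this thread.
needs_update <- function(installed, known_good = "1.28.11") {
    package_version(installed) < package_version(known_good)
}

# Only query the version if the package is actually installed.
if (requireNamespace("VariantAnnotation", quietly = TRUE) &&
    needs_update(as.character(packageVersion("VariantAnnotation")))) {
    message("VariantAnnotation is older than 1.28.11; consider running ",
            "BiocManager::install(\"VariantAnnotation\")")
}
```

With version 1.28.7 installed, the message would be shown. After updating, restart R so the new version is loaded.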
Thanks. Valerie
Testing with VariantAnnotation 1.28.11:
> vcf_no_anno <- readVcf("example_no_anno.vcf.gz")
> colData(vcf_no_anno)
DataFrame with 1 row and 1 column
            Samples
          <integer>
123Sample         1
> dim(vcf_no_anno)
[1] 1 1
> str(seqlevels(vcf_no_anno))
chr "chr1"
> vcf_vep_anno <- readVcf("example_vep_anno.vcf.gz")
> colData(vcf_vep_anno)
DataFrame with 1 row and 1 column
            Samples
          <integer>
123Sample         1
> dim(vcf_vep_anno)
[1] 1 1
> str(seqlevels(vcf_vep_anno))
chr "chr1"
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
...
other attached packages:
[1] VariantAnnotation_1.28.11 Rsamtools_1.34.1
[3] Biostrings_2.50.2 XVector_0.22.0
[5] SummarizedExperiment_1.12.0 DelayedArray_0.8.0
[7] BiocParallel_1.16.6 matrixStats_0.54.0
[9] Biobase_2.42.0 GenomicRanges_1.34.0
[11] GenomeInfoDb_1.18.2 IRanges_2.16.0
[13] S4Vectors_0.20.1 BiocGenerics_0.28.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 compiler_3.5.1 prettyunits_1.0.2
[4] GenomicFeatures_1.34.3 bitops_1.0-6 tools_3.5.1
[7] zlibbioc_1.28.0 progress_1.2.0 biomaRt_2.38.0
[10] digest_0.6.18 bit_1.1-14 BSgenome_1.50.0
[13] RSQLite_2.1.1 memoise_1.1.0 lattice_0.20-38
[16] pkgconfig_2.0.2 rlang_0.3.1 Matrix_1.2-14
[19] DBI_1.0.0 GenomeInfoDbData_1.2.0 rtracklayer_1.42.2
[22] httr_1.4.0 stringr_1.4.0 hms_0.4.2
[25] bit64_0.9-7 grid_3.5.1 R6_2.4.0
[28] AnnotationDbi_1.44.0 XML_3.98-1.18 magrittr_1.5
[31] blob_1.1.1 GenomicAlignments_1.18.1 assertthat_0.2.0
[34] stringi_1.3.1 RCurl_1.95-4.12 crayon_1.3.4
Dear @octopusCat88 and @vobencha,
thanks for checking this and pointing us to the updated version. Yes, after updating the package the problem is solved, so I guess this bug is a duplicate of #19.
Since this was solved by updating the package we can close this issue.
Thanks again, and sorry for not checking the new version first.
Best, Christian
No problem. I'm glad the new version worked for you. Valerie
Hi,
I have been annotating VCF files with VEP.
VEP command on the command line
However, after reading the annotated VCF file into R, some lines seem to be split at random points, with the remainder parsed as a new line. In a minimal example with 1 variant, I end up with 2 entries in R, where the second one has half of the INFO column as chromosome names. Could this be a bug?
sessionInfo()
Please let me know if you need more input to replicate this error.
Best, Michaela Müller
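One way to confirm the symptom described above is to compare the number of data lines in the file with the number of records the parser returns. This is a minimal sketch in base R, assuming a plain-text or gzip-compressed VCF small enough to read into memory. `count_vcf_records` is a hypothetical helper, not part of VariantAnnotation:

```r
# Hypothetical helper: count the non-header, non-empty lines of a VCF file.
count_vcf_records <- function(path) {
    con <- gzfile(path, "r")   # gzfile() can also read uncompressed files
    on.exit(close(con))
    lines <- readLines(con)
    # Header and meta-information lines start with "#"; the rest are records.
    sum(nzchar(lines) & !startsWith(lines, "#"))
}
```

If `count_vcf_records("example_vep_anno.vcf.gz")` is smaller than the first element of `dim()` for the object returned by `readVcf()`, at least one record was split during parsing.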