Bioconductor / VariantAnnotation

Annotation of Genetic Variants
https://bioconductor.org/packages/VariantAnnotation

Files do not exist error #56

Closed abalter closed 2 years ago

abalter commented 2 years ago

The VCF file I'm using is valid as far as other tools such as bcftools, gatk VariantsToTable, are concerned. However, I'm getting this error reading it in:

> list.files(path="large_files") %>% grep("*.vcf.gz", ., value = T)
[1] "161229_1634201813.vcf.gz" "tmp.vcf.gz"              
> fl <- system.file("extdata", "large_files/161229_1634201813.vcf.gz", package="VariantAnnotation")
> vcf <- readVcf(fl, "hg19")
Error in .io_check_exists(path(con)) : file(s) do not exist:
  ‘’

Suggestions?

hpages commented 2 years ago

list.files() gives you the list of files in the local directory that you specified (large_files), which I suppose is a subdirectory of your current working directory. But then you're trying to grab a file from the extdata folder of VariantAnnotation's installation folder!? Your file is obviously not there, so system.file() returns the empty string. Your file is in the large_files directory, so either construct its path with file.path("large_files", "161229_1634201813.vcf.gz") or just ask list.files() to return full names in the first place:

list.files("large_files", pattern="\\.vcf\\.gz$", full.names=TRUE)
[1] "large_files/161229_1634201813.vcf.gz" "large_files/tmp.vcf.gz"  

The man pages for list.files() and system.file() (?list.files and ?system.file, respectively) are worth a read.
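
To make the difference concrete, here is a minimal sketch using only base R. The temporary directory and the file names in it are made up for illustration, and the stats package stands in for any installed package that does not contain the requested file.

```r
# Create a throwaway "large_files" directory with empty placeholder files
# (hypothetical names, for illustration only).
dir <- file.path(tempdir(), "large_files")
dir.create(dir, showWarnings = FALSE)
file.create(file.path(dir, c("a.vcf.gz", "b.vcf.gz", "notes.txt")))

# Bare file names: only usable as paths if the working directory is `dir`.
bare <- list.files(dir, pattern = "\\.vcf\\.gz$")

# Full names: paths that can be handed to a reader function directly.
full <- list.files(dir, pattern = "\\.vcf\\.gz$", full.names = TRUE)

# system.file() only searches inside an installed package. When the file
# is not found there, it returns the empty string instead of failing.
missing <- system.file("extdata", "a.vcf.gz", package = "stats")

print(bare)
print(file.exists(full))
print(nzchar(missing))
```

The paths in `full` are the kind of value you would pass to readVcf(); the value in `missing` is the empty string that produced the error above.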

If you need more help with this, it's better to ask on the Bioconductor support site. This is very basic stuff and GitHub issues are primarily for bug reports.

Thanks, H.

abalter commented 2 years ago

I understand what is happening now. The whole system.file thing is a red herring: the documentation uses it only because the file being demonstrated is a test file that ships with the package. vcf = readVcf("large_files/161229_1634201813.vcf") totally works.

I honestly think you might want to alter your documentation so that it doesn't throw off anyone else. In a real situation you would never need to call system.file, and yet that command is used heavily in the documentation.

hpages commented 2 years ago

Bioconductor vignettes assume that the reader is familiar with basic R concepts.

FWIW here are 2 useful things to do when you follow a vignette:

  1. Always check your variables. Did you check fl after doing fl <- system.file("extdata", "large_files/161229_1634201813.vcf.gz", package="VariantAnnotation")? fl is supposed to contain the path to your file but if you see that it's the empty string then obviously system.file() didn't do what you thought it was doing.

  2. Learn about the functions that are used in the vignette by checking their man pages (?system.file in this case).
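
A rough sketch of what "check your variables" can look like in practice, using base R only. check_path() is a hypothetical helper, not part of VariantAnnotation, and the stats package is used only as an installed package that does not contain the requested file.

```r
# Hypothetical helper: fail early with a readable message when a path
# is the empty string or does not point at an existing file.
check_path <- function(path) {
  if (!nzchar(path) || !file.exists(path)) {
    stop("file not found: '", path, "'", call. = FALSE)
  }
  path
}

# system.file() silently returns "" when the file is not in the package.
fl <- system.file("extdata", "no_such_file.vcf.gz", package = "stats")
print(nzchar(fl))

# The helper turns that silent "" into an error before any reader is called.
msg <- tryCatch(check_path(fl), error = function(e) conditionMessage(e))
print(msg)

# Alternatively, mustWork = TRUE makes system.file() itself raise an error.
strict <- tryCatch(
  system.file("extdata", "no_such_file.vcf.gz", package = "stats",
              mustWork = TRUE),
  error = function(e) "system.file() raised an error"
)
print(strict)
```

Either check surfaces the problem at the line that caused it, rather than later inside readVcf().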

Best, H.