emmanuelparadis / pegas

Population and Evolutionary Genetics Analysis System
GNU General Public License v2.0

Updated vcf conflicts with cache #64

Status: Closed (jnrunge closed this issue 2 years ago)

jnrunge commented 2 years ago

Cool package, great work!

When a VCF file is updated in place (same path and name, but changed content) after pegas has already read it via read.vcf, later reads can fail (see below for my case). The cause is that the cache no longer matches the actual file. The same file reads without issues once it is renamed.

Reading 16500 / 26257 loci
Error in start:end: NA/NaN argument
Traceback:

  1. read.vcf(vcf, to = loci)

I suggest adding a checksum or modification-time check to the cache system, or an option to disable the cache in read.vcf, together with an error message that suggests the user try disabling the cache to see whether the issue persists.
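As a rough illustration of the kind of check suggested here, the following base R sketch records a fingerprint of a file and reports whether it has changed since the last read. The helper names `file_fingerprint` and `fingerprint_changed` are hypothetical and not part of pegas; how pegas would store or consult such a fingerprint is left open.

```r
# Hypothetical helpers (not part of pegas): detect whether a file changed on disk.

# Record size, modification time, and MD5 checksum for a file.
file_fingerprint <- function(path) {
  stopifnot(file.exists(path))
  list(
    size  = file.size(path),
    mtime = file.mtime(path),
    md5   = unname(tools::md5sum(path))
  )
}

# TRUE if there is no earlier fingerprint, or if size or checksum differ.
# The modification time is recorded but not compared, so merely touching
# a file without changing its content does not count as a change.
fingerprint_changed <- function(old, new) {
  is.null(old) ||
    !identical(old$size, new$size) ||
    !identical(old$md5, new$md5)
}
```

A cache keyed on the file path could store such a fingerprint next to each entry and rebuild the entry whenever `fingerprint_changed()` returns TRUE. Comparing only `mtime` would be cheaper for large VCF files, at the cost of missing edits that preserve the timestamp.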

I looked briefly for an existing solution but did not find one, so I hope this ticket is helpful.

pegas_1.0-1 with R 4.1.1

All the best to you!

emmanuelparadis commented 2 years ago

Hi, thanks for the appreciation! A simple solution is to run VCFloci() on the modified VCF file. If your R code is part of a pipeline where the VCF file is updated by another program, then this may be quite simple to add.
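A minimal sketch of this workaround, assuming pegas is installed: rescan the file with VCFloci() first, then read it. The wrapper name `read_vcf_fresh` is hypothetical and only serves to group the two calls.

```r
# Hypothetical wrapper (not a pegas function): rescan a VCF file before reading it.
read_vcf_fresh <- function(path, quiet = TRUE) {
  stopifnot(file.exists(path))
  # Rescan the file; per the reply above, this resolves the mismatch
  # between the stored information and the updated file.
  info <- pegas::VCFloci(path, quiet = quiet)
  # Read every locus found by the fresh scan.
  pegas::read.vcf(path, from = 1, to = nrow(info), quiet = quiet)
}
```

In a pipeline, a call such as `x <- read_vcf_fresh("variants.vcf")` (the path is a placeholder) would replace a direct `read.vcf()` call wherever another program may have rewritten the file.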

jnrunge commented 2 years ago

Argh, yes, that works well for me! Thanks!

- Jan